I noticed that pages with neither a Last-Modified: nor an Expires: header
seem to stay in the cache for days (sometimes 2 or 3) without being
refreshed unless a client refresh is issued. I may have found the cause.
(I added some extra debugging code to storeTimestampsSet; the tests were
done with 1.2beta22 and the default refresh_pattern.)
First request:
1998/07/17 23:17:39| refreshCheck: '[null_mem_obj]'
1998/07/17 23:17:39| refreshCheck: Matched '. 0 20% 259200'
1998/07/17 23:17:39| refreshCheck: age = 959
1998/07/17 23:17:39| refreshCheck: entry->timestamp = 900709300
1998/07/17 23:17:39| refreshCheck: entry->lastmod = 900704029
1998/07/17 23:17:39| refreshCheck: factor = 0.181939
1998/07/17 23:17:39| refreshCheck: NO: factor < pct
900710259.526 9 127.0.0.1 TCP_HIT/200 9552 GET http://www.planet.nl/
This is a TCP_HIT. Somewhat later a refresh is needed:
1998/07/17 23:19:46| refreshCheck: '[null_mem_obj]'
1998/07/17 23:19:46| refreshCheck: Matched '. 0 20% 259200'
1998/07/17 23:19:46| refreshCheck: age = 1086
1998/07/17 23:19:46| refreshCheck: entry->timestamp = 900709300
1998/07/17 23:19:46| refreshCheck: entry->lastmod = 900704029
1998/07/17 23:19:46| refreshCheck: factor = 0.206033
1998/07/17 23:19:46| clientProcessExpired: setting lmt = 900704029
1998/07/17 23:19:47| getMaxAge: 'http://www.planet.nl/'
1998/07/17 23:19:47| ctx: enter level 0: 'http://www.planet.nl/'
1998/07/17 23:19:47| storeTime: served_date = 900707629
1998/07/17 23:19:47| storeTime: e->expires = -1
1998/07/17 23:19:47| storeTime: e->timestamp = served_date = 900707629
1998/07/17 23:19:47| ctx: exit level 0
1998/07/17 23:19:47| storeTime: served_date = 900707629
1998/07/17 23:19:47| storeTime: e->expires = -1
1998/07/17 23:19:47| storeTime: e->timestamp = served_date = 900707629
900710387.398 537 127.0.0.1 TCP_REFRESH_HIT/200 9552 GET http://www.planet.nl/
This is a REFRESH_HIT, but entry->lastmod has not changed, as a new request
shows:
1998/07/17 23:22:05| refreshCheck: '[null_mem_obj]'
1998/07/17 23:22:05| refreshCheck: Matched '. 0 20% 259200'
1998/07/17 23:22:05| refreshCheck: age = 138
1998/07/17 23:22:05| refreshCheck: entry->timestamp = 900710387
1998/07/17 23:22:05| refreshCheck: entry->lastmod = 900704029
1998/07/17 23:22:05| refreshCheck: factor = 0.021705
1998/07/17 23:22:05| refreshCheck: NO: factor < pct
900710525.888 9 127.0.0.1 TCP_HIT/200 9552 GET http://www.planet.nl/
When a client refresh is done, entry->lastmod does get updated: it is set to
served_date in the absence of a real Last-Modified: header.
1998/07/17 23:25:35| getMaxAge: 'http://www.planet.nl/'
1998/07/17 23:25:35| ctx: enter level 0: 'http://www.planet.nl/'
1998/07/17 23:25:35| storeTime: served_date = 900710733
1998/07/17 23:25:35| storeTime: e->expires = -1
1998/07/17 23:25:35| storeTime: e->lastmod = served_date = 900710733
1998/07/17 23:25:35| storeTime: e->timestamp = served_date = 900710733
900710737.568 2891 127.0.0.1 TCP_CLIENT_REFRESH_MISS/200 9548 GET http://www.planet.nl/
As long as only TCP_REFRESH_{MISS,HIT} occur, the timestamp field is adjusted
but the lastmod field is not. When I started testing this, I had a copy of the
above page from July 16, but entry->lastmod was still at July 8:
1998/07/17 21:44:07| refreshCheck: '[null_mem_obj]'
1998/07/17 21:44:07| refreshCheck: Matched '. 0 20% 259200'
1998/07/17 21:44:07| refreshCheck: age = 85443
1998/07/17 21:44:07| refreshCheck: entry->timestamp = 900619204
1998/07/17 21:44:07| refreshCheck: entry->lastmod = 899927249
1998/07/17 21:44:07| refreshCheck: factor = 0.123481
(899927249 = Wed Jul 8 21:47:29 CEST 1998)
entry->timestamp matches the most recent TCP_REFRESH_MISS:
900619206.278 3061 127.0.0.1 TCP_REFRESH_MISS/200 9693 GET http://www.planet.nl/
(900619206 = Thu Jul 16 22:00:06 CEST 1998)
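The two epoch-to-date conversions quoted above can be double-checked with a
few lines of Python. This is only a convenience sketch: the helper name
to_cest is made up for illustration, and CEST is assumed to be a fixed UTC+2
offset.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CEST is modelled as a fixed UTC+2 offset (no DST lookup).
CEST = timezone(timedelta(hours=2), "CEST")

def to_cest(epoch_seconds):
    """Convert a Unix timestamp (as seen in the logs) to an aware CEST datetime."""
    return datetime.fromtimestamp(epoch_seconds, CEST)

# The two values quoted in parentheses above.
for ts in (899927249, 900619206):
    print(ts, "=", to_cest(ts).strftime("%a %b %d %H:%M:%S %Z %Y"))
```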
This old lastmod value inflates the denominator of the factor calculation, so
the factor stays small and the object is considered fresh for longer:
factor = (now - timestamp) / (timestamp - lastmod)
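As a sanity check, the formula can be applied to the (age, timestamp,
lastmod) triples from the refreshCheck lines above. The sketch below is not
Squid code; the names lm_factor, CHECKS and PCT are made up for illustration,
and PCT is the 20% from the matched pattern '. 0 20% 259200'.

```python
# Values copied from the refreshCheck debug lines: (age, timestamp, lastmod),
# where age = now - timestamp.
PCT = 0.20  # the 20% from refresh_pattern '. 0 20% 259200'

CHECKS = [
    (959, 900709300, 900704029),     # first request (TCP_HIT)
    (1086, 900709300, 900704029),    # second request (TCP_REFRESH_HIT)
    (138, 900710387, 900704029),     # third request (TCP_HIT)
    (85443, 900619204, 899927249),   # copy from July 16, lastmod from July 8
]

def lm_factor(age, timestamp, lastmod):
    """factor = (now - timestamp) / (timestamp - lastmod)."""
    return age / (timestamp - lastmod)

for age, timestamp, lastmod in CHECKS:
    factor = lm_factor(age, timestamp, lastmod)
    verdict = "fresh (factor < pct)" if factor < PCT else "stale (factor >= pct)"
    print(f"age={age:6d} factor={factor:.6f} {verdict}")
```

The computed factors should agree with the logged ones to within rounding,
and only the second check crosses the 20% threshold.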
Shouldn't TCP_REFRESH_{MISS,HIT} also update entry->lastmod and set it to,
e.g., served_date or squid_curtime? I tried coding this myself, but the
refresh code is not that simple.
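To put a number on the effect: rearranging factor < pct gives the time an
object stays fresh after its timestamp, namely pct * (timestamp - lastmod).
The sketch below is a toy model, not Squid code. The name fresh_window is
made up, and capping the result at the pattern's max value (259200 seconds)
is my assumption about how the max field applies.

```python
PCT = 0.20        # the 20% from refresh_pattern '. 0 20% 259200'
MAX_AGE = 259200  # the pattern's max in seconds (3 days); the cap is an assumption

def fresh_window(timestamp, lastmod, pct=PCT, max_age=MAX_AGE):
    """Seconds after `timestamp` during which factor < pct, capped at max_age."""
    return min(pct * (timestamp - lastmod), max_age)

# lastmod about 1.5 hours older than timestamp (first request above).
short = fresh_window(900709300, 900704029)
# lastmod stuck at July 8 while timestamp is July 16.
stuck = fresh_window(900619204, 899927249)

print(f"recent lastmod: {short:.0f} s")
print(f"stale lastmod:  {stuck:.0f} s = {stuck / 3600:.1f} h")
```

With lastmod stuck at July 8 the window is already well over a day, and in
this model it reaches the 3-day cap once timestamp - lastmod exceeds 15 days
(259200 / 0.20 = 1296000 seconds), which would fit the 2 or 3 days I
observed.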
Arjan
-- 
Arjan de Vet, Eindhoven, The Netherlands <Arjan.deVet@adv.iae.nl>
URL: http://www.iae.nl/users/devet/ for PGP key: finger devet@iae.nl

Received on Tue Jul 29 2003 - 13:15:51 MDT