Hi List,
As part of an internal evaluation project I need to maintain a
completely static copy of all Web content fetched via a Squid proxy
for X URLs - i.e. every object fetched within some limit of (say)
10,000 individual web sites, one hour of browsing activity, etc. This
copy will form a dataset against which we can test various web
filtering and classification products.
I've searched the squid-users archive as well as the squid-cache.org
FAQ, config guide and Wiki. Along the way I've seen the info at
http://www.squid-cache.org/Doc/FAQ/FAQ-12.html#ss12.23 ("How come some
objects do not get cached?") and
http://www.squid-cache.org/Doc/FAQ/FAQ-12.html#ss12.20 ("How does
squid decide when to refresh a cached object?"), as well as the notes
on "refresh_pattern" at http://squid.visolve.com/squid/squid24s1/tuning.htm
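For reference, this is the sort of aggressive refresh_pattern we
expect to run with to keep objects from being expired once cached - a
sketch only, as I'm assuming the override-* and ignore-reload options
are available in the Squid version we end up deploying:

    # Treat every cached object as fresh for a year (525600 minutes),
    # overriding server-supplied Expires/Last-Modified and ignoring
    # client-forced reloads (option availability varies by version)
    refresh_pattern . 525600 100% 525600 override-expire override-lastmod ignore-reload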
Is there currently a supported way to force the caching of all objects
that the documents above describe as non-cacheable? I believe we can
work with existing features to ensure objects are not expired for the
duration of our tests (several months, actually...); however, I cannot
at this point see a way to ensure that all objects are cached in the
first place, regardless of their headers (e.g. all forms of
Cache-Control).
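For example, a response carrying headers like the following (an
illustrative set I've made up, not captured from a real server)
appears to be refused by the default cacheability rules, and I can't
find a directive to override that decision:

    HTTP/1.0 200 OK
    Cache-Control: private, no-cache, no-store
    Pragma: no-cache
    Expires: Thu, 01 Jan 1970 00:00:00 GMT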
I was hoping for perhaps some form of ./configure argument, but none
appears to exist. We are happy to modify the source if need be;
however, I have little idea of where to start or what to look for.
If source modification _is_ the only way forward I would greatly
appreciate any pointers the list could give me regarding which source
files, functions, etc. I should be looking at.
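So far my own digging hasn't got much further than grepping the tree
for likely spots - the identifiers below are guesses on my part, not
confirmed entry points:

    # Guesswork: look for where a reply's cacheability is decided
    grep -rn "Cache-Control" src/
    grep -in "cachable" src/client_side.c src/refresh.c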
Thanks list!
Regards,