Hi everyone,
I am trying to track down a bug that is troubling our production
systems, and so far I am stumped.
This is on Debian Linux. I have tried kernels 2.4.27 and 2.6.7, and
squid 2.5.STABLE1, STABLE5, and STABLE7. All have this problem.
Squid is configured as a reverse-accelerator and compiled with
--enable-x-accelerator-vary, and our webservers add X-Accelerator-Vary:
Accept-Encoding to their responses.
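For reference, the accelerator side looks roughly like this (the
hostname is a placeholder, and the Apache line is only an illustration
of how the header gets added on the webservers):

  # squid.conf (squid 2.5 accelerator directives; hostname is a placeholder)
  httpd_accel_host www.example.com
  httpd_accel_port 80
  httpd_accel_single_host on
  httpd_accel_uses_host_header off

  # on the webservers (Apache with mod_headers, for illustration)
  Header set X-Accelerator-Vary "Accept-Encoding"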
A small percentage of incoming requests (about 0.02%) to our
reverse-accelerator farm take a very long time to complete. From the few
clues I've been able to glean, I suspect there is a problem with squid
refreshing an object while another client is in the process of
retrieving the same object.
The clues:
A wget running in a loop, retrieving the main page of our site, will
occasionally take just under 15 minutes to complete a retrieval that
normally takes 0.02 seconds.
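The loop is essentially just this (URL replaced with a placeholder):

  # fetch the main page repeatedly, timing each retrieval
  while true; do
      time wget -q -O /dev/null http://www.example.com/
      sleep 1
  done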
When I look at the access.log entry for that retrieval and work back to
the time the request was placed, I often find that some client out on
the internet had issued a request with a no-cache header, resulting in a
TCP_CLIENT_REFRESH_MISS for the main page.
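Spotting those refreshes is just a grep, e.g. (default Debian log path):

  # find forced refreshes around the slow request's start time
  grep TCP_CLIENT_REFRESH_MISS /var/log/squid/access.log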
With wget --server-response I see that the Age header of the
slow-to-retrieve page is always a low number of seconds, so the object
was refreshed just prior to the request.
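Concretely, I check the header with something like this (URL again a
placeholder; wget prints the response headers on stderr):

  wget --server-response -O /dev/null http://www.example.com/ 2>&1 | grep -i 'Age:'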
The Age plus the time to retrieve the object equals the read_timeout in
squid.conf. I changed read_timeout to 9 minutes on one server and the
slow wget retrievals started taking 8+ minutes instead of 14+.
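That is, on that server squid.conf now reads (the default is 15 minutes):

  read_timeout 9 minutes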
The object itself is transferred quickly, but the connection stays open
until some timer in squid (read_timeout, apparently) expires; only then
does squid close the connection.
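While a wget is hanging you can watch the connection linger on the squid
box with something like:

  # needs root for -p; the connection sits in ESTABLISHED under the squid process
  netstat -tnp | grep squid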
This problem did not exist on the same hardware with Solaris x86 as the OS.
Any ideas as to where I should be looking? There are a few places in the
code that are #ifdef'd _SQUID_LINUX_, but nothing there looks applicable
to this problem.
I am having no luck reproducing this on a test system.
--
Robert Borkowski