Hi everyone,
I am trying to track down a bug that is troubling our production
systems, and so far I am stumped.
This is on Debian Linux. I have tried kernels 2.4.27 and 2.6.7, and
squid 2.5.STABLE1, STABLE5, and STABLE7. All have this problem.
Squid is configured as a reverse-accelerator and compiled with
--enable-x-accelerator-vary, and our webservers add X-Accelerator-Vary:
Accept-Encoding to their responses.
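For reference, the accelerator side looks roughly like this (the
hostname is a placeholder, and the Apache line is only an illustration
of how the header gets added on the webservers):

  # squid.conf (squid 2.5 accelerator directives; hostname is a placeholder)
  httpd_accel_host www.example.com
  httpd_accel_port 80
  httpd_accel_single_host on
  httpd_accel_uses_host_header off

  # on the webservers (Apache with mod_headers, for illustration)
  Header set X-Accelerator-Vary "Accept-Encoding"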
A small percentage of incoming requests (about 0.02%) to our
reverse-accelerator farm take a very long time to complete. From the few
clues I've been able to glean, I suspect there is a problem with squid
refreshing an object while another client is in the process of
retrieving the same object.
The clues:
A wget running in a loop, retrieving the main page of our site, will
occasionally take just under 15 minutes to complete a retrieval that
normally takes 0.02 seconds.
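The loop is essentially just this (URL replaced with a placeholder):

  # fetch the main page repeatedly, timing each retrieval
  while true; do
      time wget -q -O /dev/null http://www.example.com/
      sleep 1
  done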
When I look at the access.log entry for that retrieval and work back to
the time the request was placed, I often find that some client out on
the internet had issued a request with a no-cache header, resulting in a
TCP_CLIENT_REFRESH_MISS for the main page.
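Spotting those refreshes is just a grep, e.g. (default Debian log path):

  # find forced refreshes around the slow request's start time
  grep TCP_CLIENT_REFRESH_MISS /var/log/squid/access.log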
With wget --server-response I see that the Age header of the
slow-to-retrieve page is always a low number of seconds, so the object
was refreshed just prior to the request.
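Concretely, I check the header with something like this (URL again a
placeholder; wget prints the response headers on stderr):

  wget --server-response -O /dev/null http://www.example.com/ 2>&1 | grep -i 'Age:'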
The Age plus the time to retrieve the object equals the read_timeout in
squid.conf. I changed read_timeout to 9 minutes on one server and the
slow wget retrievals started taking 8+ minutes instead of 14+.
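That is, on that server squid.conf now reads (the default is 15 minutes):

  read_timeout 9 minutes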
The object itself is transferred quickly, but the connection stays open
until some timer in squid (read_timeout, apparently) expires; only then
does squid close the connection.
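While a wget is hanging you can watch the connection linger on the squid
box with something like:

  # needs root for -p; the connection sits in ESTABLISHED under the squid process
  netstat -tnp | grep squid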
This problem did not exist on the same hardware with Solaris x86 as the OS.
Any ideas as to where I should be looking? There are a few places in the
code that are #ifdef'd _SQUID_LINUX_, but nothing there looks applicable
to this problem.
I am having no luck reproducing this on a test system.
--
Robert Borkowski