Hi,
I spent some time tracking down (and finding) problems in squid-3
when Squid starts to run out of FDs.
Problem #1: comm_accept_check_event() scheduled incorrectly.
from fdc_t::acceptOne():
eventAdd("comm_accept_check_event", comm_accept_check_event, this,
1000.0 / (double)(accept.accept.check_delay), 1, false);
There are at least two problems here. First, the division is
inverted. It should be check_delay / 1000.0. Second, check_delay
is never set for the HTTP accept socket because fdc_open() is
not called for it. With check_delay left at zero, the division
yields "infinity", and that value gets passed to eventAdd().
Also, the comm_accept_check_event is cancelled by a comm_close().
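To make the arithmetic concrete, here is a minimal sketch. It assumes
check_delay is in milliseconds and that the scheduler wants seconds.
The helper names and the 1000 ms fallback are hypothetical, not Squid code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers, not part of Squid.

// Mirrors the expression quoted above: the ratio is inverted, and an
// unset (zero) check_delay produces +infinity on IEEE 754 platforms.
double buggyDelaySeconds(int check_delay_ms) {
    return 1000.0 / static_cast<double>(check_delay_ms);
}

// Converts milliseconds to seconds. Assumption: when check_delay was
// never initialised, fall back to a default instead of dividing by zero.
double fixedDelaySeconds(int check_delay_ms, int fallback_ms = 1000) {
    assert(fallback_ms > 0);
    const int ms = check_delay_ms > 0 ? check_delay_ms : fallback_ms;
    return static_cast<double>(ms) / 1000.0;
}
```

With a 500 ms delay the quoted expression schedules the event 2 seconds
out instead of 0.5, and with an unset delay it schedules it at infinity.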
Problem #2: comm_accept_check_event() and AcceptLimiter accomplish the
same thing
Squid has two ways to make sure it doesn't accept new connections
when short on FDs. One is httpAccept(), okToAccept, and the AcceptLimiter
class. The other is acceptOne() and comm_accept_check_event().
The first checks FD resources after a new connection has been
accepted. The acceptOne() method also checks FD resources and might
try the comm_accept_check_event trick *before* calling accept(2).
I don't really see the point to having both of these competing
methods in the code.
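For illustration, a single queue-based limiter could cover both cases
without a timer. This is only a sketch in the spirit of AcceptLimiter,
not its actual implementation. The class name, the reserved-FD threshold
and the callback type are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical sketch, not Squid's AcceptLimiter.
class DeferredAccepts {
public:
    explicit DeferredAccepts(int reserved_fds) : reserved_(reserved_fds) {}

    // True when enough descriptors remain to accept another connection.
    bool okToAccept(int free_fds) const { return free_fds > reserved_; }

    // Park a listener's "resume accepting" callback until an FD frees up.
    void defer(std::function<void()> resume) {
        assert(resume);
        waiting_.push_back(std::move(resume));
    }

    // Intended to be called from the close path: resume one parked
    // listener if resources allow. Returns true if one was resumed.
    bool kick(int free_fds) {
        if (waiting_.empty() || !okToAccept(free_fds))
            return false;
        std::function<void()> resume = std::move(waiting_.front());
        waiting_.pop_front();
        resume();
        return true;
    }

    std::size_t pending() const { return waiting_.size(); }

private:
    int reserved_;
    std::deque<std::function<void()>> waiting_;
};
```

Because kick() is driven by descriptors being closed rather than by a
timer, there is no event to mis-schedule or to cancel by accident.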
Problem #3: comm_poll.cc: assert(shutting_down)
Due to problem #1, Squid will get into a state where there
is no handler for the incoming HTTP FD, and no way to get one
back. The comm_accept_check_event() will never be called, either
because it is scheduled at infinity or because comm_close()
cancels it. After some time all open file descriptors get closed
and comm_poll() will assert because there are no FDs to poll on.
The assertion is that the only time there should be no FDs to
poll on is during shutdown.
Once problem #1 is fixed, this condition becomes very unlikely,
i.e., it is unlikely that all sockets and files would get closed
before the event fires. But that only makes the assertion less
likely to trigger, not impossible.
Problem #4: reconfigure during deferred accept doesn't work
Both accept-deferring techniques assume that the incoming HTTP
FD does not change between the time deferring starts and ends.
If Squid is reconfigured, the FD will likely change. This should
be easy, but ugly, to fix.
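One possible shape for such a fix, sketched under assumptions: the
deferred state remembers a stable listener name instead of the raw FD
and resolves the FD only when accepting resumes. The registry class and
the "http:3128" key below are hypothetical, not existing Squid code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry mapping a stable listener name to its current FD.
class ListenerRegistry {
public:
    // Record (or replace) the FD currently serving this listener.
    void open(const std::string &name, int fd) {
        assert(fd >= 0);
        fds_[name] = fd;
    }

    // Forget the listener, e.g. when its socket is closed on reconfigure.
    void close(const std::string &name) { fds_.erase(name); }

    // Returns the current FD for the listener, or -1 if it is not open.
    int currentFd(const std::string &name) const {
        std::map<std::string, int>::const_iterator it = fds_.find(name);
        return it == fds_.end() ? -1 : it->second;
    }

private:
    std::map<std::string, int> fds_;
};
```

A deferred accept that stored "http:3128" rather than FD 12 would pick
up the new descriptor after a reconfigure, or see -1 if the port was
removed and drop the deferral.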
It seems to me that AcceptLimiter works okay, and that
comm_accept_check_event() is a mess. Can someone justify keeping
comm_accept_check_event()?
Duane W.
Received on Fri May 26 2006 - 13:47:27 MDT