Re: Squid-SMP problems in current trunk

From: Tsantilas Christos <chtsanti_at_users.sourceforge.net>
Date: Tue, 19 Jul 2011 13:04:52 +0300

On 07/19/2011 05:51 AM, Amos Jeffries wrote:
> On Mon, 18 Jul 2011 20:03:22 +0300, Christos Tsantilas wrote:
>> On 07/18/2011 01:04 PM, Tsantilas Christos wrote:
>>> On 07/16/2011 07:11 AM, Amos Jeffries wrote:
>>>> On 16/07/11 07:43, Tsantilas Christos wrote:
>>>>> But now I am hitting the following assertion:
>>>>> assertion failed: Connection.cc:29: "fd < 0"
>>>>> The later problem looks that it has to do with file descriptors of the
>>>>> listening sockets.
>>>>
>>>> "fd < 0" indicates something is failing to call conn->close() when
>>>> abandoning an open socket.
>>>>
>>>> NP: close() is reentrant. So components can and should always close()
>>>> when they are sure the FD/socket must no longer be used.
>>>>
>>>> I'm not very certain about SMP listening sockets, which process(es) are
>>>> safe to close() on reconfigure/shutdown? the unsafe ones must do fd=-1
>>>> to abandon the FD information explicitly before the conn object
>>>> destructs.
>>>>
>>>> What situations are you hitting "fd < 0" Christos?
>>>
>>> I am hitting this assertion on kids immediately after start.
>>> Looks that the connection looses all references on its self and deleted.
>>> The socked of the connection is a listening socket.
>>
>> When a kid starting get from the parent the filedescriptors of
>> listening sockets, and creates a Comm::Connection objects for these
>> filedescriptors.
>>
>> What needed here is to assign the created Comm::Connection objects to
>> the related http_port_list object (to increase the refcount and keep
>> the connection open)
>>
>> A way to implement the above is to use the ListeningStartedDialer
>> class implemented in client_side.cc file.
>>
>> I am attaching a patch which solves this problem and allow smp-squid
>> start and serve HTTP requests, but there are similar or related bugs
>> in icp and snmp. When I am defining icp_port and snmp_port in
>> squid.conf the smp-squid does not start.
>
> In this patch the dialer has no "conn" object to assign.

There is a Ipc::StartListeningCb::conn object. The
ListeningStartedDialer is a kid class of StartListeningCb. The conn
object exists and it is valid.
The patch is working but of course is incomplete. I post it just to
point the problem.

Currently the ListeningStartedDialer dialer is the only object we have
which maps the connection (or the FD better) to a listening port.

>
> NOTE:
> clientHttpConnectionsOpen() does "s->listenConn = new
> Comm::Connection()" before starting IPC.
> When dialling post-IPC happens
> ListeningStartedDialer::portCfg->listenConn is still a ref of that object.
>
> I'm thinking:
> If OpenListenerParams adds a member to hold the listenConn created by
> clientHttpConnectionsOpen() it can be set with the response.fd in
> Ipc::SharedListenJoined().

Just take a look inside Ipc::SharedListenJoined(), in the case of SMP
just sent to the kids the local address, flags and socket type, and
return. These are informations which can be send through IPC,pipes etc.
Unfortunately the OpenListenerParams copied with memcpy in a message
send thought ICP. How can we send a refcounted object like the
listenConn through a pipe? This will cause bugs similar to the bug 3264.
We can send only the FD.

I believe that the only we can do is to add a new parameter of type
Comm::Connection to the Ipc::JoinSharedListen function to pass the
listenConn object and add it to the related PendingOpenRequest.

> We may or may not then have to do the c=new Comm::Connection() in IPC.

OK, maybe, we can just set the FD and other informations to the existing
listenConn...

>
>
> The best point to set anything coming back anyway seems to be in
> Ipc::SharedListenJoined() where the SharedlistenResponse,

It is the best, because looks that it will solve the related problems in
snmp and icp too...
But, at this time we do not have the listenConn object inside
Ipc::SharedListenJoined() and we do not have enough information to
retrieve it.
I will try to implement what I am proposing above (passing listenConn to
JoinsSharedListen...), it looks simple.

>
>
> SharedListenResponse needs to use getInt(this->fd) or similar instead of
> putPod/getPod(*this) anyway. Or at the very least give is a sub-struct
> that can be documented as raw socket bytes and get*(&this->data_).
> memcpy re-init of a whole class with methods straight from the socket
> does not seem like a good idea.

Please see the patch in the bug 3264:
    http://bugs.squid-cache.org/show_bug.cgi?id=3264
The bug described here is different to the bugs we are discussing here
but they are related. I believe just applying the patch I post here or a
similar is enough.

>
> Amos
>
Received on Tue Jul 19 2011 - 10:05:49 MDT

This archive was generated by hypermail 2.2.0 : Tue Jul 19 2011 - 12:00:03 MDT