> On Wed, 2008-09-10 at 22:58 +0200, Henrik Nordstrom wrote:
>> On ons, 2008-09-10 at 11:33 -0600, Alex Rousskov wrote:
>>
>> > Since all notifications are asynchronous, it is possible for a read
>> > or write notification that was scheduled before comm_close was
>> > called to arrive at its destination after comm_close was called.
>> > Such a notification will arrive with the COMM_ERR_CLOSING flag even
>> > though that flag was not set at the time of the I/O (and the I/O may
>> > have been successful). This behavior may change.
>>
>> Ugh...
>>
>> This will be a bit of a nightmare.
>
> It is not really a nightmare IMO because the order of the calls is
> preserved. From the Recipient point of view, nothing strange is going
> on. The correct fix is the removal of the COMM_ERR_CLOSING API and
> related calls, which we discuss below. I did not want to make such a
> significant change now because a smaller change should work while we are
> fixing the API itself.
>
>> Is there really any callback which uses COMM_ERR_CLOSING? I remember
>> doing an audit of this the last time we discussed this, but don't quite
>> remember if I found any...
>
> You found a few that are not just a "return" statement. They are still
> there. I did not want to remove them as a part of this cleanup, but the
> COMM_ERR_CLOSING API is scheduled to be removed (as you know) and I will
> work on that removal.
>
>> What I do know is that most callbacks look for COMM_ERR_CLOSING and
>> immediately return without any action if it is set.. If you need close
>> notification you install a close handler.
>
> Exactly, but there are a few exceptions in the current code that make
> removal non-trivial.
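>
> For reference, the prevailing pattern is an early return, with the real
> cleanup delegated to a close handler. A from-memory sketch (exact
> signatures vary between versions, so treat the names as approximate):
>
>     /* typical I/O callback: COMM_ERR_CLOSING means "do nothing here" */
>     static void
>     someReadDone(int fd, char *buf, size_t len, comm_err_t flag,
>                  int xerrno, void *data)
>     {
>         if (flag == COMM_ERR_CLOSING)
>             return; /* the registered close handler owns the cleanup */
>
>         /* ... normal processing of the read result ... */
>     }
>
>     /* the cleanup lives here instead, registered once per descriptor */
>     static void
>     someConnClosed(int fd, void *data)
>     {
>         /* release whatever state was tied to fd */
>     }
>
>     /* at setup time: comm_add_close_handler(fd, someConnClosed, data); */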
>
>> > The COMM_ERR_CLOSING interface will be removed. The read, write, and
>> > accept notifications will not be scheduled after comm_close is
>> > called. New user code should register close handlers instead.
>>
>> +1 on that.
>
> The question is whether to do it now (v3.1) or later (v3.2).
No. The question is: can the current code be used in production without it?
If yes, then it's a 3.2 issue; otherwise it's blocking 3.1.
>
>> > When the COMM_ERR_CLOSING interface is removed, pending notifications
>> > (if any) will be canceled after comm_close is called. However, the
>> > cancellation may be removed later if Comm is modified to provide safe
>> > access to closing descriptors and their fragile state. New user code
>> > should continue to assume that it is safe to access Comm in read,
>> > write, and accept handlers.
>>
>> Here is a slight design topic to discuss..
>>
>> Consider
>>
>> a bunch of comm_write calls queueing data on the fd.
>>
>> comm_close to signal EOF.
>>
>> Allowing batching of these is great for network performance, and if it
>> is done that way then there needs to be notification of the
>> success/failure of the writes..
>>
>> One could also do this by splitting comm_close into comm_eof (shutdown())
>> and then comm_close after getting the success/failure notification(s),
>> but that's a bit too much I think.
>>
>> I guess the close handler can be used for this, if extended to include a
>> "closed successfully" or "closed with I/O error" indication.
>
> If we venture into "batching" of comm_write calls with comm_close on
> top, we would need a rather different API from what we use today. I
> would not try to force that idea on the current API. The batching API
> would not care about the result of each individual operation but if
> things go wrong, it may need to cancel pending operations and notify the
> requester.
>
> I think that it would be better to finalize batching APIs after the
> current Comm API is cleaned up and all its users become CommUser or
> similar objects, with a well-defined API encapsulating all comm-related
> logic. Once we have that interface, we can add or change things without
> worrying that the old code is going to break.
>
>> > The comm_close API will be split into "stop I/O" and "kill me" APIs.
>> > New user code should not use comm_close for the purpose of ending a
>> > job (and should not assume it does). In most cases, the job has more
>> > tasks to do even if the descriptor is closed. AsyncJob API provides
>> > job-ending tools.
>>
>> Not sure I get this..
>
> Take an HttpStateData, for example. When the socket descriptor is
> closing, HttpStateData may still have a lot of work to do. The old code
> (some of which remains) assumed that comm_close is pretty much the final
> word. After that, there is no valid state to care about and no work to
> continue. That's wrong. Does this example clarify the difference between
> "stop I/O on this descriptor" and "kill this job"?
>
>> > Raw socket descriptors may be replaced with unique IDs or small
>> > objects that help detect stale descriptor/socket usage bugs and
>> > encapsulate access to socket-specific information. New user code
>> > should treat descriptor integers as opaque objects.
>>
>> +1
>>
>> Also needed to port Squid to native Windows I/O. There are no
>> file descriptors in win32, only objects... The socket file descriptors
>> (and their related file descriptors) are a wrapper layer and are what
>> limits SquidNT to 2048 file descriptors...
>
> Hopefully, the above steps will make it easier. I have not refreshed my
> memory on overlapping I/Os and other Windows details, but I am certain
> the initial steps outlined in the notes will be required anyway.
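>
> As a strawman, even a tiny handle class would catch most stale-use
> bugs (a sketch, not an actual or planned Squid class):
>
>     class CommHandle {
>     public:
>         CommHandle(): slot(-1), generation(0) {}
>
>         bool valid() const { return slot >= 0; }
>
>         bool operator==(const CommHandle &h) const {
>             return slot == h.slot && generation == h.generation;
>         }
>
>     private:
>         friend class Comm;   /* only Comm maps handles to OS sockets */
>         int slot;            /* table index, not necessarily an OS fd */
>         unsigned generation; /* bumped on close; stale handles mismatch */
>     };
>
> Because the mapping to the OS object hides behind the handle, a Windows
> port could map slots straight to SOCKET handles instead of going
> through the 2048-entry fd emulation layer you describe.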
>
> Thank you,
>
> Alex.
>
>
>