Andres Kroonmaa wrote:
> Actually, I'm starting to think that async network io could have some
> benefits. Sure not like disk io, per-event, but via some sort of
> queueing mechanism. Main idea is to offload main squid thread from
> wasting time on io, leaving it free to process requests in ram. I'm of
> course thinking of SMP scaling here. My wild thought of putting poll
> on its own thread maybe isn't even that crazy an idea after all.
My grand plan involves a complete restructuring of Squid to allow for
SMP scaling, both in networking and request processing. On each CPU
there will be one thread of execution which reads and processes
requests.
> My tests show that poll() itself isn't actually a CPU hog. To reach
> 1ms overhead on my system I'd need to poll over 2600 disk FD's.
> With 1000 open files, running squid, I measured (with kernel trace)
> that typically poll returns in under 0.1ms, yet sometimes it blocked
> for several ms, seemingly randomly.
One problem is how to maintain the poll tables in userspace.
> this OS processing time is accounted to squid's system time. For this
> it appears that poll has high cpu overhead. It definitely has quite
> some overhead, I don't argue that, but I think that this overhead is
> somewhat exaggerated.
I tend to agree here, but not on the CPU usage reasoning. Kernel-level
TCP/IP networking is AFAIK driven by factors other than poll.
> Generally, I believe that we cannot rely on poll to return as soon as
> some FD is ready for io.
Sometimes we do not actually want this. We want the FD to be ready
enough for efficient I/O.
> After we resolve poll "overhead", we'll see the same effect with
> read/write calls. E.g. after I applied my patches related to DNS
> timing defaults and increasing io sizes, my cachebox system time now
> dominates in read/write calls, and only a fraction in poll.
Good. The way it should be.
> We moved disk io to threads because each io call could block.
> It doesn't make sense to call async-io for each network io event, but
> may make sense to move a bunch of io events into queue and let a
> thread to process them in one shot.
This is what poll() does, but with a slight overhead of having to pass a
huge array of wanted events, out of which only a fraction will happen
for that poll() trip.
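
To make that overhead concrete, here is a minimal sketch (not Squid code) of a single poll() trip. The names collect_ready and demo_one_ready_of_eight are hypothetical helpers invented for this illustration. The whole array of wanted events is handed to the kernel, and afterwards it has to be scanned again to find the few entries that actually fired.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: run one poll() trip over nfds descriptors and store
 * the indices of those with input, hangup or error pending in ready_idx.
 * Returns the number of indices stored, 0 on timeout, or -1 on error. */
static int collect_ready(struct pollfd *fds, nfds_t nfds, int timeout_ms,
                         size_t *ready_idx)
{
    assert(fds != NULL && ready_idx != NULL);

    int n = poll(fds, nfds, timeout_ms);   /* whole array goes to the kernel */
    if (n <= 0)
        return n;

    int found = 0;
    /* Scan the array again; stop early once all n reported entries are seen. */
    for (nfds_t i = 0; i < nfds && found < n; i++) {
        if (fds[i].revents & (POLLIN | POLLHUP | POLLERR))
            ready_idx[found++] = (size_t)i;
    }
    return found;
}

/* Demo: open 8 pipes, make only pipe 5 readable, and poll all of them.
 * Returns the index of the single ready descriptor, or -1 on any failure. */
static int demo_one_ready_of_eight(void)
{
    enum { N = 8, HOT = 5 };
    int pipes[N][2];
    struct pollfd fds[N];
    size_t ready[N];
    int result = -1;
    int opened = 0;

    for (; opened < N; opened++) {
        if (pipe(pipes[opened]) != 0)
            break;
        fds[opened].fd = pipes[opened][0];   /* watch the read end */
        fds[opened].events = POLLIN;
        fds[opened].revents = 0;
    }

    if (opened == N && write(pipes[HOT][1], "x", 1) == 1 &&
        collect_ready(fds, N, 0, ready) == 1)
        result = (int)ready[0];

    for (int i = 0; i < opened; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    return result;
}
```

Calling demo_one_ready_of_eight() passes eight entries to poll() to learn about one event. With thousands of mostly idle descriptors the ratio gets correspondingly worse, which is the overhead described above.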
> I imagine that if we have a way to parse ready FD's, buildup a list of
> io ops needed and then let a thread to fulfill them, and in callbacks
> let main thread to handle the data processing, we could get more
> efficient squid on SMP.
In my view there should not be a single "main" thread of execution like
today. There are mainly two theoretical issues to solve to be able to
scale Squid on SMP:
a) The in-memory store index maintenance. Each thread of execution needs
to be aware where objects are and where to store objects.
b) Logging.
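
As a rough sketch of issue (a), assuming the simplest possible answer of one shared index behind one lock: several threads of execution record and look up object locations in the same table. Everything here (store_index, index_add, index_find, run_workers, the fixed sizes) is hypothetical and only illustrates the problem. It is not Squid's store index, and a single global lock is exactly the kind of bottleneck a real design would try to avoid.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define INDEX_SLOTS 64   /* hypothetical fixed capacity */
#define KEY_LEN     32
#define MAX_WORKERS 8
#define PER_WORKER  8

/* Hypothetical shared index: object key -> location on disk. */
struct store_index {
    pthread_mutex_t lock;
    char   keys[INDEX_SLOTS][KEY_LEN];
    int    disk_slot[INDEX_SLOTS];
    size_t used;
};

static void index_init(struct store_index *ix)
{
    pthread_mutex_init(&ix->lock, NULL);
    ix->used = 0;
}

/* Record where an object is stored. Returns 0, or -1 if the index is full. */
static int index_add(struct store_index *ix, const char *key, int slot)
{
    int rc = -1;
    pthread_mutex_lock(&ix->lock);
    if (ix->used < INDEX_SLOTS) {
        strncpy(ix->keys[ix->used], key, KEY_LEN - 1);
        ix->keys[ix->used][KEY_LEN - 1] = '\0';
        ix->disk_slot[ix->used] = slot;
        ix->used++;
        rc = 0;
    }
    pthread_mutex_unlock(&ix->lock);
    return rc;
}

/* Look up an object. Returns its disk slot, or -1 if it is not indexed. */
static int index_find(struct store_index *ix, const char *key)
{
    int slot = -1;
    pthread_mutex_lock(&ix->lock);
    for (size_t i = 0; i < ix->used; i++) {
        if (strcmp(ix->keys[i], key) == 0) {
            slot = ix->disk_slot[i];
            break;
        }
    }
    pthread_mutex_unlock(&ix->lock);
    return slot;
}

struct worker_arg { struct store_index *ix; int id; };

/* Each thread of execution stores PER_WORKER objects of its own. */
static void *worker(void *p)
{
    struct worker_arg *a = p;
    char key[KEY_LEN];
    for (int i = 0; i < PER_WORKER; i++) {
        snprintf(key, sizeof key, "obj-%d-%d", a->id, i);
        index_add(a->ix, key, a->id * 100 + i);
    }
    return NULL;
}

/* Start nthreads workers, wait for them, return the number of index entries. */
static size_t run_workers(struct store_index *ix, int nthreads)
{
    pthread_t tid[MAX_WORKERS];
    struct worker_arg args[MAX_WORKERS];
    int started = 0;

    assert(nthreads > 0 && nthreads <= MAX_WORKERS);
    for (; started < nthreads; started++) {
        args[started].ix = ix;
        args[started].id = started;
        if (pthread_create(&tid[started], NULL, worker, &args[started]) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return ix->used;   /* safe to read unlocked: all workers have been joined */
}
```

After index_init() and run_workers(&ix, 4), any thread can call index_find() and see objects stored by the others. Every add and lookup serializes on the one mutex, though, so this shows why the index is an issue to solve rather than how to solve it.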
Then there are of course a number of practical issues in coding as well.
/Henrik
Received on Tue Sep 19 2000 - 04:05:57 MDT