> My belief is that if squid is being split, then it should be
> split by function.
>
> * A backend store manager
> * The backend DNS daemons
> * Frontend network handlers
> * Backend handlers for things that are not well tested/integrated (like
> ftpget in the current release)
> * A supervisor, that checks that everything is OK, and restarts things
> that crash.
        I completely agree with a backend store manager.  This needs to be
kept separate and will go a long way towards improving memory usage patterns
and preventing network stuff from blocking.  All I/O in this beastie needs
to be threaded on a call-by-call basis.  See my code in 1.2alpha for how
this is done easily.  The store backend should be the supervisor ensuring
that all is running well.  If one of its children dies, it is simply restarted.
Because its functions are well defined, it is expected that it will be less
prone to dying itself.
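        To make the supervisor idea concrete, here is a minimal sketch of the
fork/wait/restart loop I have in mind.  The program names and the child table
are invented for illustration, not actual squid code:

/* Minimal supervisor sketch: fork each helper, restart whatever dies.
 * The child table and spawn() helper are hypothetical illustrations. */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NCHILD 2

static const char *progs[NCHILD] = { "./tcp-frontend", "./dnsserver" };
static pid_t pids[NCHILD];

static pid_t spawn(const char *prog)
{
    pid_t pid = fork();
    if (pid == 0) {
        execl(prog, prog, (char *) NULL);
        _exit(1);               /* exec failed */
    }
    return pid;
}

int main(void)
{
    int i, status;
    pid_t dead;

    for (i = 0; i < NCHILD; i++)
        pids[i] = spawn(progs[i]);

    /* The store backend's real work would go on elsewhere; here we just
     * reap dead children and start replacements. */
    while ((dead = wait(&status)) > 0) {
        for (i = 0; i < NCHILD; i++) {
            if (pids[i] == dead) {
                fprintf(stderr, "%s died, restarting\n", progs[i]);
                pids[i] = spawn(progs[i]);
            }
        }
    }
    return 0;
}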
        The ONLY other processes you want are a farm of TCP protocol based
handlers which talk to the store backend.  This will allow a single host to
multi-home with multiple TCP servers (or use different ports) to get around
FD usage limitations.  All DNS lookups should be multi-threaded to prevent
extra procs hanging around, and will save HEAPS on RAM.  I'll be implementing
asynch DNS lookups within the next 4 weeks.  Wait for it...
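        For the DNS side, here is a rough sketch of the per-lookup threading,
assuming POSIX threads; getaddrinfo() and the callback type are stand-ins for
illustration, not necessarily what I'll ship:

/* Sketch: hand each DNS lookup to a detached worker thread so the event
 * loop never blocks on the resolver. */
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <netdb.h>

typedef void (*dns_callback)(const char *host, struct addrinfo *res);

struct dns_job {
    char host[256];
    dns_callback cb;
};

static void *dns_worker(void *arg)
{
    struct dns_job *job = arg;
    struct addrinfo *res = NULL;

    if (getaddrinfo(job->host, NULL, NULL, &res) != 0)
        res = NULL;
    job->cb(job->host, res);    /* hand back the answer; callee frees it */
    free(job);
    return NULL;
}

int dns_lookup_async(const char *host, dns_callback cb)
{
    pthread_t tid;
    struct dns_job *job = malloc(sizeof(*job));

    if (job == NULL)
        return -1;
    strncpy(job->host, host, sizeof(job->host) - 1);
    job->host[sizeof(job->host) - 1] = '\0';
    job->cb = cb;

    if (pthread_create(&tid, NULL, dns_worker, job) != 0) {
        free(job);
        return -1;
    }
    pthread_detach(tid);
    return 0;
}

In practice the worker would push the answer back to the main loop over a
pipe rather than call into it directly, so the event-driven side stays
single-threaded.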
        Incidentally, I've found and fixed a fairly major bug in squid 1.1
which affects FD usage.  We now regularly see 1/5 the FD usage we saw before.
I'll send the patches off to Duane next week.
> It is probably best to let each frontend read/write the disk, and only
> use the backend store manager to find out where to read/write.
        Agreed for reading.  Writing could be a little trickier to co-ordinate.
You don't want to risk multiple writers of the same URL.  It may be an idea to
have the front-end write to temporary space and have the backend be
responsible for moving it to the correct location before confirming its
in-store copy.  In fact this seems like a good idea/protocol.
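        Roughly what I mean, assuming the front-ends and backend share a
filesystem (the paths and function names are made up to show the shape):

/* Sketch: the front-end writes the fetched object into a per-frontend temp
 * file; the backend later rename()s it into its final store location and
 * only then confirms the in-store copy.  Paths are illustrative only. */
#include <stdio.h>
#include <stdlib.h>

/* Front-end side: create a unique temp file under the cache spool. */
int frontend_open_temp(char *path, size_t pathlen)
{
    snprintf(path, pathlen, "/cache/tmp/obj.XXXXXX");
    return mkstemp(path);       /* fd the front-end writes the object into */
}

/* Backend side: told "here's a file where I just wrote XXX", move it
 * atomically into the store before marking the URL valid. */
int backend_install(const char *tmp_path, const char *store_path)
{
    if (rename(tmp_path, store_path) != 0) {
        perror("rename");
        return -1;
    }
    return 0;   /* now safe to confirm the in-store copy */
}

Since rename() is atomic on the same filesystem, readers never see a
half-written object and the last writer to be installed simply wins.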
> ICP is probably best placed into the backend store manager for latency
> reasons.
        Agreed.
        REVISED list below for front/back ops.
> Example of operations, frontend <--> store manager
>
> * Read-lock XXX (location of XXX returned)?
> * Done reading XXX (read-unlock request)
  * Here's a file where I just wrote XXX - Do with it as you please
        Much, much simpler.  Solves multi-writer too.  The (here's XXX) should
probably be timestamped from when the request was initiated so the backend can
decide what's old data in the case of multiple writers.
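        To make that concrete, a hypothetical message layout with the
timestamp folded in (the names and fields are mine, just for illustration):

/* Hypothetical frontend <--> store-manager message.  request_started lets
 * the backend discard stale data when several front-ends wrote the same
 * URL. */
#include <time.h>

enum store_op {
    STORE_READ_LOCK,    /* "Read-lock XXX" - reply gives object location */
    STORE_READ_DONE,    /* "Done reading XXX" - drops the read lock */
    STORE_WRITE_DONE    /* "Here's a file where I just wrote XXX" */
};

struct store_msg {
    enum store_op op;
    time_t request_started;     /* when the client request was initiated */
    char url[4096];             /* XXX, the object key */
    char tmp_path[1024];        /* for STORE_WRITE_DONE: temp file to install */
};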
> The only thing I see that will cause a great deal of headache is the
> "multiple-readers-while-fetching", where other clients than the first
> requestor might jump on a started request, but if the store manager
> keeps track of which frontend is handling the request it is only a
> matter of internal proxying (the additional frontends do a proxy-only
> request to the frontend fetching XXX).
        My solution restricts piggy-backing to within the TCP front-ends alone.
The backend won't know about it (and probably shouldn't).  If the TCP
front-ends are serving 1000's of connections each, a little bit of
double-fetch isn't going to hurt.  It's a 99.9% solution, and perhaps the
easiest and thus most efficient to implement.  What's the point of going to
all the extra complexity to save perhaps .01% of your total traffic?  We want
a clean solution that's portable, not a porting/management nightmare.
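        By piggy-backing within a front-end I mean roughly this (structures
invented to show the shape, not a real implementation):

/* Sketch: each TCP front-end keeps its own table of in-flight fetches.
 * A second client asking for the same URL is appended to the client list
 * of the existing fetch instead of starting a new one. */
#include <string.h>
#include <stdlib.h>

struct waiting_client {
    int fd;                             /* client socket to feed data to */
    struct waiting_client *next;
};

struct pending_fetch {
    char url[4096];
    struct waiting_client *clients;
    struct pending_fetch *next;
};

static struct pending_fetch *pending = NULL;

/* Return the in-flight fetch for this URL, or NULL if we must start one. */
struct pending_fetch *pending_lookup(const char *url)
{
    struct pending_fetch *p;
    for (p = pending; p != NULL; p = p->next)
        if (strcmp(p->url, url) == 0)
            return p;
    return NULL;
}

void pending_add_client(struct pending_fetch *p, int client_fd)
{
    struct waiting_client *w = malloc(sizeof(*w));
    if (w == NULL)
        return;
    w->fd = client_fd;
    w->next = p->clients;
    p->clients = w;
}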
> Shared memory and semaphores are best left alone if we are thinking of
> being portable... and if using a central manager then they can be avoided
> without too much performance degradation.
        Why are shared mem/semaphores needed?  If we restrict locking to within
each proc, this whole mess is avoided.
> This design has the pleasant side effect of getting rid of the max file
> descriptors per process problem. Each frontend has its own file
> descriptors (both net and disk), and it is only needed to start another
> frontend to have a new set... (assuming that the kernel has large
> enough tables in total...)
        Again, wait for the bug fixes.  Will help heaps.
        Excellent work and ideas Henrik/Oskar.
        Cheers,
                Stew.
> ---
> Henrik Nordström
>
> Oskar Pearson wrote:
> > Splitting squid into multiple processes presents various problems:
> > o       You would need an "expired" (expire-dee) that will remove objects
> >         that are expired, and the individual processes will have to handle
> >         the object suddenly not existing without freaking out.
> >
> > o       You would have to do something like shared memory to know what
> >         objects are in the cache... otherwise each connection will use
> >         an expensive connection to a central process to find out what
> >         objects are/aren't in the cache.
> >
> > o       There would have to be some way of saying "I am locking the
> >         in-memory store to delete/add an object".  You could do this with
> >         semaphores, I suppose... (anyone have experience in this - it's
> >         been described by someone I know as "not fun")
> >
> >         Oskar
>