Some Squid 1.NOVM.18 oddities (running on a Sun SPARCserver 20/151
with Solaris 2.5, compiled with Sun's cc). These are partly just comments
on various points I've noticed, but they include some apparent bugs and
places where there's room for improvement.
(1) A problem with (apparently) a truncated cached file at a parent server 
brought to light an oddity with cachemgr.cgi. I tried using the refresh URL
facility to force reloading of http://www.macfixit.com/ and found that
roughly 80% of the time I got "Premature end of script headers" from the
Apache server running cachemgr.cgi, and the other 20% of the time
cachemgr.cgi just went into a CPU-bound loop with no system calls (for as
long as I watched it with truss, the Solaris 2 system call tracer). There
was nothing in the cache server logs, and there were no obviously related
entries in the file descriptor display shown by an independent cachemgr
request. It works for other
URLs, though, and no problems were seen refreshing the same URL using the 
Squid "client" program. 
It would be useful to know whether cachemgr behaves like that for other
people, or whether it only affects me for some reason. Beyond that, with no
indication of what's going wrong, I mention it only "for the record"; I
don't expect a solution unless someone happens to spot an explanation.
[I was also surprised that Squid was caching the problem document, since it
lacked last modification and expiry timestamps, and didn't specify content
length, but I haven't looked more closely yet to decide whether I think
there's anything genuinely odd there.]
(2) Prompted by recent discussion reminding me that Squid ACLs can be 
defined in files, I had a look at whether that would be a good way to 
define potentially long lists of hosts for which requests should be routed 
direct (not via any parents) outside the main squid.conf (though it would
still need a HUP to get any changes into service). Several points arose from
that investigation:
(a) tying in with a recent query about dynamic ACLs, would it be possible 
(and reasonable) for Squid to record when any configuration files were last 
modified and to check for changes periodically (configurable, disable it if 
paranoid it might load a partially-edited file one day...), doing a 
reconfigure automatically if changes were detected? The tricky bit, I 
suppose, is that it would have to track any subsidiary files to which 
squid.conf had led it, not just check squid.conf.
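
The check described in (a) could look something like the following. This is
only a sketch under my own assumptions, not anything taken from the Squid
source: the struct and function names are hypothetical, and it assumes the
server keeps a list of every file it read during configuration (squid.conf
plus any acl or local_domain files) and calls the check from some periodic
event.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record for one watched file: squid.conf itself or any
 * subsidiary file it names. */
struct watched_file {
    const char *path;
    time_t mtime;   /* modification time noted at the last (re)configure */
};

/* Note the file's current modification time.
 * Returns 0 on success, -1 if the file cannot be stat()ed. */
int watch_init(struct watched_file *w, const char *path)
{
    struct stat sb;

    assert(w != NULL && path != NULL);
    w->path = path;
    w->mtime = (time_t) 0;
    if (stat(path, &sb) != 0)
        return -1;
    w->mtime = sb.st_mtime;
    return 0;
}

/* Return 1 if any watched file now has a different modification time,
 * else 0. Files that cannot be stat()ed are skipped rather than treated
 * as changed, so a file that is briefly missing mid-edit does not
 * trigger a reload. */
int watch_any_changed(const struct watched_file *files, size_t n)
{
    struct stat sb;
    size_t i;

    assert(files != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (stat(files[i].path, &sb) != 0)
            continue;
        if (sb.st_mtime != files[i].mtime)
            return 1;
    }
    return 0;
}
```

If watch_any_changed() returned 1, the server would do whatever it does on
a HUP and then call watch_init() again for each file it read. This does
nothing about the partially-edited-file worry, which would need something
extra, such as waiting until the file has been unchanged for a while.
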
(b) Enabling detailed logging showed, unsurprisingly, that with a 
substantial number of parents configured and a lot of hosts listed in the 
ACL, Squid had to check through the whole list for each parent (for which 
I'd used "cache_host_acl the.host.name ! always-direct.acl"), for every
request. 
That seemed rather inefficient, so I looked around for a better option - I
was looking for an alternative to listing the relevant domains individually
with local_domain directives in squid.conf, to avoid having to edit
squid.conf just to update the list of special-case domains. 
(c) I'd noticed when running squid under truss (the Solaris 2 system call 
tracer) that it tried opening non-existent files with names corresponding to 
some DNS host or domain names appearing in the configuration file. 
Investigation showed that was for local_domain entries, though apparently 
not documented (in particular, not mentioned in the comments in the sample 
squid.conf). In contrast to acls, where filenames must be enclosed in 
quotation marks, all names mentioned in local_domain definitions are 
stat()ed and if found, scanned for hostnames. 
That seems undesirable (doubly so since it is undocumented), for several 
reasons, though the ability to use a file when identified as such (by 
quotation marks, as in an acl) would certainly be useful.
One problem is that it's not too unlikely that files may occasionally be 
created with names corresponding to hosts/domains (e.g. for a log file
extract relating to the particular system...). Having squid process such
files unexpectedly does not seem sensible.
In addition, there is a bug in the way that files named in local_domain 
directives are handled. While the directive will accept multiple names, if a
file is found and is not the last item in the list, the remaining names are
ignored. That is a consequence of the way parseLocalDomain() uses strtok()
(relying on it to keep track of its position in the configuration file line)
but then calls parseLocalDomainFile(), which uses strtok() independently to
process the file contents, losing track of the position in the original
input line.
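
To illustrate the strtok() problem and one possible fix, here is a small
sketch. It is not the Squid code: the function names are hypothetical
stand-ins for the two parsers, and the "file" is passed in as a string
rather than read from disk. It uses the POSIX strtok_r(), which keeps its
position in a caller-supplied pointer, so the inner scan cannot disturb the
outer one.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WS " \t\r\n"

/* Stand-in for the file parser: counts whitespace-separated names in
 * `text`, keeping its position in a local save pointer.
 * `text` is modified in place. */
static int count_names_in_file_text(char *text)
{
    char *save = NULL;
    char *tok;
    int n = 0;

    for (tok = strtok_r(text, WS, &save); tok != NULL;
         tok = strtok_r(NULL, WS, &save))
        n++;
    return n;
}

/* Stand-in for the directive parser: counts the names supplied by one
 * configuration line. A token equal to `file_name` is treated as a file
 * whose contents are `file_text`; any other token counts as one name.
 * `line` and `file_text` are modified in place. */
int count_local_domains(char *line, const char *file_name, char *file_text)
{
    char *save = NULL;
    char *tok;
    int n = 0;

    assert(line != NULL && file_name != NULL && file_text != NULL);
    for (tok = strtok_r(line, WS, &save); tok != NULL;
         tok = strtok_r(NULL, WS, &save)) {
        if (strcmp(tok, file_name) == 0)
            n += count_names_in_file_text(file_text);
        else
            n++;
    }
    return n;
}
```

With plain strtok() in both functions, the inner loop would leave the shared
position at the end of the file text, so the outer loop would stop as soon
as the file had been processed. That is the behaviour described above.
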
(d) There are, however, some features of the local_domain file handling 
which are better than the acl file handling ... lines starting with "#" or 
which are completely empty are explicitly ignored when reading a 
local_domain file, and whitespace-only lines will be ignored when parsed 
using strtok(). In contrast, if acl files contain empty or comment lines, 
they end up in the acl, as shown by detailed logging where the target 
hostname of a request is compared repeatedly to "#" or the null string, if 
the acl file contained any comments or empty lines (duplicates not ignored)!
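
For comparison, the kind of test that would keep comments and empty lines
out of an acl might be as simple as the following. Again this is a sketch
with a hypothetical function name, not the existing Squid code; it also
treats whitespace-only lines and indented comments as ignorable, which is an
assumption on my part about what would be wanted.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 1 if a line read from an acl file should be skipped: it is
 * empty, contains only whitespace, or its first non-whitespace
 * character is '#'. Return 0 if it holds something to add to the acl. */
int acl_line_is_ignorable(const char *line)
{
    assert(line != NULL);
    while (*line != '\0' && isspace((unsigned char) *line))
        line++;
    return *line == '\0' || *line == '#';
}
```

The file-reading loop would call this on each line and add the line's
contents to the acl only when it returns 0.
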
So, in summary: 
 * if you want to specify that a lot of hosts or domains should have 
   requests routed direct, it looks as though local_domain should be 
   substantially more efficient, bypassing all parents with one test instead
   of checking every entry in an acl separately for every parent. Not
   really surprising, but the documented availability of acl files nearly
   pushed me towards using an acl file as I wanted the names out of the
   main squid.conf.
 * names mentioned in local_domain definitions will be checked to see if
   they exist as files (though that is not documented), with their contents
   interpreted as domains to exclude. Unlike acls, the names do not have
   to be enclosed in quotation marks to be interpreted that way.
 * if a file is specified with local_domain it must be the last or only
   item, as any subsequent items in the line are ignored.
 * local_domain files can safely include comment and empty lines, with
   multiple entries on each line, whereas only the first token of each line
   in an acl file is added to the acl, including "#" from comments
   and null strings for empty lines. Harmless (?) but adds to the amount
   of checking to be done for any request to which the acl applies.
I'd suggest that (if not already fixed in the 1.2 beta - I've not had time 
to look at it) the best way to tidy up these inconsistencies would be to:

 * require quotation marks around filenames in local_domain, and fix the
   use of strtok() so items after filenames are not ignored; non-quoted
   items should be used only as host/domain names. Document the (useful!)
   ability to use files for domain names.
 * fix the handling of acl files to match the handling of local_domain,
   ignoring comment and empty lines and possibly allowing multiple names
   on each non-comment line.
Plus there's the separate point about whether it would be feasible for Squid 
to detect changes to configuration files and optionally reconfigure itself 
automatically when it notices such a change. 
Since the use of files with local_domain is not currently documented, I 
suppose the question "is use of files with local_domain supported, or 
historical and obsolete, liable to be removed in future?" needs to be asked, 
as well.
                                John Line
-- 
University of Cambridge WWW manager account (usually John Line)
Send general WWW-related enquiries to webmaster@ucs.cam.ac.uk

Received on Wed Dec 17 1997 - 15:01:27 MST