Re: [squid-users] latency issues squid2.7 WCCP from Ryan Goddard on 2008-09-25 (squid-users)

From: Ryan Goddard <rgoddard_at_machlink.com>
Date: Thu, 25 Sep 2008 09:39:34 -0500

Thanks for the response, Adrian.
Is a recompile required to switch to the internal DNS code?
I've disabled ECN, pmtu_disc and mtu_probing.
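
For anyone wanting to confirm the same settings on their own box, here is a minimal Python sketch. It assumes the three names above correspond to the Linux sysctl files tcp_ecn, ip_no_pmtu_disc and tcp_mtu_probing under /proc/sys/net/ipv4; that mapping and the helper name check_sysctls are my assumptions, not something stated in this thread.

```python
import os

# Assumed mapping from the names in the message to Linux sysctl files
# under /proc/sys/net/ipv4; each wanted value means "feature disabled".
WANTED = {
    "tcp_ecn": "0",          # 0 = ECN off
    "ip_no_pmtu_disc": "1",  # 1 = path MTU discovery off
    "tcp_mtu_probing": "0",  # 0 = MTU probing off
}


def check_sysctls(wanted=WANTED, root="/proc/sys/net/ipv4"):
    """Return {name: (current, wanted)} for settings that differ or are missing."""
    mismatches = {}
    for name, want in wanted.items():
        path = os.path.join(root, name)
        try:
            with open(path) as fh:
                current = fh.read().strip()
        except OSError:
            current = None  # file missing or unreadable on this kernel
        if current != want:
            mismatches[name] = (current, want)
    return mismatches


if __name__ == "__main__":
    for name, (current, want) in sorted(check_sysctls().items()):
        print("%s is %r, expected %r" % (name, current, want))
```

No output means all three files hold the expected value. The root argument exists only so the check can be pointed at a different directory.
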
cache_dir is as follows:
(recommended by Henrik)
> cache_dir aufs /squid0 125000 128 256
> cache_dir aufs /squid1 125000 128 256
> cache_dir aufs /squid2 125000 128 256
> cache_dir aufs /squid3 125000 128 256
> cache_dir aufs /squid4 125000 128 256
> cache_dir aufs /squid5 125000 128 256
> cache_dir aufs /squid6 125000 128 256
> cache_dir aufs /squid7 125000 128 256
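
As a rough cross-check of the sizes above, here is a minimal Python sketch. It assumes the usual Squid 2.x reading of the cache_dir arguments (size in MB, then first-level and second-level directory counts); the helper name summarize_cache_dirs is made up for illustration.

```python
def summarize_cache_dirs(config_text):
    """Sum the configured size of every cache_dir line.

    Assumes the form: cache_dir <type> <path> <size MB> <L1> <L2>.
    Returns (number of cache_dirs, total MB, total second-level directories).
    """
    count = 0
    total_mb = 0
    total_subdirs = 0
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] != "cache_dir":
            continue  # skip blank lines, comments and other directives
        size_mb, l1, l2 = (int(x) for x in fields[3:6])
        count += 1
        total_mb += size_mb
        total_subdirs += l1 * l2
    return count, total_mb, total_subdirs


# The eight lines from this message, rebuilt without the quote markers.
CONFIG = "\n".join(
    "cache_dir aufs /squid%d 125000 128 256" % i for i in range(8)
)

count, total_mb, total_subdirs = summarize_cache_dirs(CONFIG)
print("%d cache_dirs, %d MB configured, %d second-level directories"
      % (count, total_mb, total_subdirs))
```

That comes to 1,000,000 MB configured across the eight partitions. The "Storage Swap size" of 952545580 KB in the quoted report is about 930,000 MB at 1024 KB per MB, so, if I read it right, the store is roughly 93% of the configured total.
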

No peak data is available; here is some pre-peak data:
Cache Manager menu
5-MINUTE AVERAGE
sample_start_time = 1222199580.85434 (Tue, 23 Sep 2008 19:53:00 GMT)
sample_end_time = 1222199905.507274 (Tue, 23 Sep 2008 19:58:25 GMT)
client_http.requests = 268.239526/sec
client_http.hits = 111.741117/sec
client_http.errors = 0.000000/sec
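
To put those per-second rates in more familiar terms, a small Python sketch follows. The function name hit_ratio is only illustrative; the two input values are copied from the 5-minute average above.

```python
def hit_ratio(hits_per_sec, requests_per_sec):
    """Return hits as a fraction of requests, or 0.0 when there were no requests."""
    if requests_per_sec <= 0:
        return 0.0
    return hits_per_sec / requests_per_sec


# Values copied from the 5-minute average above.
requests = 268.239526  # client_http.requests, per second
hits = 111.741117      # client_http.hits, per second

print("request hit ratio: %.1f%%" % (100 * hit_ratio(hits, requests)))
print("requests per minute: %.0f" % (requests * 60))
```

That works out to a hit ratio near 42% and roughly 16,000 requests per minute, in the same range as the 5min request hit ratio in the quoted cachemgr report.
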
iostat shows lots of idle time. I'm unclear what you mean by
"profiling"?
Also, I have not tried running without any cache; can you explain
how this is done?

I appreciate the assistance.
-Ryan

Adrian Chadd wrote:
> Firstly, you should use the internal DNS code instead of the external
> DNS helpers.
>
> Secondly, I'd do a little debugging to see if it's network related -
> make sure you've disabled PMTU for example, as WCCP doesn't redirect
> the ICMP needed. Other things like Window scaling negotiation and such
> may contribute.
>
> From a server side of things, what cache_dir config are you using?
> What's your average/peak request rate? What about disk IO? Have you
> done any profiling? Have you tried running the proxy without any disk
> cache to see if the problem goes away?
>
> ~1 terabyte of cache is quite large; I don't think any developers have
> a terabyte of storage in a box this size in a testing environment.
>
> 2008/9/24 Ryan Goddard <rgoddard_at_machlink.com>:
>> Squid 2.7.STABLE1-20080528 on Debian Linux 2.6.19.7
>> running on quad dual-core 2.6 GHz Opterons with 32 GB RAM; 8x140GB disk
>> partitions
>> using WCCP L2 redirects transparently from a Cisco 4948 GigE switch
>>
>> Server has one GigE NIC for the incoming redirects and two GigE NICs for
>> outbound http requests.
>> Using IPTables to port forward HTTP to Squid; no ICP, auth, etc.; strictly a
>> web cache using heap/LFUDA replacement
>> and 16GB memory allocated with mem pools on, no limit.
>>
>> Used in an ISP environment, accommodating approx. 8k predominately cable
>> modem customers during peak.
>>
>> Issue we're experiencing is some web pages taking in excess of 20 seconds to
>> load, marked latency for customers
>> running web-based speed tests, etc.
>> Cache.log and Access.log aren't indicating any errors or timeouts; system
>> operates 96 DNS instances and 32k file descriptors
>> (neither has gotten maxed yet).
>> General Runtime Info from Cachemgr taken during pre-peak usage:
>> Start Time: Tue, 23 Sep 2008 18:07:37 GMT
>> Current Time: Tue, 23 Sep 2008 21:00:49 GMT
>>
>> Connection information for squid:
>> Number of clients accessing cache: 3382
>> Number of HTTP requests received: 2331742
>> Number of ICP messages received: 0
>> Number of ICP messages sent: 0
>> Number of queued ICP replies: 0
>> Request failure ratio: 0.00
>> Average HTTP requests per minute since start: 13463.4
>> Average ICP messages per minute since start: 0.0
>> Select loop called: 11255153 times, 0.923 ms avg
>> Cache information for squid:
>> Request Hit Ratios: 5min: 42.6%, 60min: 40.0%
>> Byte Hit Ratios: 5min: 21.2%, 60min: 18.6%
>> Request Memory Hit Ratios: 5min: 18.3%, 60min: 17.2%
>> Request Disk Hit Ratios: 5min: 33.6%, 60min: 33.3%
>> Storage Swap size: 952545580 KB
>> Storage Mem size: 8237648 KB
>> Mean Object Size: 40.43 KB
>> Requests given to unlinkd: 0
>> Median Service Times (seconds) 5 min 60 min:
>> HTTP Requests (All): 0.19742 0.12106
>> Cache Misses: 0.27332 0.17711
>> Cache Hits: 0.08265 0.03622
>> Near Hits: 0.27332 0.16775
>> Not-Modified Replies: 0.02317 0.00865
>> DNS Lookups: 0.09535 0.04854
>> ICP Queries: 0.00000 0.00000
>> Resource usage for squid:
>> UP Time: 10391.501 seconds
>> CPU Time: 4708.150 seconds
>> CPU Usage: 45.31%
>> CPU Usage, 5 minute avg: 33.29%
>> CPU Usage, 60 minute avg: 33.36%
>> Process Data Segment Size via sbrk(): 1041332 KB
>> Maximum Resident Size: 0 KB
>> Page faults with physical i/o: 4
>> Memory usage for squid via mallinfo():
>> Total space in arena: 373684 KB
>> Ordinary blocks: 372642 KB 809 blks
>> Small blocks: 0 KB 0 blks
>> Holding blocks: 216088 KB 21 blks
>> Free Small blocks: 0 KB
>> Free Ordinary blocks: 1041 KB
>> Total in use: 588730 KB 100%
>> Total free: 1041 KB 0%
>> Total size: 589772 KB
>> Memory accounted for:
>> Total accounted: 11355185 KB
>> memPoolAlloc calls: 439418241
>> memPoolFree calls: 378603777
>> File descriptor usage for squid:
>> Maximum number of file descriptors: 32000
>> Largest file desc currently in use: 9171
>> Number of file desc currently in use: 8112
>> Files queued for open: 2
>> Available number of file descriptors: 23886
>> Reserved number of file descriptors: 100
>> Store Disk files open: 175
>> IO loop method: epoll
>> Internal Data Structures:
>> 23570637 StoreEntries
>> 532260 StoreEntries with MemObjects
>> 531496 Hot Object Cache Items
>> 23561001 on-disk objects
>>
>> Generated Tue, 23 Sep 2008 21:00:47 GMT, by
>> cachemgr.cgi/2.7.STABLE1-20080528_at_proxy.machlink.com
>>
>>
>> TCPDUMP shows packets traversing all interfaces as expected; bandwidth to
>> both upstream providers isn't being maxed
>> and when Squid is shut down, http traffic loads much faster and without any
>> noticeable delay.
>>
>> Where/what else can I look at for the cause of the latency? It becomes
>> significantly worse during peak use - but as
>> we're not being choked on bandwidth and things greatly improve when I shut
>> down squid that narrows it to something
>> on the server. Is the amount of activity overloading a single squid
>> process? I'm not seeing any I/O errors in logs and haven't
>> found any evidence the kernel is under distress.
>> Any pointers are greatly appreciated.
>> thanks
>> -Ryan
>>
>
Received on Thu Sep 25 2008 - 14:39:36 MDT
