Hello All,
Resending this, since the attachments seem to have caused a problem.
I am seeing a weird problem with 2.6-stable3 when testing with
polymix-4. During the first phase of testing the cache performs very
well, but during the second phase everything seems to break apart
drastically. I have attached the results. I can provide further data
if needed.
setup:
--------
* polymix-4
* 1000 req/s
* 2 x 6-hour phases.
* 2 clt/srv pairs (regular ... Intel Xeon 2.8GHz with 1GB RAM).
version:
Squid Cache: Version 2.6.STABLE3
configure options: '--prefix=/usr/squid' '--exec-prefix=/usr/squid'
'--sysconfdir=/usr/squid/etc' '--enable-snmp'
'--enable-err-languages=English' '--enable-linux-netfilter'
'--enable-dlmalloc' '--enable-async-io=24'
'--enable-storeio=ufs,aufs,null' '--enable-linux-tproxy'
'--enable-gnuregex' '--enable-internal-dns' '--enable-epoll'
'--with-maxfd=32768'
cache server:
* 2x Dual Core AMD Opteron(tm) Processor 270 HE
* 16GB RAM
* 1x SATA 45 GB drive.
A few observations:
---------------------------
* It looks like the cache breaks completely after the idle phase.
* The CPU utilization is 100% (even with epoll); during the first
phase it seems to be about 70%.
* The cache.log doesn't give any indication of what is happening,
except for messages like these:
2006/08/25 15:32:45| squidaio_queue_request: WARNING - Queue congestion.
But these messages showed up during first phase as well and also for
2.5-S9 testing.
* 2.5 Stable-9 performs poorly at 1000 req/s (mean response time for
all [hit+miss] requests = 4 sec; normally it's about 1.5 sec), but it
doesn't break the way 2.6-S3 did.
* I have run the same test a few times, and every time the problem
starts after the idle phase. In fact, I reduced the first phase to
3 hrs and the same thing still happens.
* vmstat shows very little free memory, but again that happens very
early in the first phase itself.
* The logs for the polyclts and polysrvs don't indicate anything
wrong there, except that they complain about connection resets, as
shown below:
268.32| Connection.cc:485: error: 1204/1830647 (s104) Connection reset by peer
268.32| error: raw read failed
268.32| connection to 10.51.6.102:8080 failed after 1 reads, 1 writes, 1 xacts
268.35| i-top2 5015237 393.54 5733 2.36 40 4985
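To check whether the queue congestion warnings get denser in the second phase than in the first, the cache.log entries could be bucketed by hour. The following is only a rough sketch: it assumes every warning starts with the timestamp format shown in the cache.log line above, and the function name and the default "cache.log" path are made up for illustration.

```python
import sys
from collections import Counter
from datetime import datetime

# Text of the warning as it appears in the cache.log excerpt above.
MARKER = "squidaio_queue_request: WARNING - Queue congestion"


def congestion_per_hour(lines):
    """Count queue congestion warnings per clock hour.

    Hypothetical helper: expects lines shaped like
    'YYYY/MM/DD HH:MM:SS| message' and ignores everything else.
    """
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        stamp = line.partition("|")[0].strip()
        try:
            when = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # no parsable timestamp in front of the message
        counts[when.strftime("%Y/%m/%d %H:00")] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # The log path is an assumption; pass the real one as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "cache.log"
    with open(path, errors="replace") as log:
        for hour, count in congestion_per_hour(log).items():
            print(hour, count)
```

If the hourly counts jump right after the idle phase, that would point at the disk queue; if they stay flat, the congestion warnings are probably unrelated to the breakdown.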
So I am not sure what happens during the idle phase that makes
everything break so drastically.
Is the cache unable to recover, or does it break down while trying to
recover?
I would appreciate any kind of help, including suggestions or metrics
I should look at to figure out the problem.
Thanks for your time.
-- Pranav
------------------------------
http://pd.dnsalias.org
Received on Fri Aug 25 2006 - 16:59:16 MDT