Hiya,
I've been doing some work on Squid-2.6 to optimise the parser code.
I've been working with Mark Nottingham from Yahoo! who has been running various
throughput tests on Squid-2.6.
I've been concentrating on the client-side stuff - request parsing, reply
parsing/building. There are plenty of other areas of the code which could
do with some optimising but I'm focusing on the client side for now.
The summary is:
>1k responses: from 4,900 -> ~6,000-6,700
>4k responses: from 6,500 -> ~8,000-9,000
IIRC, this is measured using a local version of httperf. My changes have been:
* don't call headersEnd() so often if it can be avoided!
* quite a large rework of the request line parser (which is unfinished, but
I believe it parses RFC-compliant HTTP/1.0 & 1.1 requests fine; it doesn't parse
HTTP/0.9 requests for now)
* Some refactoring of clientReadRequest
* Code modifications to not triple-copy the request buffer whilst parsing
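To illustrate the first point, here is a minimal sketch of one way to avoid rescanning the whole request buffer every time more data arrives. This is not the actual Squid code: headers_end_resumable() and its "scanned" argument are hypothetical names, and it assumes the caller keeps one "scanned" offset per connection.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not Squid's headersEnd(): look for the blank line
 * that ends an HTTP header block in buf[0..len), resuming from *scanned
 * instead of starting at offset 0 on every call.
 *
 * Returns the offset just past the terminator, or 0 if the end of the
 * headers has not arrived yet.
 */
static size_t
headers_end_resumable(const char *buf, size_t len, size_t *scanned)
{
    /*
     * A '\n' in the last two bytes seen so far may begin a terminator
     * that was incomplete on the previous call, so back up two bytes.
     */
    size_t i = (*scanned > 2) ? *scanned - 2 : 0;

    for (; i < len; i++) {
        if (buf[i] != '\n')
            continue;
        /* bare LF LF */
        if (i + 1 < len && buf[i + 1] == '\n')
            return i + 2;
        /* LF CR LF, the tail of CR LF CR LF */
        if (i + 2 < len && buf[i + 1] == '\r' && buf[i + 2] == '\n')
            return i + 3;
    }
    /* Nothing found: remember how far we got for the next call. */
    *scanned = len;
    return 0;
}
```

The caller would invoke this after each read() with the same buffer and the same "scanned" variable, so each byte is examined roughly once rather than once per read.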
Stuff I've seen which should also give a noticeable performance boost in this
particular microbenchmark:
* An overhaul of HttpReply so it doesn't double-copy the reply buffer during parsing
(Which will probably require rewriting the status line parser to not expect
a NUL-terminated string, much like what the request line parser was doing.)
* Rethink the Http Header stuff - a lot of the time spent in request parsing/reply
building is the memory allocations and array manipulation needed to support
HttpHeader.
* See if there's a nice way to combine the initial header write and data buffer
into a single write(). More likely, come up with some simple way of reference
counting some stuff to build iovecs and feed them to writev().
* Hint to memPoolAlloc/memPoolFree that they shouldn't xfree() certain buffers,
such as the buffers being allocated to strings and stmem buffers.
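As a rough sketch of the writev() idea above: the POSIX call takes an array of struct iovec, so the reply headers and the first chunk of body can go out in one system call without first being copied into a single buffer. send_header_and_body() is a hypothetical name, not an existing Squid function, and the reference counting is left out; the sketch simply assumes both buffers stay valid until the call returns.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <stddef.h>

/*
 * Hypothetical helper: hand the header block and the body chunk to the
 * kernel in a single writev() instead of two separate write() calls.
 *
 * Returns whatever writev() returns: the number of bytes written, which
 * may be less than hdr_len + body_len on a non-blocking socket, or -1
 * on error. Handling a short write is left to the caller.
 */
static ssize_t
send_header_and_body(int fd, const char *hdr, size_t hdr_len,
                     const char *body, size_t body_len)
{
    struct iovec iov[2];

    /* writev() only reads from these buffers; the casts drop const. */
    iov[0].iov_base = (void *) hdr;
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = (void *) body;
    iov[1].iov_len = body_len;

    return writev(fd, iov, 2);
}
```

With non-blocking I/O the buffers have to outlive a partial write, which is presumably where the reference counting would come in.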
I've been profiling using gprof and perfsuite. Both are statistical; both give
different results. I've been using the gprof call graphs as well.
I'm using apachebench to do local testing. Here's what I use:
adrian@jacinta:~$ ab -c 10 -n 100000 http://192.168.3.1:3128/squid-internal-static/icons/test.4k
Squid compiled with:
adrian@kandy:~/work/squid/sf/parserwork$ env CFLAGS="-O2 -g -pg -ggdb -fno-inline-functions \
-fno-inline-functions-called-once --no-inline" ./configure --prefix="/home/adrian/work/squid/run" \
--enable-storeio="ufs null" --disable-unlinkd --quiet
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  3.57      0.67     0.67  4700715     0.00     0.00  memPoolFree
  3.57      1.33     0.67   200018     0.00     0.00  headersEnd
  2.73      1.84     0.51   100009     0.01     0.04  httpRequestFree
  2.20      2.25     0.41   100009     0.00     0.02  parseHttpRequest
  2.17      2.65     0.41  1000090     0.00     0.00  httpHeaderIdByName
  2.06      3.04     0.39  5901105     0.00     0.00  arrayAppend
  1.80      3.38     0.34  4701505     0.00     0.00  memPoolAlloc
  1.77      3.71     0.33  1301128     0.00     0.00  xstrncpy
  1.74      4.03     0.33   200018     0.00     0.02  clientWriteComplete
  1.69      4.34     0.32  1300117     0.00     0.00  httpHeaderEntryDestroy
  1.61      4.64     0.30  1700186     0.00     0.00  memFreeString
  1.55      4.93     0.29   320996     0.00     0.06  comm_call_handlers
  1.50      5.21     0.28   601188     0.00     0.00  xstrdup
  1.50      5.50     0.28   500056     0.00     0.00  dlinkDelete
Dividing the call counts by the ~100,000 requests, that's about 47 memory allocations/frees,
59 array appends, 13 header entry destroys, etc. per request.
The same test run, compiled without -pg, run under perfmon/perfsuite:
File Summary
--------------------------------------------------------------------------------
Samples Self % Total % File
365 14.43% 14.43% /home/adrian/work/squid/sf/parserwork/src/client_side.c
353 13.96% 28.39% /home/adrian/work/squid/sf/parserwork/src/HttpHeader.c
180 7.12% 35.51% /home/adrian/work/squid/sf/parserwork/src/MemPool.c
114 4.51% 40.02% /home/adrian/work/squid/sf/parserwork/src/comm.c
98 3.88% 43.89% /home/adrian/work/squid/sf/parserwork/src/mem.c
97 3.84% 47.73% /home/adrian/work/squid/sf/parserwork/lib/Array.c
89 3.52% 51.25% /home/adrian/work/squid/sf/parserwork/src/cbdata.c
86 3.40% 54.65% /home/adrian/work/squid/sf/parserwork/lib/util.c
83 3.28% 57.93% /home/adrian/work/squid/sf/parserwork/src/tools.c
79 3.12% 61.05% /home/adrian/work/squid/sf/parserwork/src/String.c
68 2.69% 63.74% /home/adrian/work/squid/sf/parserwork/src/store_client.c
57 2.25% 65.99% /home/adrian/work/squid/sf/parserwork/src/acl.c
Function Summary
--------------------------------------------------------------------------------
Samples Self % Total % Function
126 4.98% 4.98% memPoolFree
77 3.04% 8.03% httpHeaderGetEntry
74 2.93% 10.95% arrayAppend
74 2.93% 13.88% httpHeaderClean
59 2.33% 16.21% httpHeaderEntryDestroy
56 2.21% 18.43% headersEnd
54 2.14% 20.56% memPoolAlloc
47 1.86% 22.42% clientWriteComplete
47 1.86% 24.28% httpRequestFree
46 1.82% 26.10% memFreeString
41 1.62% 27.72% comm_call_handlers
40 1.58% 29.30% stringClean
35 1.38% 30.68% clientSendMoreData
35 1.38% 32.07% xstrncpy
32 1.27% 33.33% connStateFree
30 1.19% 34.52% dlinkDelete
Function:File:Line Summary
--------------------------------------------------------------------------------
Samples Self % Total % Function:File:Line
38 1.50% 1.50% httpHeaderClean:/home/adrian/work/squid/sf/parserwork/src/HttpHeader.c:354
36 1.42% 2.93% httpHeaderEntryDestroy:/home/adrian/work/squid/sf/parserwork/src/HttpHeader.c:1193
35 1.38% 4.31% httpHeaderGetEntry:/home/adrian/work/squid/sf/parserwork/src/HttpHeader.c:555
28 1.11% 5.42% ??:??:0
26 1.03% 6.45% arrayAppend:/home/adrian/work/squid/sf/parserwork/lib/Array.c:95
25 0.99% 7.43% xstrdup:/home/adrian/work/squid/sf/parserwork/lib/util.c:600
22 0.87% 8.30% xstrncpy:/home/adrian/work/squid/sf/parserwork/lib/util.c:680
20 0.79% 9.09% httpHeaderGetEntry:/home/adrian/work/squid/sf/parserwork/src/HttpHeader.c:551
19 0.75% 9.85% memPoolFree:/home/adrian/work/squid/sf/parserwork/src/MemPool.c:326
17 0.67% 10.52% arrayAppend:/home/adrian/work/squid/sf/parserwork/lib/Array.c:93
16 0.63% 11.15% memPoolFree:/home/adrian/work/squid/sf/parserwork/src/MemPool.c:317
16 0.63% 11.78% arrayAppend:/home/adrian/work/squid/sf/parserwork/lib/Array.c:91
15 0.59% 12.38% memPoolFree:/home/adrian/work/squid/sf/parserwork/src/MemPool.c:303
15 0.59% 12.97% memPoolFree:/home/adrian/work/squid/sf/parserwork/src/MemPool.c:319
14 0.55% 13.52% headersEnd:/home/adrian/work/squid/sf/parserwork/src/mime.c:147
Each gives slightly different results but they're all centred around the same functions -
memory allocation/free, header creation/deallocation.
I'm going to stop speeding things up, complete the request/reply parser modifications and
concentrate on fixing any bugs that pop up. I'm not going to try writing an incremental
HTTP parser for now; I'll leave that for Squid-3. I'm mainly doing this to wrap my head
around what bits of the code are fast, what bits are slow, and why.
Adrian
Received on Tue Aug 29 2006 - 03:16:41 MDT