Here's yet another design version, following many helpful suggestions
from Henrik.
Gzip Content-Encoding in Squid Design
Version Choice
The goal will be to get these changes into Squid3 HEAD.
Content-Encoding Protocol
The content-encoding protocol is described below.
Header field cases from client:
If Accept-Encoding field is present in client request
    If there is a cached response already available, and it
    contains a Content-Encoding field with encodings that are a
    subset of what the client accepts
        Then forward response to client unchanged
    Else (no cached response with right content-encoding)
        If uncoded response isn't available
            Then forward client request to server/cache
            If server/cache response contains Content-Encoding field
                Then forward new response to client
            Else (server/cache response doesn't have Content-Encoding)
                Then encode server/cache response
                Send encoded response to client
        Else (uncoded server response already available)
            Then encode uncoded response
            Send encoded response to client
Else (no Accept-Encoding in client request)
    If uncoded server response already available
        Forward unchanged to client
    Else if coded server response already available
        Then decode server response
        Send decoded response to client
    Else (no response available yet)
        Then forward request to server or cache, and behave unchanged
        with respect to this protocol.
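
To make the case analysis above concrete, here is a rough sketch of it as a single decision function. Everything in it (Action, CacheState, chooseAction) is a name made up for illustration, not existing Squid code, and it assumes the cache lookup has already told us which copies are on hand.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// Hypothetical outcome of the header-field cases; not a Squid type.
enum class Action {
    ForwardCachedCoded,  // cached coded copy is acceptable to the client
    FetchFromServer,     // nothing usable cached; encode the reply if it arrives uncoded
    EncodeCachedUncoded, // uncoded copy cached, client accepts a coding
    ForwardUncoded,      // no Accept-Encoding, uncoded copy cached
    DecodeCachedCoded,   // no Accept-Encoding, only a coded copy cached
    FetchUnchanged       // no Accept-Encoding, nothing cached
};

// Hypothetical summary of what the cache holds for this request.
struct CacheState {
    bool hasUncoded = false;
    bool hasCoded = false;
    std::set<std::string> codedWith; // encodings applied to the cached coded copy
};

Action chooseAction(bool hasAcceptEncoding,
                    const std::set<std::string> &accepted,
                    const CacheState &cache)
{
    // A coded copy must name at least one encoding.
    assert(!cache.hasCoded || !cache.codedWith.empty());

    if (hasAcceptEncoding) {
        // std::set iterates in sorted order, as std::includes requires.
        const bool subset = std::includes(accepted.begin(), accepted.end(),
                                          cache.codedWith.begin(),
                                          cache.codedWith.end());
        if (cache.hasCoded && subset)
            return Action::ForwardCachedCoded;
        if (!cache.hasUncoded)
            return Action::FetchFromServer;
        return Action::EncodeCachedUncoded;
    }
    if (cache.hasUncoded)
        return Action::ForwardUncoded;
    if (cache.hasCoded)
        return Action::DecodeCachedCoded;
    return Action::FetchUnchanged;
}
```

The FetchFromServer case still has to look at the reply's Content-Encoding field once it arrives, which this sketch leaves to the caller.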
There will be no explicit links between objects that are different
codings of the same original. Instead, StoreKeys of coded objects will
be computed as MD5(OriginalStoreKey,Content-Encoding). This allows one
to derive the StoreKeys of all possible encodings, including the
original, knowing only the original StoreKey and not the requested
URL.
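
A minimal sketch of the key derivation, under loud assumptions: the StoreKey alias, digest() and codedStoreKey() are hypothetical names, the key is shrunk to 64 bits, and FNV-1a stands in for MD5 purely so the example has no library dependency. The real thing would feed the same bytes to Squid's MD5 code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumption: a 64-bit stand-in for Squid's 16-byte MD5 store key.
using StoreKey = std::uint64_t;

// Stand-in digest (FNV-1a, 64 bit). This is NOT MD5.
StoreKey digest(const std::string &bytes)
{
    std::uint64_t hash = 14695981039346656037ULL; // FNV offset basis
    for (unsigned char byte : bytes) {
        hash ^= byte;
        hash *= 1099511628211ULL; // FNV prime
    }
    return hash;
}

// Key of a coded variant: digest(OriginalStoreKey, Content-Encoding).
StoreKey codedStoreKey(StoreKey original, const std::string &contentEncoding)
{
    assert(!contentEncoding.empty());
    std::string input;
    // Serialize the original key byte by byte, least significant first.
    for (int i = 0; i < 8; ++i)
        input.push_back(static_cast<char>((original >> (8 * i)) & 0xffu));
    input += contentEncoding;
    return digest(input);
}
```

Given only the original key, codedStoreKey(key, "gzip") names the gzip variant without any reference to the URL, which is the property the scheme relies on.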
Searching for an uncoded version of an object is done by generating an
uncoded StoreKey and looking for an object with that key. It's needed
upon cache miss (see protocol above).
Upon original or encoded object update or PURGE, delete all the
possible encoding variants. As the encodings are applied locally the
possible combinations are known and finite so there is no problem on
purging all at once. If the number of encodings grows nontrivially,
we may need to add an additional mechanism to keep that check under
control.
Original-update deletion will be triggered on swapout of a new
original object (when it gets a public key).
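
Because variant keys are derivable, deleting all variants is a loop over the supported encodings. The sketch below assumes a toy store (a std::set of string keys) and takes the key derivation as a parameter; deleteCodedCopiesSketch, StoreKey and DeriveKey are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Assumption: string keys stand in for Squid's MD5 store keys.
using StoreKey = std::string;
using DeriveKey =
    std::function<StoreKey(const StoreKey &, const std::string &)>;

// Remove every locally generated variant of `original` from `store`.
// Returns how many entries were removed. The original entry itself is
// left alone; a PURGE caller would remove it separately.
std::size_t deleteCodedCopiesSketch(std::set<StoreKey> &store,
                                    const StoreKey &original,
                                    const std::vector<std::string> &supportedEncodings,
                                    const DeriveKey &derive)
{
    assert(static_cast<bool>(derive)); // a derivation function is required
    std::size_t removed = 0;
    for (const std::string &encoding : supportedEncodings)
        removed += store.erase(derive(original, encoding));
    return removed;
}
```

The cost is one lookup per supported encoding, which is why a growing encoding list might eventually call for the additional mechanism mentioned above.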
Etags: Encoded objects will be given unique new entity tags.
There will be a configuration option to turn off content-encoding.
Content-Encoding Implementation
New HttpHdrContCode module, that parses related HTTP headers, and
arranges for encoding or decoding appropriately. Includes the
following functions:
codeParseRequest(): Called from client_side:parseHttpRequest()
after clientStreamInit() call. Checks for and parses Accept-Encoding
headers. Instantiates content_coding appropriately, and calls
codeClientStreamInit().
codeClientStreamInit(): Adds a new node to clientStream with
codeStreamRead(), codeStreamCallback(), and codeStreamStatus() functions.
codeStreamCallback(): set up encoding/decoding state depending on
combination of Content-Encoding and Accept-Encoding fields seen.
codeStreamRead(): call HttpContentCoder transformation functions
appropriately.
codeStreamStatus(): report status to stream.
New HttpContentCoder abstract type, with functions:
encodeStart()
encodeEnd()
encodeChunk()
decodeStart()
decodeEnd()
decodeChunk()
New per-coded-object ContentCoderState, to handle coding state. It'll
be referenced from the clientStream, and include fields:
HttpContentCoder *coder
off_t codedOffset
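
One possible shape for the abstract type and its state, as a sketch only. The six function names and the two state fields come from the lists above; the signatures, the IdentityCoder pass-through class, the encodeAll() helper, and the reading of codedOffset as "coded bytes produced so far" are all assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h> // off_t (POSIX)

// Abstract coder. Each call appends any produced bytes to `out`.
class HttpContentCoder {
public:
    virtual ~HttpContentCoder() = default;
    virtual void encodeStart(std::string &out) = 0;
    virtual void encodeChunk(const std::string &in, std::string &out) = 0;
    virtual void encodeEnd(std::string &out) = 0;
    virtual void decodeStart(std::string &out) = 0;
    virtual void decodeChunk(const std::string &in, std::string &out) = 0;
    virtual void decodeEnd(std::string &out) = 0;
};

// Hypothetical trivial coder that passes bytes through unchanged.
class IdentityCoder : public HttpContentCoder {
public:
    void encodeStart(std::string &) override {}
    void encodeChunk(const std::string &in, std::string &out) override { out += in; }
    void encodeEnd(std::string &) override {}
    void decodeStart(std::string &) override {}
    void decodeChunk(const std::string &in, std::string &out) override { out += in; }
    void decodeEnd(std::string &) override {}
};

// Per-coded-object state with the two fields listed above.
struct ContentCoderState {
    HttpContentCoder *coder;
    off_t codedOffset; // assumed meaning: coded bytes produced so far
};

// Drive one complete encode pass over a sequence of chunks.
std::string encodeAll(ContentCoderState &state,
                      const std::vector<std::string> &chunks)
{
    assert(state.coder != nullptr);
    std::string out;
    state.coder->encodeStart(out);
    for (const std::string &chunk : chunks)
        state.coder->encodeChunk(chunk, out);
    state.coder->encodeEnd(out);
    state.codedOffset += static_cast<off_t>(out.size());
    return out;
}
```

In the real design the chunks would arrive one at a time through codeStreamRead() rather than as a vector; the vector just keeps the example short.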
Objects will be stored both in unencoded and encoded formats. An
object will stay in the format in which Squid receives it until
requested by a client requesting a different Content-Encoding which
Squid supports (this could be immediate). Once this happens, the
object will be streamed coded into a different StoreEntry and on to
the client.
Other changes needed:
Add new content_coding field to HttpReply.
New httpHeaderGetContentEncoding(HttpReply *) function in HttpHeader.cc.
A new configuration flag to turn content-encoding off, if desired.
A new object flag, "encoded". Whenever Squid creates an encoded or
decoded copy of an object, that copy is tagged as "encoded". Thus, a
locally encoded or decoded object can be told apart from one stored
as received.
A new store.cc function, storeDeleteCodedCopies(), will do the
deletion of all (un)coded copies described above.
Gzip
A new GzipContentCoder module, which will be an instance of
HttpContentCoder.
Data encoding will be handled by the gzip.org zlib library
(http://www.gzip.org/zlib/).
Functions:
gzEncodeStart: call deflateInit2(), write header
gzEncodeEnd: write trailer
gzEncodeChunk: call deflate()
gzDecodeStart: call inflateInit2(), read and verify header
gzDecodeEnd: verify trailer
gzDecodeChunk: call inflate()
gzDoSaveEncoded(): true
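
For the "write header" and "write/verify trailer" steps, here is a self-contained sketch of the gzip framing, assuming zlib is asked for a raw deflate stream and the wrapper is written by hand. The function names are hypothetical, the CRC is a simple bitwise version rather than zlib's crc32(), and the header uses no optional fields, a zero mtime, and OS = 255 (unknown).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// CRC-32 as used by gzip (reflected polynomial 0xEDB88320).
// Pass 0 for the first chunk, then the previous result for later chunks.
std::uint32_t crc32Update(std::uint32_t crc, const std::string &data)
{
    crc = ~crc;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

static void appendLE32(std::string &out, std::uint32_t value)
{
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xffu));
}

static std::uint32_t readLE32(const std::string &s, std::size_t pos)
{
    assert(pos + 4 <= s.size()); // caller checks the length first
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(
                     static_cast<unsigned char>(s[pos + i])) << (8 * i);
    return value;
}

// Fixed 10-byte header: magic, CM=8 (deflate), FLG=0, MTIME=0, XFL=0, OS=255.
std::string gzipHeader()
{
    const unsigned char header[10] = {0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 255};
    return std::string(reinterpret_cast<const char *>(header), sizeof header);
}

// 8-byte trailer: CRC-32 of the uncoded data, then its size mod 2^32,
// both little-endian.
std::string gzipTrailer(std::uint32_t crc, std::uint32_t uncodedSize)
{
    std::string trailer;
    appendLE32(trailer, crc);
    appendLE32(trailer, uncodedSize);
    return trailer;
}

// Decode side: check a received trailer against the values computed
// while inflating.
bool gzipTrailerMatches(const std::string &trailer, std::uint32_t crc,
                        std::uint32_t uncodedSize)
{
    return trailer.size() == 8 && readLE32(trailer, 0) == crc &&
           readLE32(trailer, 4) == uncodedSize;
}
```

The encoder would keep a running crc32Update() and byte count across gzEncodeChunk calls and emit gzipTrailer() from gzEncodeEnd. Recent zlib versions can also produce the gzip wrapper themselves when 16 is added to the windowBits argument of deflateInit2(), which may make the hand-written framing unnecessary.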
Test Strategy
Must pass the test suite.
Must add appropriate tests, including sending gzipped content to
oneself successfully.
Will also test against Apache mod_gzip implementation, and maybe even
gunzip.
Received on Tue Mar 09 2004 - 02:16:24 MST