Re: [PATCH] HTTP/1.1 response caching upgrade

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 15 Oct 2012 14:56:12 -0600

On 10/14/2012 01:17 AM, Amos Jeffries wrote:
> On prompting from bug 3670 which outlines how Auth transactions are now
> wrongly non-cacheable. I have performed a small audit of the
> HttpStateData::cacheableReply() method and this patch contains the result.
>
> trunk rev 11361 converted Cache-Control header from using a single mask
> bitmap (shared by request and response) to separate CC header objects in
> the request response. This conversion contained several regressions like
> the one bug 3670 reports.
>
> This patch (on trunk rev.12394):
> * documents HttpStateData::cacheableReply() clarifying the overall
> method action and what each individual check it doing.
> * resolves several visible regressions, including bug 3670.
> * extends the caching to handle the "no-cache" controls as per HTTP/1.1
> (MAY store, but MUST revalidate before use).
> * extends the caching for several lesser known cases of "MAY store"
> exemptions handling authenticated transactions.
>
> One side effect of now caching transactions utilizing "no-cache" is that
> hacks around Pragma:no-cache are reduced to only having any effect when
> Cache-Control is absent. Reducing their performance cost.
>
> This should give Squid a major boost to both its caching compliance and
> HIT ratios.
>
> The patch is build tested, but not yet run-time tested or HTTP/1.1
> compliance tested.

Hi Amos,

    Thank you for fixing this. I wish you did not re-indented the old
checks in the patch though -- it complicates reviewing logic changes.

> HttpReply const *rep = finalReply();
> +
> + // non-existent replies are not cacheable.
> + if (!rep)
> + return 0;

ServerStateData::finalReply() is guaranteed to return a non-nil pointer.
We have other code relying on that guarantee. There is no need to double
check IMO.

> + if (!ignoreCacheControl) {
...
> + // RFC 2068, sec 14.9.4 - MUST NOT cache any response with Authentication UNLESS certain CC controls are present
> + // allow HTTP violations to IGNORE those controls (ie re-block caching Auth)
> + if (request && (request->flags.auth || request->flags.authSent) && !REFRESH_OVERRIDE(ignore_auth)) {
> + if (!rep->cache_control)
> + return 0;

Guarding authentication check with ignoreCacheControl looks odd to me.
Should not the check be moved outside the guard, while possibly ignoring
cache_control inside the check? That is:

    // RFC ... MUST NOT cache any response with Authentication
    if (response with authentication) {
        // UNLESS certain CC controls are present
        if (!rep->cache_control || ignoreCacheControl)
            return 0;
        ... see if cache controls allow caching ...
    }

    if (!ignoreCacheControl) {
    ...

The authentication blob can be moved below the ignoreCacheControl blob
if we think authentication is, on average, more rare than conditions
tested by other CC checks. Since both blobs never return 1, it is OK to
order them as we please.

> + String s = rep->header.getList(HDR_PRAGMA);
> + const int no_cache = strListIsMember(&s, "no-cache", ',');
> + s.clean();
> +
> + if (no_cache)
> + EBIT_SET(entry->flags, ENTRY_REVALIDATE);

Consider using HttpHeader::hasListMember() instead of getList() plus
strListIsMember(). If you have to use strListIsMember(), please note
that String cleans up itself during destruction so the above can be
simplified:

    const String s = rep->header.getList(HDR_PRAGMA);
    if (strListIsMember(&s, "no-cache", ','))
        EBIT_SET(entry->flags, ENTRY_REVALIDATE);

Finally, if you are rewriting so much code anyway, consider returning a
"const char *" reason string and make the caller log it with debugs(). A
nil result would indicate a cachable response. The source code comments
are great, but it is often difficult to understand the decision when
looking at a debugging log. Polishing this may be viewed outside this
patch scope though (and I would probably not even suggest it if you kept
the old indentation for the review :-).

Thank you,

Alex.
Received on Mon Oct 15 2012 - 20:56:20 MDT

This archive was generated by hypermail 2.2.0 : Tue Oct 16 2012 - 12:00:06 MDT