I know this isn't the most hotly debated topic of recent years :) but  
one other thing I've just thought of:
Suppose you've got a response with the following headers:
      Cache-Control: max-age=0
      Last-Modified: [10 days ago]
And suppose it matches the following refresh_pattern:
      refresh_pattern     .     0     50%     252900     ignore-no-cache
Currently this won't be cached, despite the ignore-no-cache option.
I think we should give the Last-Modified heuristic a chance in these
circumstances (it's a simple fix).
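
To make the scenario concrete, here is a rough Python sketch of what the Last-Modified heuristic would yield for that response. It is not Squid code: lm_factor_lifetime and its parameters are names made up for illustration, and it assumes the percent is applied to the time since Last-Modified and then capped at the refresh_pattern Max.

```python
from datetime import timedelta


def lm_factor_lifetime(since_modified: timedelta, percent: float, max_minutes: int) -> timedelta:
    """Heuristic lifetime: percent of the time since Last-Modified, capped at Max.

    Hypothetical helper for illustration only, not a Squid function.
    """
    heuristic = since_modified * (percent / 100.0)
    return min(heuristic, timedelta(minutes=max_minutes))


# Last-Modified: 10 days ago; refresh_pattern . 0 50% 252900 ignore-no-cache
lifetime = lm_factor_lifetime(timedelta(days=10), percent=50, max_minutes=252900)
print(lifetime)
```

With those values the heuristic would treat the response as fresh for 5 days, rather than leaving it uncached.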
On 15 Jun 2006, at 23:33, Doug Dixon wrote:
> Following some IRC chat, I thought I'd start a discussion on a  
> possible improvement of refresh_pattern in Squid3.
>
> The starting point for this discussion is the fact that  
> refresh_pattern is a source of confusion for many users, even  
> expert admins. It's not obvious what it does, how to achieve  
> certain things, or under what circumstances different bits of it  
> apply or don't apply.
>
> Currently refresh_pattern means different things depending on how  
> the response freshness was calculated: whether by explicit header  
> set by the origin server (Cache-Control, Expires), by invoking the  
> Last-Modified algorithm (if it had a Last-Modified header), or  
> whether it could not calculate a freshness by either of these methods.
>
> It's quite complicated. I don't know what the right answer is.
>
> Here is an idea though:
>
> We could separate the configuration out into "standard" and "HTTP  
> violating" parts. Let us define "standard" as the two mechanisms  
> that are most semantically transparent:
>
> 1. Explicit expiration set by server (Cache-Control, Expires)
> 2. Heuristic expiration based on Last-Modified
>
> And let's define "HTTP violating" as anything that either overrides  
> these, or anything that enforces cacheability in the absence of any  
> of these headers.
>
> What configuration options do we need for each of these two  
> categories?
>
> For the "standard" configuration:
> We don't need any options for the explicit expiry mechanism, as  
> it's... explicit :)
> However, we do need a couple of global options for the
> Last-Modified factor algorithm:
>
>      TAG: refresh_lastmod_factor (percent)
>      Default: 20
>
>      TAG: refresh_lastmod_max (minutes)
>      Default: 10080
>
> These, then, are the only refresh options I propose for a
> non-HTTP-violating setup.
>
>
> Now for the "HTTP violating" overrides, which are more complicated.
>
> Defaults are set first:
>
>      TAG: refresh_override_default options
>      Default: none
>
> These can be refined by regex:
>
>      TAG: refresh_override_match [-i] pattern options
>      Default: none
>
> where options can be any of:
>      min=xxx
>           minimum amount of time this object will be considered fresh
>      max=xxx
>           maximum amount of time this object will be considered fresh
>      ignore-reload=on|off
>           ignore all client headers that prevent serving a cached
>           response
>      reload-into-ims=on|off
>           client reload is downgraded from unconditional to
>           conditional GET
>      ignore-no-cache=on|off
>           ignore all server headers that prevent caching a response
>      ignore-no-store=on|off
>           ignore "Cache-Control: no-store" server header
>      ignore-private=on|off
>           ignore "Cache-Control: private" server header
>      ignore-auth=on|off
>           cache authorized responses, even if server didn't specify
>           "Cache-Control: public"
>      refresh-ims=on|off
>           always pass client IMS requests through to the origin,
>           even if we think our copy is fresh
>
> For example:
>      refresh_override_default     max=4320 reload-into-ims=on
>
>      refresh_override_match     http://host/     ignore-reload=on ignore-no-cache=on ignore-no-store=on
>      refresh_override_match     /path/     reload-into-ims=off
>      refresh_override_match     \.jpe?g$     min=1440
>      refresh_override_match     \.css$     max=60
>
>
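
As a rough illustration of how the default-plus-match example above might resolve, here is a small Python sketch. It is not Squid code: parse_options and effective_options are hypothetical names, the example.org URL is made up, and first-match-wins is an assumption, since the proposal does not say how several matching lines would combine.

```python
import re


def parse_options(text):
    """Turn 'max=4320 reload-into-ims=on' into {'max': '4320', 'reload-into-ims': 'on'}."""
    return dict(item.split("=", 1) for item in text.split())


def effective_options(url, default, matches):
    """Start from the defaults, then let the first matching pattern override them.

    First-match-wins is an assumption made for this sketch.
    """
    options = dict(default)
    for pattern, overrides in matches:
        if re.search(pattern, url):
            options.update(overrides)
            break
    return options


default = parse_options("max=4320 reload-into-ims=on")
matches = [
    (r"http://host/", parse_options("ignore-reload=on ignore-no-cache=on ignore-no-store=on")),
    (r"/path/", parse_options("reload-into-ims=off")),
    (r"\.jpe?g$", parse_options("min=1440")),
    (r"\.css$", parse_options("max=60")),
]

print(effective_options("http://example.org/style.css", default, matches))
```

Here the .css line lowers max to 60 while reload-into-ims=on is inherited from the default, which is the inheritance that refresh_pattern lacks today.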
> Main differences in usage:
>
> 1. The overrides would always apply, regardless of how the
> expiration time was arrived at - whether by explicit headers or
> last-modified algorithm heuristics. Currently the Min, Max and
> Percent settings each apply only in specific circumstances,
> e.g. Max and Percent only apply to responses with L-M, and Min
> only applies in the absence of L-M, Expires and CC max-age.
>
> 2. The refresh_override_default would always apply (although its  
> options may be overridden by those of a refresh_override_match).  
> Currently the default refresh_pattern only applies if no patterns  
> match the request, meaning you can't ever override default  
> behaviour, you can only fall back to it.
>
> 3. There would be no way of setting the Last-Modified factor
> percentage by regex! This is perhaps a big problem, and it could
> be added as an option. But then it would be the only
> non-HTTP-violating setting among the override options... and so
> would spoil it slightly.
>
> 4. No need for global counterparts of refresh_pattern directives,  
> e.g. refresh_all_ims and reload_into_ims.
>
> 5. Frequently used override options could be stated in the default  
> instead of every subsequent line
>
>
> This may be completely the wrong way of looking at it, or it may be  
> just going too far. A smaller, but still helpful, step might be to  
> introduce a refresh_pattern_default whose values would be inherited  
> by any subsequent refresh_pattern match.
>
>
> Any help or input into this would be very welcome indeed
>
> Doug
>
>
> On 1 Jun 2006, at 20:06, Doug Dixon wrote:
>
>> Hi
>>
>> I'm fixing bug 1202 (it's a simple fix) and am cleaning up  
>> refresh.cc at the same time.
>>
>> I'd like to review the various refresh_pattern options, as some of  
>> them are mutually exclusive in practice (although you can  
>> configure all of them) and it's not clear from the documentation  
>> what they all mean. They're quite hard to understand and use  
>> correctly.
>>
>>
>> 1. reload-into-ims
>>
>> The following is legal:
>>
>> refresh_pattern     html$       5     20%     60      ignore-reload reload-into-ims
>>
>> but reload-into-ims will not have any effect, because
>> ignore-reload already discards the client's reload. You could
>> argue that this is obvious, but I think it should be caught at
>> parse time.
>>
>> 2. As an aside - but I want to mention it here - we need to make  
>> it clearer that if an object does specify an expiry time, the Min,  
>> Percent and Max values in refresh_pattern will be completely  
>> ignored, but the options won't be. I'll change cf.data.pre  
>> accordingly
>>
>> 3. override-expire
>>
>> 		override-expire enforces min age even if the server
>> 		sent a Expires: header. Doing this VIOLATES the HTTP
>> 		standard.  Enabling this feature could make you liable
>> 		for problems which it causes.
>>
>> If you do want to modify the behaviour of blindly obeying the  
>> server's explicit expiry time, you can - to an extent.
>>
>> The override-expire option enforces the Min time in cache, even if  
>> the origin stated it should expire before then.
>> But it ignores the Max time (surprising!), and the L-M factor  
>> (more expected - not obvious what this would do anyway)
>>
>> It's not very intuitive. I think we should probably make this
>> option enforce the Max time as well. Possibly even ignore the
>> explicit expiry of the object altogether and fall back to
>> last-modified factor??
>>
>> It could be a naming thing... override-expire doesn't really say  
>> what it does. enforce-min might be better. But then you've already  
>> stated a min and might expect it to be already enforced.
>>
>> 4. override-lastmod
>>
>> 		override-lastmod enforces min age even on objects
>> 		that were modified recently.
>>
>> The Min time isn't enforced even when the last-modified factor  
>> algorithm does kick in. If the object was only just modified and  
>> the L-M factor algorithm results in a figure lower than the Min,  
>> it will be considered fresh for less than the configured Min.
>>
>> This isn't what I would expect. I know that the override-lastmod  
>> exists to let you do this, but it's really non-intuitive. I think  
>> the Min should always be enforced if we're using L-M factor  
>> algorithm, and that we should therefore lose the override-lastmod  
>> option. Can't see the point in the default (null) behaviour of Min  
>> otherwise.
>>
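
The Min-versus-L-M-factor behaviour described in item 4 above can be sketched in a few lines of Python. This is a simplified model based only on the description in this thread, not Squid's refresh.cc: lm_lifetime and the enforce_min flag are hypothetical, with enforce_min=False standing for the current default and enforce_min=True for override-lastmod (or for the proposed always-enforce behaviour).

```python
def lm_lifetime(since_modified_min, percent, min_minutes, max_minutes, enforce_min):
    """Freshness lifetime in minutes under the L-M factor algorithm (simplified model)."""
    # Percent of the time since Last-Modified, capped at Max.
    lifetime = min(since_modified_min * percent / 100.0, max_minutes)
    # Optionally raise the result to at least Min.
    if enforce_min:
        lifetime = max(lifetime, min_minutes)
    return lifetime


# refresh_pattern html$ 5 20% 60, object modified 10 minutes before it was fetched
current = lm_lifetime(10, 20, 5, 60, enforce_min=False)
proposed = lm_lifetime(10, 20, 5, 60, enforce_min=True)
print(current, proposed)
```

In this model a just-modified object is fresh for 2 minutes today despite a configured Min of 5, and for the full 5 minutes once Min is enforced.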
>>
>> Thoughts?
>>
>> Doug
>>
>
Received on Tue Jun 20 2006 - 00:07:19 MDT