Re: squid-2.4 release ? from Andres Kroonmaa on 2001-01-09 (squid-dev)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Tue, 9 Jan 2001 10:50:46 +0200

Hi all, and Happy New Year,

On 9 Jan 2001, at 2:38, Henrik Nordstrom <hno@hem.passagen.se> wrote:

> Adrian Chadd wrote:
>
> > You could start by changing sdirno into a byte instead of an int.
> > That chops 3 bytes (on 32 bit archs) out per StoreEntry.
>
> Done (and moved things around to account for alignment).
>
> Now the structure on 32-bit platforms looks like
> 8 hash_link hash; /* must be first */
> 4 MemObject *mem_obj;
> 4 RemovalPolicyNode repl;
> 4 time_t timestamp;
> 4 time_t lastref;
> 4 time_t expires;
> 4 time_t lastmod;

I've been wondering why we need that much time_t data in RAM all the
time. All 4 timestamps are used to determine the point in time when Squid
needs to revalidate object freshness. lastref used to help in the LRU
logic, but since we now have dlinklists and place ref'ed objects on top,
it isn't really needed much anymore.

If we partly reverted to TTL-based refresh logic, only to conserve
RAM and help ICP/digests, and let the current refresh logic handle actual
HTTP fetches from cache, we might be able to reduce memory use further.

My line of thinking goes like this. At fetch time we take the URL, find
those 4 timestamps, calculate LM_AGE and FETCH_AGE from them, and then
act as specified in refresh_patterns. Basically, whichever way we do it,
we still simply determine an exact time when the object needs to be
revalidated.

We can determine this timepoint at the time the object is stored into
the FS, by the same refresh_pattern rules. We store all 4 timestamps on
disk, but in RAM keep only a single timestamp that specifies the
timepoint after which the object cannot be considered fresh any more.
As Squid currently does not purge objects that expire by Expires:
headers, there is not much difference between Expires, max age, and
lm_factor based expiring.
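
To illustrate collapsing the timestamps into one "stale at" timepoint
when the object is stored, here is a minimal sketch. It is not Squid's
actual refresh code: struct refresh_rule and stale_at() are hypothetical
names, the rule is a simplified stand-in for one refresh_pattern line
(min, percent, max), and a negative expires or lastmod is assumed to
mean "header absent".

```c
#include <time.h>

/* Hypothetical stand-in for one refresh_pattern rule (not Squid's struct). */
struct refresh_rule {
    time_t min_age;     /* seconds: fresh while younger than this */
    double lm_percent;  /* fraction of (timestamp - lastmod), e.g. 0.20 */
    time_t max_age;     /* seconds: never fresh when older than this */
};

/*
 * Return the point in time after which the object can no longer be
 * considered fresh. Assumption: expires < 0 or lastmod < 0 means the
 * corresponding header was not present.
 */
static time_t
stale_at(time_t timestamp, time_t expires, time_t lastmod,
         const struct refresh_rule *r)
{
    time_t lifetime;

    /* An explicit Expires: header decides on its own. */
    if (expires >= 0)
        return expires;

    /* Otherwise start from the minimum age ... */
    lifetime = r->min_age;

    /* ... extend it by the lm_factor heuristic when Last-Modified is known ... */
    if (lastmod >= 0 && lastmod < timestamp) {
        time_t lm = (time_t) (r->lm_percent * (double) (timestamp - lastmod));
        if (lm > lifetime)
            lifetime = lm;
    }

    /* ... and never exceed the maximum age. */
    if (lifetime > r->max_age)
        lifetime = r->max_age;

    return timestamp + lifetime;
}
```

With such a value in the in-memory index, an ICP or digest decision
reduces to comparing the current time against a single stamp.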

We can use this single timestamp when returning ICP replies and when
deciding whether to add the object to digests.

During HTTP fetches, we can afford to make a disk lookup to determine
precise freshness, and we may recalculate the expiry stamp for the
in-memory index.

The only thing we sacrifice is the ability to change refresh_patterns
for ICP/digests on the fly. For HTTP fetches this isn't a problem.

Even more: since at initial fetch time we predict the future expiry
time, we can think of using delta stamps instead of precise times. We
can pick the reference time to be the last clean swap.state, and define
expiry time in minutes into the future. This way we can express refresh
times up to 45 days into the future (65535 minutes) with only a 16-bit
u_short. During a clean swap.state rebuild, we can easily rewrite this
expiry stamp based on the old and new swap.state date/time. We'd need to
add this timestamp to swap.state to avoid object reads and referring to
URL refresh_patterns during clean startups.
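
A rough sketch of the delta-stamp idea, under stated assumptions: the
function names are hypothetical, the reference time is whatever time the
last clean swap.state was written, and deltas saturate at both ends
(0 means "already stale", 65535 means "45 days or more").

```c
#include <stdint.h>
#include <time.h>

#define DELTA_MAX UINT16_MAX    /* 65535 minutes, about 45.5 days */

/*
 * Encode an absolute expiry time as whole minutes after ref_time.
 * Truncating to minutes errs on the side of expiring slightly early.
 */
static uint16_t
expiry_encode(time_t expiry, time_t ref_time)
{
    long long minutes;

    if (expiry <= ref_time)
        return 0;               /* already stale at the reference time */
    minutes = ((long long) expiry - (long long) ref_time) / 60;
    return minutes > DELTA_MAX ? DELTA_MAX : (uint16_t) minutes;
}

/* Turn a stored delta back into an absolute time. */
static time_t
expiry_decode(uint16_t delta, time_t ref_time)
{
    return ref_time + (time_t) delta * 60;
}

/*
 * Rewrite a delta when the reference time moves, e.g. when a new clean
 * swap.state is written. A saturated delta (DELTA_MAX) has lost its
 * exact expiry, so after rebasing it only expires earlier than the
 * true time, never later.
 */
static uint16_t
expiry_rebase(uint16_t delta, time_t old_ref, time_t new_ref)
{
    return expiry_encode(expiry_decode(delta, old_ref), new_ref);
}
```

For example, an object expiring 10 minutes after the old reference time
is stored as 10; if the reference time then moves forward by 5 minutes,
rebasing rewrites the stamp to 5.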

If this is a sane idea, then it looks to me like we could drop
16 bytes of timestamps down to 2...
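
The arithmetic can be checked with a small sketch. The struct names are
made up for illustration, and int32_t stands in for the 4-byte time_t of
the 32-bit platforms discussed above (time_t is often 8 bytes on current
systems).

```c
#include <stdint.h>

/* Assumption: models time_t on a 32-bit platform. */
typedef int32_t time32_t;

/* The four stamps as they sit in StoreEntry today: 4 x 4 = 16 bytes. */
struct stamps_full {
    time32_t timestamp;
    time32_t lastref;
    time32_t expires;
    time32_t lastmod;
};

/* The single 16-bit delta stamp proposed here: 2 bytes. */
struct stamps_delta {
    uint16_t expiry_minutes;
};
```

The real saving inside StoreEntry depends on how the remaining fields
are ordered, since alignment padding can eat part of the 14 bytes.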

------------------------------------
Andres Kroonmaa <andre@online.ee>
Delfi Online
Tel: 6501 731, Fax: 6501 708
Pärnu mnt. 158, Tallinn,
11317 Estonia
Received on Tue Jan 09 2001 - 01:54:34 MST
