On Mon, 10 Apr 2000, Dancer wrote:
> However, let's download...oh, I don't know...random.exe from a server.
> http://www.somewhere.com/random.exe
>
> We store that object.
>
> Someone else comes along and gets
> http://msdownload.com/4598AB024985700FF/Considerable_unpredictable_guff/RANDOM.EXE
Yeah, that's why I'd propose having the administrator specify a list of
sites known to have identical content, and rewriting the URL to a
"common" one just before taking the MD5, without touching the actual
download or storage process.
It could look like this in squid.conf:
multisite 1 ^http://ms....\.www\.connxion\.com/
multisite 1 ^http://msdownload\.somesite\.be/
multisite 1 ^http://msdownload\.somesite\.ch/
where the entire matched portion of the URL would be replaced by a string
of the form "\001multisite1\001" for MD5 generation purposes (the
control-A's are illegal characters in a URL, so the rewritten key can
never collide with a valid URL).
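A rough sketch of that key rewrite, in Python for brevity rather than
Squid's C. Nothing here is existing Squid code: MULTISITE_RULES,
canonical_key and cache_key_md5 are names made up for illustration, and
only the two fully spelled-out patterns from the example above are used.

```python
import hashlib
import re

# Hypothetical rule table mirroring the proposed "multisite" lines:
# (group id, regex anchored at the start of the URL).
MULTISITE_RULES = [
    (1, re.compile(r"^http://msdownload\.somesite\.be/")),
    (1, re.compile(r"^http://msdownload\.somesite\.ch/")),
]


def canonical_key(url):
    """Return the string that would be hashed instead of the raw URL."""
    for group, pattern in MULTISITE_RULES:
        match = pattern.match(url)
        if match:
            # Replace the whole matched prefix with a marker wrapped in
            # control-A (\x01) bytes, which cannot occur in a valid URL.
            return "\x01multisite%d\x01%s" % (group, url[match.end():])
    # No rule matched: hash the URL unchanged.
    return url


def cache_key_md5(url):
    """MD5 of the canonical key; fetching and storage still use the real URL."""
    return hashlib.md5(canonical_key(url).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for u in ("http://msdownload.somesite.be/pub/random.exe",
              "http://msdownload.somesite.ch/pub/random.exe"):
        print(cache_key_md5(u), u)
```

Both URLs in the usage loop map to the same key, so they would share one
cache entry. Note that this only merges URLs whose remainder after the
matched prefix is identical; mirrors that use different paths for the
same file would need patterns that swallow the differing part too.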
I completely agree that Squid should not second-guess the web designer, and
that the burden of verifying that sites have identical content should be
on the cache admin.
Cheers,
-- Bert
Bert Driehuis, MIS -- bert_driehuis@nl.compuware.com -- +31-20-3116119
Every nonzero finite dimensional inner product space has an
orthonormal basis. It makes sense, when you don't think about it.
Received on Mon Apr 10 2000 - 01:38:09 MDT