I'm trying to set up Squid as a proxy cache server for a group of 
computers connected to the Internet via a satellite link.
GLAT (the satellite provider) uses a product called Page Accelerator, which 
runs as two peers. The first is the RPA (Remote Page Accelerator); this 
piece of software runs on the remote side and acts as a proxy server for the 
client PCs. Its main function is to ask the HPA (Hub Page Accelerator, a 
piece of software running at the hub or ISP) to download a specific web page, 
assemble all the pieces (HTML objects such as text, pictures, applets, etc.), 
and send them together in a few data streams in order to optimize bandwidth 
use (minimizing the number of TCP connections that must be opened for each 
HTTP object).
The RPA then receives these few streams and passes them on to the client PC.
The RPA software doesn't do any caching, so this is where Squid comes in.
The idea is to put the Squid server between the PCs and the RPA and have it 
handle all the HTTP requests from the PCs, forwarding them to the RPA. When 
the RPA receives the HTTP traffic back from the HPA, it hands it over to 
Squid, which can cache the content and speed up the whole process the next 
time the same page is requested by any other client PC.
This is working now using the following rules in the squid.conf file:
cache_peer RPA_IP_address parent 9877 0 no-query default
acl all 0.0.0.0/0.0.0.0
always_direct deny all
never_direct allow all
So WHAT IS THE PROBLEM???
The problem arises when users try to load a page already stored in the 
cache. As I understand it, each time a user's browser tries to access a web 
page that has already been cached, Squid contacts the original web server to 
check whether the information has changed recently. If it hasn't changed, 
Squid delivers its own cached copy; if it has changed, Squid fetches the new 
information, reassembles the page, caches it, and delivers it to the client's 
browser.
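
That revalidation is normally an HTTP conditional GET: Squid replays the 
request with an If-Modified-Since header, and a server that still has the 
same copy answers 304 Not Modified with no body. A sketch of the exchange 
(hostname and date are made up for illustration):

    GET /index.html HTTP/1.0
    Host: www.example.com
    If-Modified-Since: Mon, 10 Sep 2001 08:00:00 GMT

    HTTP/1.0 304 Not Modified

On a 304, Squid serves its cached copy, so very little data should cross 
the link; the trouble described below is that the RPA apparently fetches 
and reassembles the whole page regardless.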
The problem is that when Squid connects to the Internet to check whether the 
page has changed recently, it does so through the RPA (see the rules above). 
The RPA has no way to distinguish this checking procedure from a new page 
request, so it fetches the whole page again and delivers it to Squid, making 
Squid's caching feature worthless.
So, as you can imagine, the whole connection process when using a Squid 
proxy cache is slower than not having Squid at all!
SO THE QUESTIONS ARE:
How can I tell Squid not to check for changes all the time, and to deliver 
its own copy without having to connect to the RPA each time a user asks for 
a web page?
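
One avenue I'm considering for the first question is Squid's refresh_pattern 
directive, which controls how long an object is considered fresh before Squid 
revalidates it. A sketch only; the patterns and numbers below are assumptions 
that would need tuning, and values this aggressive risk serving stale pages:

    # refresh_pattern [-i] regex min-minutes percent max-minutes
    # Treat objects as fresh for longer, so Squid answers from its
    # cache without revalidating through the parent (RPA).
    refresh_pattern -i \.(gif|jpg|jpeg|png)$ 1440 50% 10080
    refresh_pattern .                          60 20%  4320

This only postpones the revalidations; it doesn't eliminate them, and pages 
without cache-control information are the ones most affected.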
or
How can I tell Squid to check for those changes directly (not going through 
the RPA) and, if the pages have changed, retrieve them through the RPA so 
the bandwidth is optimized?
or
Does anyone have any other idea we could try for this architecture?
Thanks and my apologies for the long email.
Francisco
===========================================================
     ______      ..........................................
    /_____/\     .Ing. Francisco Ovalle F.                .
   /_____\\ \    .Consultoria de Sistemas en Ecommerce    .
  /_____\ \\ /   .Sun Microsystems, México                .
 /_____/ \/ / /  .Plaza Reforma                           .
/_____/ /   \//\ .Prol. Paseo de la Reforma No. 600 - 210 .
\_____\//\   / / .Col. Peña Blanca Santa Fe               .
 \_____/ / /\ /  .01210, México, D.F.                     .
  \_____/ \\ \   .Tel. 258-6134  Fax. 258-6199            .
   \_____\ \\    .e-mail: francisco.ovalle@Sun.COM        .
    \_____\/     ..........................................
===========================================================
Received on Mon Sep 17 2001 - 15:10:48 MDT