Hello,
I am new to Squid. I am developing a program in order to simulate 
the performance of different cache replacement policies of proxies. 
However, being a newbie to the field, I need to check the correctness 
of my reasoning and assumptions. In the following paragraphs I set 
out my objectives and my line of thought, and I would be most 
grateful to anybody who can answer my queries and correct my 
mistakes.
I have at my disposal the access logs of a Squid 1.1 proxy in native 
format, i.e., "time elapsed remotehost code/status bytes method URL 
rfc931 peerstatus/peerhost". The proxy from which I obtained these 
logs was configured as a peer cache. My aim is to use these logs in 
order to obtain the following:
1] the requested URL
2] the time/date of the request
3] the size of the document returned to the client
I am not interested in knowing anything about the clients making the 
requests. I am also not taking into consideration the consistency of 
documents.
Is it possible to extract the above mentioned data from the native 
format of access logs?
Does Squid make real-time decisions about which documents to cache? 
In other words, is it correct to assume that, for every requested 
document, there is or eventually will be a copy in a cache somewhere, 
be it the local proxy cache or a parent cache?
Going back to the native format of the access log of Squid 1.1, 
"time elapsed remotehost code/status bytes method URL rfc931 
peerstatus/peerhost", the "time" field will give me the "time/date of 
the request" (as a UNIX timestamp). I can get the "requested 
URL" from the "URL" field. But what about the "size of the document 
returned to the client"? The "bytes" field IS NOT the size of the 
returned document, at least not under all circumstances. However, 
from what I understand, there are situations where the "bytes" field 
will be the "size of the returned document" (perhaps when I have a 
TCP_HIT). Am I right? For example, when I have an ICP_QUERY with a 
UDP_MISS, the "bytes" field must be interpreted differently.
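To make the field mapping above concrete, here is a minimal Python sketch of how one log line might be split into the three items of interest. It assumes whitespace-separated fields in the order quoted above; the sample line, the `LogRecord` type and the `parse_line` helper are made up for illustration and are not taken from a real log or from Squid itself.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class LogRecord(NamedTuple):
    """Hypothetical container for the fields of interest."""
    when: datetime   # time/date of the request
    code: str        # e.g. TCP_HIT, UDP_MISS
    status: str      # e.g. 200
    size: int        # the "bytes" field, not yet interpreted
    method: str
    url: str


def parse_line(line: str) -> LogRecord:
    # Assumed field order:
    # time elapsed remotehost code/status bytes method URL rfc931 peerstatus/peerhost
    fields = line.split()
    if len(fields) < 7:
        # Only the first seven fields are used in this sketch.
        raise ValueError(f"expected at least 7 fields, got {len(fields)}")
    time_s, _elapsed, _remotehost, code_status, size, method, url = fields[:7]
    code, _, status = code_status.partition("/")
    # The "time" field is a UNIX timestamp; convert it to a UTC datetime.
    when = datetime.fromtimestamp(float(time_s), tz=timezone.utc)
    return LogRecord(when, code, status, int(size), method, url)


# Invented sample line, for illustration only.
sample = ("906984773.123 412 10.0.0.1 TCP_MISS/200 2048 GET "
          "http://www.example.com/ - DIRECT/www.example.com")
record = parse_line(sample)
print(record.url, record.when.isoformat(), record.size)
```

The sketch deliberately leaves the "bytes" value uninterpreted, since whether it equals the document size is exactly the open question.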
Thus, in order to retrieve those records where the "bytes" field 
represents the size of the document returned to the client, I need to 
look for particular code/status combinations. Is that correct? Do I 
also need to look for particular peerstatus/peerhost combinations?
Assuming that my reasoning above is correct, I need help 
interpreting the code/status combinations, i.e., I don't know which 
code/status combinations to look for.
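The filtering step described above could be sketched as follows in Python. The choice of which codes to keep is precisely what is unknown here, so the `wanted_prefix` parameter (defaulting to "TCP_", on the assumption that "UDP_" records are ICP exchanges rather than replies to clients) is only a placeholder, and the log lines and the `client_replies` helper are invented for illustration.

```python
from typing import Iterable, Iterator, Tuple


def client_replies(lines: Iterable[str],
                   wanted_prefix: str = "TCP_") -> Iterator[Tuple[str, float, int]]:
    """Yield (url, unix_time, bytes) for records whose code matches.

    The prefix test is an assumption, not a confirmed rule for
    deciding when "bytes" is the size of the returned document.
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip malformed or truncated lines
        code = fields[3].split("/", 1)[0]
        if code.startswith(wanted_prefix):
            yield fields[6], float(fields[0]), int(fields[4])


# Invented sample records, for illustration only.
sample_log = [
    "906984773.123 412 10.0.0.1 TCP_MISS/200 2048 GET "
    "http://www.example.com/ - DIRECT/www.example.com",
    "906984774.456 3 10.0.0.2 UDP_MISS/000 64 ICP_QUERY "
    "http://www.example.com/a.gif - -",
    "906984775.789 20 10.0.0.1 TCP_HIT/200 512 GET "
    "http://www.example.com/b.html - NONE/-",
]

trace = list(client_replies(sample_log))
for url, when, size in trace:
    print(url, when, size)
```

Once the right code/status combinations are known, the prefix test could be replaced by membership in an explicit set of codes.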
Last but not least, I thank anybody who can pull me out of this pool 
of ignorance.
Regards, 
Gawesh
+++++++++++++++++++++++++++++++++++++++++++++++++++
 Gawesh C JAWAHEER                               
 MSc Information Systems and Technology, 1997/98 
 City University, UK                             
+++++++++++++++++++++++++++++++++++++++++++++++++++
Received on Mon Sep 28 1998 - 08:12:53 MDT