>At 10:31 18.08.1996 +0200, you wrote:
>>
>>On Sun, 18 Aug 1996 tomaz@mail.siol.net wrote:
>>
>>> > Are the logfile analysis scripts that I find on www.nlanr.net
>>> > compatible with an access.log made from Squid 1.0.4? I try
>>> > access-extract.pl < access.log > summary
>>> > and the summary contains just a few lines, looks like it
>>> > did not find any entries.
>>> > I'm using Perl 5.003.
>>
>>Look at your squid configuration. Maybe, you have the emulate_httpd_log
>>option enabled (that's the default!). If so, you have to start
>>access-extract.pl and access-extract-urls.pl with the option -h. Look at
>>
>Attention! 'emulate_httpd_log' changed from 'on' in 1.0.x to 'off'
>in 1.1.alphaX. All my statistics (pwebstats!) is broken.
I guess that's my cue (as the author of pwebstats). Pwebstats will
support the Squid native log format once I get the time to make the
changes (among many, many others). In the meantime, here is a Perl
script that converts native Squid logs into CERN-style cache hit and
cache miss logs in common log format. Misses are appended to
proxy.convert and hits to cache.convert in the current directory.
------------( start native2common.pl )-----------
#!/usr/local/bin/perl
################################################################################
#
# native2common.pl : convert squid native log file into cern-style
# cache hit & cache miss logs (in common log format)
#
# Martin Gleeson <gleeson@unimelb.edu.au>, August 1996
#
# (c) Copyright, The University of Melbourne, 1996
#
################################################################################
#------------ You must set the following variable -----------------------------#
$gmtoffset = "+1000"; ###### Offset from GMT
#------------------------------------------------------------------------------#
# uncomment the following line and the ones marked STOPLIST below if you want
# to ignore particular IP numbers in the log - e.g. ignore neighbours if you
# only want stats for local clients. Variable points to a file consisting of
# a single line with a list of one or more IP numbers, delimited by the 'pipe'
# symbol, e.g.:
# 123.45.67.89|234.56.78.90|234.5.6.78
# $stoplist = "/servers/http/squid/etc/neighbours";
#------------------------------------------------------------------------------#
$address_type=2;
%months = ( '0','Jan', '1','Feb', '2','Mar', '3','Apr', '4','May', '5','Jun',
'6','Jul', '7','Aug', '8','Sep', '9','Oct', '10','Nov', '11','Dec');
%longmonths = ( '0','January', '1','February', '2','March', '3','April',
'4','May', '5','June', '6','July', '7','August',
'8','September', '9','October', '10','November',
'11','December');
#------------------------------------------------------------------------------#
$usage = "usage: native2common.pl <logfile>\n";
if( ! $ARGV[0] ) { die $usage; }
$logfile = shift ( @ARGV );
#------------------------------------------------------------------------------#
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
$year += 1900;
$time_now = sprintf "%02d:%02d:%02d on %d %s %d",
    $hour, $min, $sec, $mday, $longmonths{$mon}, $year;
printf STDERR
"==============================================================\n";
printf STDERR "native2common.pl started on $logfile at $time_now.\n";
printf STDERR
"==============================================================\n";
printf STDERR "\n";
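open(COUNT,"/bin/wc -l $logfile |") || die "Cannot run wc on $logfile: $!\n";
# some versions of wc pad the count with leading blanks, others do not
while( <COUNT> ){ chop; ($line_count) = /^\s*(\d+)/; }
close(COUNT);
$inc = sprintf "%d", ( $line_count / 50 );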
print STDERR " The logfile has $line_count entries.\n";
print STDERR " Processing...\n";
print STDERR " 0%                     50%                     100%\n";
print STDERR " |-----------------------|------------------------|\n ";
$counter=0; $hash_counter=0; $linecount=0;
# >>STOPLIST<< Uncomment these lines for the stoplist function
# open(STOPLIST,"$stoplist");
# $stops = <STOPLIST>; chop($stops) if($stops =~ /\n/);
# close(STOPLIST);
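open(LOG_FILE,"$logfile") || die "Cannot open $logfile: $!\n";
open(PROXY_FILE,">> proxy.convert") || die "Cannot append to proxy.convert: $!\n";
open(CACHE_FILE,">> cache.convert") || die "Cannot append to cache.convert: $!\n";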
while(<LOG_FILE>){
$linecount++;
$counter++;
if( $counter >= $inc )
{
$counter = 0;
$hash_counter++;
printf STDERR '#' if( $hash_counter <= 50 );
}
# split the input line into its various components
chop;
@line = split(/\s+/,$_,7);
$time = $line[0];
$elapsed = $line[1];
$host = $line[2];
$codes = $line[3];
$size = $line[4];
$htype = $line[5];
$url = $line[6];
# >>STOPLIST<< Uncomment this line for the stoplist function.
# (anchored so that 23.45.67.8 in the list does not also match 123.45.67.89)
# next if( $host =~ /^(?:$stops)$/ );
($seconds,$milliseconds) = split(/\./,$time);
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime($seconds);
$year += 1900;
if($mday < 10){ $mday = "0" . "$mday"; }
if($hour < 10){ $hour = "0" . "$hour"; }
if($min < 10){ $min = "0" . "$min"; }
if($sec < 10){ $sec = "0" . "$sec"; }
$www_date = "$mday/$months{$mon}/$year:$hour:$min:$sec $gmtoffset";
($type,$code,$remotetype) = split(/\//,$codes);
# convert IP into hostname
if( $hosts{$host} ) { $name = $hosts{$host};}
else
{
    # reset first, otherwise a host that is not looked up would
    # inherit the name resolved for the previous log line
    $name = "";
    if($host =~ /^\d+\.\d+\.\d+\.\d+$/)
    {
        @address = split(/\./,$host);
        $addpacked = pack('C4',@address);
        ($name,$aliases,$addrtype,$length,@addrs)
            = gethostbyaddr($addpacked,$address_type);
    }
    if ( $name eq "") { $name = $host };
    $name = "\L$name";
    $hosts{$host} = $name;
}
if( $type eq "TCP_DENIED" ){ # request refused by the cache, logged as 401
    $line_new = "$hosts{$host} - - [$www_date] \"$htype $url HTTP/1.0\" 401 $size\n";
    print PROXY_FILE "$line_new";
    $sizeproxied += $size;
    $totalproxied += 1;
}
if( $type eq "TCP_MISS" || ($type eq "TCP_IFMODSINCE" && $size < 220)
    || $type eq "TCP_EXPIRED" || $type eq "TCP_REFRESH"
    || $type eq "TCP_SWAPFAIL"){ # fetched from external source
    $line_new = "$hosts{$host} - - [$www_date] \"$htype $url HTTP/1.0\" 200 $size\n";
    print PROXY_FILE "$line_new";
    $sizeproxied += $size;
    $totalproxied += 1;
}
elsif( $type eq "TCP_HIT" || ($type eq "TCP_IFMODSINCE" && $size >= 220) ){
    # served from the cache
    if( $size != 0 ) {
        $line_new = "$hosts{$host} - - [$www_date] \"$htype $url HTTP/1.0\" 200 $size\n";
        print CACHE_FILE "$line_new";
        $sizecached += $size;
        $totalcached += 1;
    }
}
}
# pad the progress bar out to exactly 50 marks
while( $hash_counter < 50 )
{
    $hash_counter++;
    printf STDERR '#';
}
printf STDERR "\n";
printf STDERR "\n";
close(LOG_FILE);
close(PROXY_FILE);
close(CACHE_FILE);
printf STDERR "%s lines processed.\n\nTotal hits: Proxy %s requests, Cache %s requests.\n\n",
    &commas($linecount),&commas($totalproxied),&commas($totalcached);
printf STDERR "Total sizes: Proxy %s bytes, Cache %s bytes\n\n",
&commas($sizeproxied),&commas($sizecached);
if( ($totalproxied != 0 || $totalcached != 0) &&
    ($sizeproxied != 0 || $sizecached != 0) ){
printf STDERR "Hit rates: %s%% requests, %s%% bytes\n\n",
&commas( ($totalcached/($totalproxied+$totalcached)) * 100),
&commas( ($sizecached/($sizeproxied+$sizecached)) * 100);
}
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
$year += 1900;
$time_now = sprintf "%02d:%02d:%02d on %d %s %d",
    $hour, $min, $sec, $mday, $longmonths{$mon}, $year;
printf STDERR
"==============================================================\n";
printf STDERR "native2common.pl finished on $logfile at $time_now.\n";
printf STDERR
"==============================================================\n";
exit(0);
sub commas {
local($_)=@_;
$_ = sprintf "%ld", $_;
1 while s/(.*\w)(\w\w\w)/$1,$2/;
$_;
}
------------( end native2common.pl )-----------
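If you only want to see the field mapping without the progress bar and
DNS lookups, the core of the conversion can be sketched for a single
line as below. This is a minimal illustration, not a replacement for
the script: the function name native_to_common and the sample log line
are made up for the example, the timestamp is rendered in GMT with a
fixed "+0000" instead of using $gmtoffset, and it assumes the first
seven whitespace-separated fields are time, elapsed, client, code,
size, method and URL, as in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: turn one native-format line into a
# (squid result type, common-log-format line) pair.
sub native_to_common {
    my ($line) = @_;
    my ($time, $elapsed, $host, $codes, $size, $method, $url) =
        split ' ', $line;
    return unless defined $url;              # short or malformed line

    my ($type) = split m{/}, $codes;         # e.g. TCP_MISS from TCP_MISS/200

    # Assumption: report GMT so the output does not depend on the local zone.
    my ($sec, $min, $hour, $mday, $mon, $year) = gmtime(int $time);
    my $date = sprintf '%02d/%s/%04d:%02d:%02d:%02d +0000',
        $mday, $MONTHS[$mon], $year + 1900, $hour, $min, $sec;

    # Same status mapping as the script: 401 for denied, 200 otherwise.
    my $status = $type eq 'TCP_DENIED' ? 401 : 200;

    return ($type,
        qq{$host - - [$date] "$method $url HTTP/1.0" $status $size});
}

# Made-up sample line for illustration only.
my ($type, $clf) = native_to_common(
    '840000000.123 456 10.0.0.1 TCP_MISS/200 1234 GET http://www.example.com/');
print "$type: $clf\n";
```

The returned type is what decides whether a line belongs in the proxy
(miss) log or the cache (hit) log; the size test on TCP_IFMODSINCE used
by the full script is left out here to keep the sketch short.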
Cheers,
Marty.
-------------------------------------------------------------------------
Martin Gleeson Webmeister | http://www.unimelb.edu.au/%7Egleeson/
Information Technology Services | Email : gleeson@unimelb.edu.au
The University of Melbourne, Oz. | Opinions : Mine, all mine.
"I hate quotations." -- Ralph Waldo Emerson, Journals (1843)
-------------------------------------------------------------------------
Received on Mon Aug 19 1996 - 14:41:18 MDT