Hey George,
First, thank you for the idea in general.
In more detail, I would like to give you a couple of notes:
Since Squid logs a lot of requests, the current log analysis tools
generate static pages, which is not a bad idea.
If you have a busy proxy and you need statistics for a 20MB+ log file,
it will take some time just to analyze all this data.
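To give a rough idea of what "analyzing" means here, below is a minimal
sketch in Python (not taken from any existing tool) that streams log
lines and sums the reply size per client. It assumes the default native
access.log format, where the third field is the client address and the
fifth is the size in bytes; the function name and the sample lines are
made up for illustration.

```python
from collections import defaultdict


def bytes_per_client(lines):
    """Sum reply bytes per client from Squid native-format log lines.

    Assumes the default native format:
    time elapsed client code/status bytes method URL ...
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        client, size = fields[2], fields[4]
        if size.isdigit():
            totals[client] += int(size)
    return dict(totals)


# Made-up sample lines, for illustration only.
sample = [
    "1353300000.123     45 192.168.0.10 TCP_MISS/200 1500 GET http://example.com/ - DIRECT/93.184.216.34 text/html",
    "1353300001.456     12 192.168.0.11 TCP_HIT/200 300 GET http://example.com/a.png - NONE/- image/png",
    "1353300002.789     30 192.168.0.10 TCP_MISS/200 500 GET http://example.com/b - DIRECT/93.184.216.34 text/html",
]
print(bytes_per_client(sample))
```

Because it accepts any iterable of lines, an open file object can be
passed directly, so the whole log never has to sit in memory. The time
cost of reading every line of a big file is still there, which is why
the static page approach exists.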
SNMP can be a bit tricky since it responds with the full counter per
client ("Amount of total HTTP traffic to this client", from the docs).
I think the only case where SNMP has more than the logs is a LIVE
session that has not ended yet: the SNMP target will then have data
that is not in the logs yet (I don't remember exactly).
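Since the SNMP value is a running total, per-interval numbers have to
be computed as the difference between two polls. A minimal sketch in
Python, assuming the totals were already fetched by some SNMP client
and put into plain dicts keyed by client address (the function name and
the dict layout are my own assumptions, not part of Squid or any SNMP
library):

```python
def counter_deltas(previous, current):
    """Return per-client traffic since the previous poll.

    Both arguments map a client address to its cumulative counter.
    If a counter went backwards (for example after a proxy restart),
    the current value is taken as the delta, as an assumption.
    """
    deltas = {}
    for client, total in current.items():
        before = previous.get(client, 0)
        deltas[client] = total - before if total >= before else total
    return deltas


# Hypothetical totals from two consecutive polls.
first_poll = {"192.168.0.10": 100, "192.168.0.11": 900}
second_poll = {"192.168.0.10": 250, "192.168.0.11": 20, "192.168.0.12": 40}
print(counter_deltas(first_poll, second_poll))
```

Clients that appear only in the second poll are counted from zero, and
clients that disappear between polls are simply dropped.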
The OS-level things are good, and most of them can be fetched from the
cache-mgr pages unless you want to know some stuff straight from the OS.
About the pages themselves: I have had some time to work with Rails,
which makes creating a lot of web pages very easy.
For me, between PHP and Ruby, Ruby is the default, and once you have
the logic of how the logs and other stuff will be handled I will be
happy to port it to Rails.
I do remember there was a German PHP tool for analyzing the squid
access.log on the fly, but it is not maintained anymore.
Best Regards,
Eliezer
On 11/19/2012 2:05 AM, George Machitidze wrote:
> Hello
>
> I've started development of open sourced Web UI for gathering stats
> for Squid proxy server and need your help to clarify needs and
> resources.
>
> Where it came from:
> Enterprises require auditing, reporting, configuration
> check/visibility and statistics. I can say that most of these things
> are easy to implement and provide in different ways, except reporting
> and stats. Additionally, there are some requirements in functionality
> and nice interface not met by currently available solutions that I've
> found. Also, the state of maintenance, future development etc. are very
> unclear and ineffective, but still acceptable or enough for _some_
> installations. If you know something that can do all this stuff -
> please let me know.
> So, I've decided to write everything from scratch, maybe taking
> some public-licensed parts from other projects.
>
> Architecture:
> Starting point is gathering stats, then we need to manipulate and
> store it, then we can add some regular jobs (will avoid this) and then
> we need to view this.
>
> Gathering data
> Available sources:
> 1. Logs, available via files or logging daemon (traffic, errors)
> 2. Stats available via SNMP (status/counters/config)
> 3. Cache Manager (status/counters/config)
> 4. OS-level things (footprint, processes, disk, cpu etc)
> [anything else?]
>
> This part will be done by local logging daemon, I won't use file
> logging for known reasons.
> BTW, a good starting point is log_mysql_daemon by marcello, available
> under the GPL, written in perl. Effective enough to start and load any
> data to DB - it's simple enough and took me 10-15 minutes to analyze
> the code, set up and configure.
>
> Data storage
> File-based logging is very ineffective and has several huge disadvantages:
> - Ineffective use of disk resources
> - Poor/no indexing
> - Logrotation/DWH/archiving
> - Not human readable, some parts need calculations anyway
> - etc
>
> For optimized storing and then viewing of data it's actually required
> to have a DB. As a first step I'll use MySQL, then will migrate the code
> to support PgSQL (and maybe others too) through a DB abstraction layer.
>
> We can store all of the access logs and also have some dynamically
> updated counters, because periodic jobs are very intensive and require
> time too.
>
> I don't want to put counter-updating code on the logging daemon, will
> try to use DB-side for that as it's done in log_mysql_daemon.
>
> If someone will need this data for monitoring purposes not available
> via SNMP/OS through Nagios/Cacti/Zabbix/whatever - I see no problem to
> do that too.
>
> Web UI
> Technologies: PHP/CSS/JS/Ajax etc
> PHP will select data from DB and generate pages accordingly.
>
> TODO:
> 1. Collect information about UI requirements - what users want to see
> and control
> 2. Define all the counters, logging variables for daemon part required
> for implementing first needs, according to P1
> 3. Define DB-side counters, sources
> 4. Check data types and length for DB for optimization
> 5. Continuous improvement
>
> Any involvement: information about user needs, suggestions,
> recommendations, coding, ideas are appreciated :)
>
> I chose GitHub for hosting the project, will write project docs and
> plans there. Currently I am collecting very detailed information on
> user needs.
>
> Thanks
>
> Best regards,
> George Machitidze
>
-- 
Eliezer Croitoru
https://www1.ngtech.co.il
IT consulting for Nonprofit organizations
eliezer <at> ngtech.co.il

Received on Mon Nov 19 2012 - 00:36:07 MST