On 12/02/2012 5:05 p.m., Pieter De Wit wrote:
> <snip>
>> * the parsing bottleneck gets crunched several times: on first 
>> arrival, in the ICAP server, and on return to Squid,
>> * the ICAP server bypass optimization can't be used since quota needs 
>> to measure every byte,
>> * tunneled data does not get sent to ICAP services,
>>
>> Not exactly perfect service, but it offers the most complete quota 
>> control without adding complexity to Squid.
>>
>> eCAP might be slightly better. It still runs inside Squid and has 
>> some processing overhead, but should reduce the parse problems and 
>> network delays involved with ICAP.
>>
>>>
>>> Pointers to reading URLs are more than welcome, and so are examples 
>>> of libicapapi :)
>>
>> Hopefully someone else knows some then, because I don't :(
>>
>> Amos
> Hi Amos,
>
> You said that you proposed some work a while ago, would you mind 
> sharing that? I gave the network thing some thoughts and I can see how 
> the delay would hurt squid. I kept on comparing it to milters, but 
> these don't mind a few ms delay, email is a lot less interactive.
>
> The thought process I am going with is something along the lines of a 
> process that is "spoken" to, like ecap perhaps, via pipes or a lib or 
> some such. This process will be notified based on the following:
>
> (* - Request, **-Reply)
>
> * I would like to go to protocol://site
> ** Is there quota left to allow this, if the user has 0 quota left, 
> block the request, no use
> * The server said the object is X bytes long, can I continue to 
> download it
> ** Yes, there is quota. The problem comes in if the server didn't give 
> a length, if that is the case, perhaps only allow 1024 bytes until his 
> quota runs out. There is also the problem if the server said the 
> object is bigger than it really is...
> * Can I send the following 1024 bytes
> ** Yes, there is quota.
>
> At any given step, if the quota runs out, the connection is aborted. This 
> will involve some tie in with the FD struct that you guys have 
> already. I do recall myself and Alex having a chat about this. I 
> referred to it as "hooks" into the FD struct. I *think* the talk about 
> "hooks" in the FD struct was aborted because it didn't add enough 
> value at the time, or real life caught up to me or or or :)
The download, even if of known length, can be aborted at any time, and the 
backend system may change the quota at any time as well.
So IMO the best idea is to collapse the requests all down to a single request 
asking for N bytes and passing along any parameters which the quota 
backend needs.
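
A minimal sketch of that collapsed model, assuming an in-memory backend keyed by client. The names `QuotaBackend`, `request()` and `relay()` are hypothetical and exist only for illustration; they are not Squid APIs. Every step of a transfer becomes the same question, "may I have N bytes?", and a grant smaller than N means the quota ran out.

```python
from dataclasses import dataclass, field


@dataclass
class QuotaBackend:
    """Hypothetical in-memory quota store (not a Squid API)."""
    remaining: dict = field(default_factory=dict)  # client key -> bytes left

    def request(self, key, nbytes, **params):
        """Ask for nbytes on behalf of key and return the bytes granted.

        params stands in for whatever extra details a real backend needs;
        this sketch ignores them.
        """
        left = self.remaining.get(key, 0)
        granted = min(nbytes, left)
        self.remaining[key] = left - granted
        return granted


def relay(backend, key, chunks):
    """Forward chunks until the source ends or the quota runs out.

    Returns the number of bytes forwarded. No object length is needed,
    so a missing or wrong length from the server does not matter.
    """
    sent = 0
    for chunk in chunks:
        granted = backend.request(key, len(chunk))
        sent += granted
        if granted < len(chunk):
            break  # quota exhausted mid-transfer: abort here
    return sent
```

Because the backend is asked again for every chunk, a quota changed between two requests takes effect on the next one.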
The basic idea was started here:
   http://bugs.squid-cache.org/show_bug.cgi?id=1849
Looking back at the discussion thread, it was started by you in Feb 2009. 
The model description is here:
   http://marc.info/?l=squid-dev&m=123570800116923&w=1
Although it seems I sent you something in private before that with more 
details. Sorry, that mail is gone now.
The Measurement Factory have since created the client_delay_pool part of 
it, but without any helper hooks. So the current code does only per-second 
capping. That is fully controllable already with per-request limitations 
and speeds. What is still missing is a helper API hook that sets the client 
DB quota field values and updates them when exhausted.
The big case that is left is fixed-size quotas that run down. No need 
for lookups with details from particular headers or such at this point.
>
> Based on this, I would like to re-float the idea of "hooks" in the FD 
> struct.  From the top of my head, one would have modules that expose 
> certain function/procedures:
The FD struct is on the hitlist for erasure or at least removing 
anything that is not particularly directly related to the FD value. The 
Comm layer has been restructured in Squid-3.2+ into a set of dynamically 
created listener Jobs (TcpAcceptor) which spawn traffic handler 
AsyncCalls based on the http(s)_port settings. The hooks would be best 
added into the call sequence and run out of those traffic handler 
functions. Incidentally that would be...
>
> OnClientConnect (source_ip,source_port,target_ip,target_port);
This would be httpAccept(), httpsAccept() in client_side.cc where the 
client DB entry is created/updated. It would need the config settings to 
handle being limited to the TCP level details available here, with no 
request details.
The main idea behind using a helper was that we can completely avoid 
the work of figuring out generically useful config directives. Just pass 
the TCP details to the helper and let the admin decide which are used 
and how.
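
A rough sketch of what such a helper could look like, assuming a line-based exchange over stdin/stdout. The request format (`src_ip src_port dst_ip dst_port`), the reply format (`OK quota=<bytes>` or `ERR`) and the `QUOTAS` table are all assumptions made up for this example, not an existing Squid helper protocol. The point is only that the helper receives every TCP detail and the admin's policy picks which ones matter.

```python
import sys

# Hypothetical admin policy: byte quota per client IP (not a Squid config format).
QUOTAS = {"192.0.2.10": 50_000_000}
DEFAULT_QUOTA = 0  # unknown clients get nothing in this sketch


def answer(line, quotas=QUOTAS):
    """Turn one 'src_ip src_port dst_ip dst_port' line into a reply line."""
    fields = line.split()
    if len(fields) != 4:
        return "ERR"  # malformed request line
    src_ip = fields[0]  # this policy only looks at the source IP
    return f"OK quota={quotas.get(src_ip, DEFAULT_QUOTA)}"


def serve(infile=sys.stdin, outfile=sys.stdout):
    """Write one reply per request line, flushing so the caller never stalls."""
    for line in infile:
        outfile.write(answer(line) + "\n")
        outfile.flush()
```

Run as a standalone process, the script would end with a call to `serve()`. A policy keyed on the destination port instead would only change `answer()`.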
> OnClientRequest (URL);
> OnClientRequestContent (content,size,offset);
The code structure allows for a hook in doCallouts(), after the request 
headers are fully received and parsing is complete. The earlier processing 
is locked inside some annoying loops. I'm hoping to kill those, but that 
will take a while. For now we are stuck with doCallouts() being the 
start and end of all request processing.
> OnClientResponse (URL,size);
> OnClientResponseContent (content,size,offset);
Squid offers http.cc processRequest() for hooks after the response 
headers have been parsed.
> OnClientDisconnect (<not sure>);
>
> I will outright say, I have no clue how modules work (thinking about 
> apache etc) and these are shamelessly based on my Delphi XP with Objects.
The hardest part is making the hook on quota runout work cleanly. Have a 
good look at client_db.cc for how the "quota" stuff in there works already.
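
To make the proposed hook set and the runout case concrete, here is a purely illustrative sketch. The class `QuotaHooks` and its methods are hypothetical, named after the callbacks listed in this thread; none of them exist in Squid, and this does not reflect how client_db.cc works. Each hook returns True to continue and False to abort.

```python
class QuotaHooks:
    """Hypothetical per-client hook set with a fixed quota that runs down."""

    def __init__(self, quota_bytes):
        self.left = quota_bytes

    def on_client_connect(self, source_ip, source_port, target_ip, target_port):
        # Only TCP-level details exist here; no request has been read yet.
        return self.left > 0

    def on_client_request(self, url):
        # Refuse up front when nothing is left.
        return self.left > 0

    def on_client_response(self, url, size):
        # size may be None or wrong, so it is advisory only in this sketch.
        return self.left > 0

    def on_client_response_content(self, content, size, offset):
        # Charge the bytes actually seen; this is where runout happens.
        if size > self.left:
            self.left = 0
            return False  # quota ran out: caller aborts the transfer
        self.left -= size
        return True

    def on_client_disconnect(self):
        # Nothing to release in this in-memory sketch.
        pass
```

The sketch charges only in the content hook, so a transfer can start with a tiny remaining quota and be cut off at the first chunk that does not fit.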
>
> Cheers,
>
> Pieter
>
> P.S. Might be worth starting a new thread perhaps ?
Same topic though. Rename?
Amos
Received on Sun Feb 12 2012 - 06:18:19 MST