On 1/12/2012 1:31 p.m., Henrik Nordström wrote:
> On Fri 2012-11-30 at 15:30 -0700, Alex Rousskov wrote:
>
>>      Squid is sending POST requests on reused pinned connections, and
>> some of those requests fail due to a pconn race, with no possibility for
>> a retry.
> Yes... and we have to for NTLM, TPROXY and friends, or they get into a
> bit of trouble from connection state mismatch.
>
> If sending the request fails we should propagate this to the client by
> resetting the client connection and let the client retry.
It seems to me we are also forced to do this for ssl-bumped connections.
  * Opening a new connection is the wrong thing to do for server-first
bumped connections, where the new connection MAY go to a completely
different server than the one whose certificate the connection was
bumped with. We control the IP:port we connect to, but we cannot
control whether IP-level load balancers exist on the path.
  * client-first bumped connections do not face that lag, *BUT* there is
no way to identify them at forwarding time separately from server-first
bumped ones.
  * we are pretending to be a dumb relay - which offers the ironclad
guarantee that the server at the other end is a single TCP endpoint
(DNS uncertainty exists only during the initial setup; once connected,
either all packets reach *an* endpoint or the connection dies).
We can control the outgoing IP:port details, but have no control over
the existence of IP-level load balancers which can change the
destination server underneath us. Gambling on the destination not
changing when retrying an outbound HTTPS connection for intercepted
traffic would re-open at least two CVE issues that 3.2 is supposed to
be immune to (CVE-2009-0801 and CVE-2009-3555).
Races are also still very possible on server-bumped connections if for
any reason it takes longer to receive+parse+adapt+reparse the client
request than the server is willing to wait. Remember we have all the
slow trickle arrival of headers, parsing, adaptation, helpers and
access controls to work through before the request gets to use the
pinned server conn. For example, Squid is extremely likely to lose
closure races on a mobile network when some big event is on that
everyone has to google/twitter/facebook about, while every request gets
bumped and sent through an ICAP filter (think the BBC during the London
Olympics).
>
>> When using SslBump, the HTTP request is always forwarded using a server
>> connection "pinned" to the HTTP client connection. Squid does not reuse
>> a persistent connection from the idle pconn pool for bumped client
>> requests.
> Ok.
>
>>   Squid uses the dedicated pinned server connection instead.
>> This bypasses pconn race controls even though Squid may be essentially
>> reusing an idle HTTP connection and, hence, may experience the same kind
>> of race conditions.
> Yes..
>
>> However, connections that were just pinned, without sending any
>> requests, are not "essentially reused idle pconns" so we must be careful
>> to allow unretriable requests on freshly pinned connections.
> ?
A straight usage counter is definitely the wrong thing to use to
control this, whether or not you agree with us that re-trying outbound
connections is safe after guaranteeing the client (with an encryption
certificate, no less) that a single destination has been set up. What
is needed is an idle timeout of suitable length plus a close handler.
  Both of which, for bumped connections, should trigger un-pinning and
abort the client connection. If those timeouts are not being set on
server-bumped pinned connections then that is the bug and it needs to
be fixed ASAP.
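To make that concrete, here is a rough sketch of the intended shape in
plain illustrative C++ - this is not Squid's actual Comm API, and every
name in it (PinnedConnection, ClientConnection, idleLimit,
abortWithReset) is hypothetical:

    // Illustrative only: the idle timeout and the close handler both
    // funnel into un-pinning plus a client-visible abort - never a
    // silent retry on a fresh server connection.
    #include <chrono>
    #include <functional>
    #include <iostream>
    #include <utility>

    using Clock = std::chrono::steady_clock;

    struct ClientConnection {
        void abortWithReset() {             // stands in for a TCP RST
            std::cout << "client conn reset; the client may retry\n";
        }
    };

    struct PinnedConnection {
        bool pinned = true;
        Clock::time_point lastUse = Clock::now();
        std::chrono::seconds idleLimit{60}; // hypothetical tuning knob
        std::function<void()> closeHandler; // fired on remote close

        bool idleExpired() const {
            return Clock::now() - lastUse > idleLimit;
        }
    };

    void unpinAndAbort(PinnedConnection &server, ClientConnection &client) {
        server.pinned = false;              // un-pin the server conn
        client.abortWithReset();            // surface the failure instead
    }

    int main() {
        ClientConnection client;
        PinnedConnection server;
        server.closeHandler = [&] { unpinAndAbort(server, client); };

        // Either trigger - the idle timeout checked before reuse, or a
        // remote close noticed mid-lag - takes the same path.
        if (server.idleExpired() || true /* remote close observed */) {
            // detach the handler before running it, as an event loop would
            if (auto onClose = std::exchange(server.closeHandler, nullptr))
                onClose();
        }
    }

The point is that both triggers end in the same un-pin + client-abort
path; no code path on a pinned connection leads to a retry.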
The issue is not that the conn was used then pooled versus pinned. The
issue is the asynchronous period between the last and the current
packet on the socket - we have no way to tell whether that gap has
caused problems (crtd, adaptation or ACL lag might be enough to lose
some race with NAT timeouts), and it does not matter whether the past
use was the SSL exchange (server-bump only) or a previous HTTP data
packet. I agree this is just as true for bumped connections which were
pinned at some unknown time earlier as it is for connections pulled out
of a shared pool and last used at some unknown time earlier. Regardless
of how the persistence was done they *are* essentially reused idle
persistent connections. All the same risks/problems apply, but whether
a retry or an alternative connection setup is possible differs greatly
between the traffic types - with intercepted traffic (of any source)
the re-try is more dangerous than informing the client via an aborted
connection.
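Reduced to code, the policy split being argued here looks roughly like
this (a toy sketch; the flags and names are hypothetical, not Squid
internals):

    #include <iostream>

    struct FailedRequest {
        bool intercepted; // TPROXY/NAT or bumped traffic of any origin
        bool retriable;   // safe to resend: idempotent, body still buffered
    };

    enum class Action { RetryFreshConnection, AbortClientConnection };

    // Intercepted traffic never gambles on the destination staying put;
    // everything else may retry only if the request itself is retriable.
    Action onPinnedSendFailure(const FailedRequest &req) {
        if (req.intercepted || !req.retriable)
            return Action::AbortClientConnection;
        return Action::RetryFreshConnection;
    }

    int main() {
        const FailedRequest bumpedPost{true, false};
        std::cout << (onPinnedSendFailure(bumpedPost) == Action::AbortClientConnection
                          ? "abort client\n"
                          : "retry on a fresh connection\n");
    }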
>
>> The same logic applies to pinned connections outside SslBump.
> Which is quite likely the wrong thing to do. See above.
>
> Regards
> Henrik
>
Amos