Updated patch attached for review.
On 27/11/2012 9:10 a.m., Alex Rousskov wrote:
> On 11/24/2012 08:27 AM, Amos Jeffries wrote:
>
>> This stage of the helper reply protocol adds kv-pair support to the
>> url_rewrite_helper interfacefor URL redirect and rewriteoperations.
>>
>> It uses the new Notes objects and kv-pair field added by the stage 2
>> helper protocol instead of parsing the 'other' field.
>>
>>
>> The response syntax for URL helpers becomes:
>>    [channel-ID SP ] result [ SP kv-pair ... ] [ SP other] EOL
>>
>> * 'other' field is now deprecated and will be ignored/discarded on any
>> response containing a result code field.
>>
>>
>> When result code "OK" is presented by the helper several kv-pairs are
>> reserved to control Squid actions:
>>
>> * "rewrite-url=" is added to return a re-written URL.
>>   - When this key is presented the URL is re-written to the new one by
>> Squid without client interaction.
>>   - The 'url' keys presence will override this key.
>>
>> * "url=" is added to return the redirected-to URL.
>>   - When this key name is presented an HTTP redirect response is
>> generated for the client.
>>   - This keys presence overrides the 'rewrite-url' key actions.
>>
>> * "status=" is added to hold the HTTP status code for use in redirect
>> operations.
>>   - This field is optional and status is no longer required for marking
>> redirect actions.
>>   - If no redirect status is provided Squid will assign one (currently
>> the default is 302, that may change in the future).
>>   - This key is only relevant when 'url' key is also presented. In all
>> other uses it is currently ignored.
>> When result codes BH or ERR are presented by the helper no redirect or
>> rewrite action is performed and no kv-pair key names are reserved for
>> use at this time.
> but any key=value pairs in a BH or ERR response are still parsed into
> notes and can be logged, right?
Right.
> Can we tell folks which key names are guaranteed not to clash with
> future Squid key names? For example, we can say that no Squid-supported
> key name will end with "_". Or we could recommend using helper-specific
> prefixes for custom key names instead.
Done with stage-2 in the wiki protocol documentation.
> N.B. We should not disclaim using X-* names because some of those (e.g.,
> X-Client-Username) are a de facto standard and might be used by Squid
> here eventually.
>
>
> Please note that helper annotations will _not_ be sent to adaptation
> services.
K. Added to the commit message.
>
>> Any other keys MAY be sent on any response. The URL helper interface
>> makes no other use of them, but this patch does pass them on to the ALE
>> object for logging as transaction Notes. All kv-pairs returned by the
>> helper (including the url, stauts rewrite-url keys) are available for
>> logging via the %{...}note log format option.
> What about the "note" ACL? Can it be used to match against
> helper-provided annotations? Or will the helper annotations be
> inaccessible to it? Please document one way or another.
>
> If the note ACL will not work for helper annotations, we may need
> "stage4". It may be better to fix that now instead, but that is your
> call provided you will make it work.
ACLs can access them from the HttpRequest object at any point after 
being added by the helper response. 'note' ACL is no exception when you 
get around to adding it.
>
>> +    switch(reply.result) {
>> +    case HelperReply::Unknown:
>> +    case HelperReply::TT:
>> +        debugs(85, DBG_IMPORTANT, "ERROR: URL rewrite helper returned invalid result code. Wrong helper? " << reply);
>> +        break;
>> +
> Please add a debugs() line to print reply.result before the above switch
> unless we print that elsewhere already. Not all cases inside the switch
> have debugs() lines so it would be nice to know what the switch value was.
The full reply object {result,notes{...},other()} is being displayed at 
level 85,5 at the top of the function.
>
>> +        debugs(85, DBG_IMPORTANT, "ERROR: URL rewrite helper: " << reply);
> Please add "redirect_fail_count" value to the above debug line or add a
> new debug line for it. It seems critical in understanding what would
> happen after the error.
Done. Suffixed ", attempt 1 of 2" / ", attempt 2 of 2"
>
>> +// XXX update to handle the new format responses...
>> +// #1: redirect with a specific status code    OK status=NNN url="..."
>> +// #2: redirect with a default status code     OK url="..."
>> +// #3: re-write the URL                        OK rewrite-url="..."
> based on the patch description, the above is already implemented, no?
Ah yes. Moved into OK case as documentation there.
>
>> +                char * result = const_cast<char*>(statusNote->values[0]->value.termedBuf());
>> +                status = (http_status) atoi(result);
> Please make result constant if possible.
Done.
>
>> +                char * result = const_cast<char*>(reply.other().content());
>> +                http_status status = (http_status) atoi(result);
> Please make both result and status constant if possible.
>
Done.
>
>> +                char * result = const_cast<char*>(statusNote->values[0]->value.termedBuf());
>> +                http->redirect.location = xstrdup(urlNote->values[0]->value.termedBuf());
>> +            if (urlNote != NULL && strcmp(urlNote->values[0]->value.termedBuf(), http->uri)) {
> xstrdup() calls exit() on NULL strings. strcmp() might dump core. The
> above lines may assume that termedBuf() never returns NULL. Please
> double check that "value" member cannot be an "undefined" String in
> these contexts. It is probably prudent to check for that centrally, when
> parsing annotations, if we are going to rely on that -- refuse to add
> undefined Strings to annotation values.
Fixed with use of the stage-2 firstValue() accessor which guarantees "" 
at minimum.
>
>> +    Notes metaNotes;         // collection of meta notes associated with this request.
> I find it ironic that you insisted on renaming "meta headers" but are
> now adding "meta notes" :-). Please consider using "notes" or
> "annotations" instead. The annotations are already "meta".
This linking of notes into HttpRequest was one of the reasons I insisted 
on removing the 'headers' part of the naming. It would have been hell 
tracking the difference between mimeHeaders and metaHeaders and 
virginHeaders and adaptedHeaders.
I never had anything against the 'meta' part other than it looked weird 
on the other type name choices.
Renamed to "helperNotes" for now to clarify they are notes from the helpers.
> The description seem to imply that metaNotes contains all annotations
> associated with this request, but I do not think this is true because
> some annotations are stored elsewhere. This is messy so let's document
> exactly which annotations are using this particular storage (I think it
> is currently used just for helper annotations, right?).
It *will* contain all the notes from all the helpers - when I get around 
to altering the other helpers. But this sponsor had specific target 
requirements and a short timeframe I did not get to make the same update 
on the other interfaces code (yet).
> Finally, I am concerned that we are adding Notes to all HttpRequests
> when most HttpRequest objects do not need annotations. Currently, Notes
> are implemented using Vector<> which has low performance overhead when
> unused. I am worried that if we later replace Vector<> with std::vector
> or similar, the inclusion of Notes will cause a performance hit. What do
> you think? Should we use a pointer to Notes here instead?
>
I was hoping to avoide all the extra checking pointers need and the 
duplicated if-NULL-allocate each helepr interface will need to perform. 
But I see your point. Changed to a Notes*.
>
>> +     * Adds a set of notes from another notes list to this set.
>> +     * Create any new keys needed.
>> +     * If the key name already exists in list, add the given value to its set of values.
>> +     */
>> +    void add(const Notes &src);
> Please polish the description to indicate that pointers to src notes
> and/or pointers to individual note values are added. Modifying src after
> add() may change the added notes as well!
And vice versa.
I have dropped the new-key import so that future merges of other helper 
values for the key does not alter the original src objects key. But the 
values inside the keys seem to be safe enough to leave as references and 
save a bit on memory. More on that below...
>
>> +    uint8_t redirect_fail_count;
> New member missing a doxygen description.
Added.
>> --- src/HttpRequest.cc	2012-10-27 00:13:19 +0000
>> +++ src/HttpRequest.cc	2012-11-24 14:55:43 +0000
>> @@ -114,6 +114,7 @@
>>       peer_domain = NULL;		// not allocated/deallocated by this class
>>       vary_headers = NULL;
>>       myportname = null_string;
>> +    metaNotes.clean();
>>       tag = null_string;
> HttpRequest::init() does not need to call clean() but
> HttpRequest::clean() does!
Resolved the particular issue with move to pointer. Adding a delete to 
HttpRequest::clean() and NULL-initialize to the constructors to resolve 
the logic flaw.
>
>> @@ -221,6 +222,7 @@
>>       // XXX: what to do with copy->peer_domain?
>>   
>>       copy->myportname = myportname;
>> +    copy->metaNotes.notes = metaNotes.notes;
>>       copy->tag = tag;
>>   #if USE_AUTH
>>       copy->extacl_user = extacl_user;
> Hm.. The generated Notes assignment operator will not really clone
> individual notes inside the "notes" vector/array. It will just copy
> pointers to them. Is that OK because we never modify a Note? I am not so
> sure. Perhaps an older Note will be modified when the "note" directive
> is checked, affecting cloned requests that should be independent from
> it? Should we prohibit Notes assignment and add an explicit
> Notes::clone() method that is guaranteed to clone Notes rather than copy
> their pointers?
If you think there is a case against retaining this memory linking I can 
add a Notes::clone()? But AFAICS there is no existing need for it (maybe 
in future ...).
Notes are read/append-only by design. No write/alter for existing 
values. So I don't see any issue here. The most I expect we will do is 
erase Note::Pointer linkages to a particular request object. And not 
even doing that specifically yet.
The model is that each HttpRequest down the transaction chain inherits 
from the one it was cloned from a set of read/append-only notes. Any 
helper or ACLs called may read some notes for their parameters and 
append new notes until the last post-adaptation HttpRequest object at 
the end of the transaction moves the whole set into ALE for logging.
The keys might have been altered later to add new values, I have 
replaced that copy/reference with new(). But the values themselves are 
read-only, so seem to be sfae enough and I have left as a refcounted 
pointer copy.
>
>> +                    char *t = NULL;
>> +
>> +                    if ((t = strchr(result, ':')) != NULL) {
> This can be merged into one line and simplified. You may also be able to
> declare "t" constant.
>
>    if (const char * t = strchr(result, ':')) { ...
Done.
Amos
This archive was generated by hypermail 2.2.0 : Fri Nov 30 2012 - 12:00:18 MST