Re: [MERGE] c++-refactor HttpHdrCc

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Mon, 26 Sep 2011 17:00:26 -0600

On 09/26/2011 02:17 PM, Kinkie wrote:

> 12 days have passed, no further discussion. Does that mean Ok to merge?

I may have found a bug in the latest patch. Sent a review separately. I
think it answers your question below. Sorry it took so long.

Cheers,

Alex.

> In the meantime I've implemented full getters/setters for all valued
> headers, with explicitly defined "unset" values.
> Using bools to implement the other flags would probably be
> counterproductive as it'd baloon the code size even more.
>
> Two open points:
> - As I've mentioned, I've implemented a small const char*/lenght class
> to use as key in a map. Should it be extracted to be used also
> elsewhere? Any candidates for a name?
> - cleaning up the for-review changes.
>
> Thanks for any comments.
>
>
> On Wed, Sep 14, 2011 at 8:30 PM, Kinkie <gkinkie_at_gmail.com> wrote:
>> On Tue, Sep 13, 2011 at 6:11 PM, Alex Rousskov
>> <rousskov_at_measurement-factory.com> wrote:
>>> On 09/12/2011 05:38 PM, Kinkie wrote:
>>>> Hi,
>>>> the attached bundle refactors the HttpHdrCc into a c++ class,
>>>> attempting also to clean the code up. Main changes:
>>>> - 6 less global functions in protos.h
>>>> - 1 less struct in structs.h
>>>> - proper c++ construction/destruction semantics, standard MemPool integration
>>>
>>>> - name->id header parsing now done via std::map instead of iterating
>>>> over list: complexity should go from O(n) to O(log n) with STL-derived
>>>> libs.
>>>
>>> On the other hand, you now deallocate and allocate a new SquidString for
>>> every CC directive just to find that directive ID so the total cost may
>>> be not be lower. I think we can do much better, and even if we cannot or
>>> will not, I do not think we should commit the "optimization" part of
>>> this change until proper performance regression testing.
>>
>> Point taken. I've implemented a small helper class (basically a const
>> char*+length) to support the lookup in a very cheap way. I'd like to
>> get your feedback on it before trying to extend its use (it's
>> currently an in-cc-file definition).
>>
>>> And if you want to mix optimization with cleanup, here is one place
>>> where you can save quite a few cycles for many messages by not ignoring
>>> the return value of getList():
>>>
>>>> getList(HDR_CACHE_CONTROL, &s);
>>>>
>>>> - cc = httpHdrCcParseCreate(&s);
>>>> + cc=new HttpHdrCc();
>>>> + if (!cc->parseInit(s)) {
>>>> + delete cc;
>>>> + cc = NULL;
>>>> + }
>>
>> Done.
>>
>>>> +/** Http Cache-control header representation
>>>> + *
>>>> + * Store, parse and output the Cache-control HTTP header.
>>>
>>> In comments, please capitalize HTTP Cache-Control the way RFC 2616 does.
>>
>> Done.
>>
>>
>>>> + * Store, parse and output the Cache-control HTTP header.
>>>
>>> I do not think your HttpHdrCc class deals with the output. On the other
>>> hand, please consider converting httpHdrCcPackInto() and
>>> httpHdrCcUpdateStats() as well.
>>
>> I've so far chosen not to do it as it doesn't look strictly necessary
>> and it increases coupling.
>>
>>>> -HttpHeaderFieldInfo *CcFieldsInfo = NULL;
>>>> +/// Map an header name to its type, to expedite parsing
>>>> +typedef std::map<String,http_hdr_cc_type> HdrCcNameToIdMap_t;
>>>> +static HdrCcNameToIdMap_t HdrCcNameToIdMap;
>>>
>>> HdrCcNameToIdMap should be constant so that we do not accidentally
>>> modify it. You can make it a pointer and use similar-to-current
>>> initialization code or try to initialize statically.
>>
>> Are you sure we run such a risk? I've now changed it so that it is a
>> static data member with const key. Do you think it could be enough?
>>
>>>> + HdrCcNameToIdMap_t::iterator i;
>>>
>>> The above variable can be marked as const.
>>
>> Done.
>>
>>>> +static HdrCcNameToIdMap_t HdrCcNameToIdMap;
>>>
>>> Consider dropping the "Hdr" prefix since you are inside a HdrCc file
>>> already. This will also make it consistent with CcAttrs name.
>>
>> Done.
>>
>>>> void
>>>> +HttpHdrCc::clear()
>>>> +{
>>>> + mask=0;
>>>> + max_age=-1;
>>>> + mask=0;
>>>> + max_age=-1;
>>>> + s_maxage=-1;
>>>> + max_stale=-1;
>>>> + stale_if_error=0;
>>>> + min_fresh=-1;
>>>> + other.clean();
>>>> +}
>>>
>>> This duplicates the default constructor. Mask and max_age are cleared
>>> twice. Consider this implementation instead (which can be inlined):
>>>
>>> *this = HttpHdrCc();
>>
>> Done. Clever.
>>
>>>> + int32_t i;
>>>> + /* build lookup and accounting structures */
>>>> + for (i=0;i<CC_ENUM_END;i++) {
>>>
>>> The "i" variable should be local to the loop. Please consider adding
>>> spaces around operators.
>>
>> Done.
>>
>>>> + assert(i==CcAttrs[i].id); /* verify assumption: the id is the key into the array */
>>>> + HdrCcNameToIdMap[CcAttrs[i].name]=CcAttrs[i].id;
>>>
>>> Consider computing CcAttrs[i] once instead of three times and storing it
>>> as a local const reference.
>>
>> Done.
>>
>>>> + assert(i==CcAttrs[i].id); /* verify assumption: the id is the key into the array */
>>>
>>> Where do we actually use that assumption?
>>
>> HttpHeaderCcFields is not const. Its stat field member can be updated
>> (look for ++CcAttrs[type].stat.repCount). In other words, it doubles
>> as an array-type data-store, indexed on the header type numeric value.
>> This assertion validates that.
>>
>>>> - if (EBIT_TEST(cc->mask, type)) {
>>>> + if (EBIT_TEST(mask, type)) {
>>>> - EBIT_SET(cc->mask, type);
>>>> + EBIT_SET(mask, type);
>>>> - if (!p || !httpHeaderParseInt(p, &cc->max_age)) {
>>>> + if (!p || !httpHeaderParseInt(p, &max_age)) {
>>>> - cc->max_age = -1;
>>>> - EBIT_CLR(cc->mask, type);
>>>> + max_age = -1;
>>>> + EBIT_CLR(mask, type);
>>>> - if (!p || !httpHeaderParseInt(p, &cc->s_maxage)) {
>>>> + if (!p || !httpHeaderParseInt(p, &s_maxage)) {
>>>
>>> All of the above (and more) are essentially non-changes. Please consider
>>> making "cc" equal to "this" to avoid them for the purpose of the review,
>>> so that we can pinpoint the real changes. That hack should be removed
>>> before the final commit, of course.
>>
>> Ok.
>>
>>>> + * Created on: Sep 2, 2011
>>>> + * Author: Francesco Chemolli
>>>
>>> I would recommend removing that because there are many authors for that
>>> code. You have my permission to remove my Author tag in src/HttpHdrCc.cc
>>> as well.
>>
>> Done.
>>
>>>> +class HttpHdrCc
>>>> +{
>>>> +
>>>> +public:
>>>> + int32_t mask;
>>>> + int32_t max_age;
>>>> + int32_t s_maxage;
>>>> + int32_t max_stale;
>>>> + int32_t stale_if_error;
>>>> + int32_t min_fresh;
>>>> + String other;
>>>> +
>>>> + HttpHdrCc(int32_t max_age_=-1, int32_t s_maxage_=-1,
>>>
>>> Please move public data member after public methods. It would be nice to
>>> document them too, especially those that do not correspond to CC
>>> directives. Also, check whether we really need all of the above,
>>> especially mask, as public fields (more on that later below).
>>
>> Done. Documented "mask" and "other", the others should be straightforward.
>> public mask could be avoided if we do full getters/setters, but doing
>> that elegantly would be scope-balooning.
>>
>>>> + HttpHdrCc(int32_t max_age_=-1, int32_t s_maxage_=-1,
>>>> + int32_t max_stale_=-1, int32_t min_fresh_=-1) :
>>>
>>> Do we really all five of these constructors, including the one that
>>> allows for implicit conversion from int32_t to HttpHdrCc? I do not think
>>> so. Please leave just what we need and use "explicit" to prevent
>>> implicit conversions as needed.
>>
>> Overengineering. Changed to only the default contstructor (we need it
>> to set various fields to "-1" by default).
>> Ok for explicit.
>>
>>> As far as I can see, we just need the default constructor, but my quick
>>> check may be missing some cases:
>>>
>>>> $ fgrep -RI httpHdrCcCreate src
>>>> src/HttpHdrCc.cc:httpHdrCcCreate(void)
>>>> src/HttpHdrCc.cc: HttpHdrCc *cc = httpHdrCcCreate();
>>>> src/HttpHdrCc.cc: dup = httpHdrCcCreate();
>>>> src/http.cc: cc = httpHdrCcCreate();
>>>> src/protos.h:SQUIDCEXTERN HttpHdrCc *httpHdrCcCreate(void);
>>>> src/mime.cc: reply->cache_control = httpHdrCcCreate();
>>
>> Yes.
>>
>>>> + /** reset the structure to a clear state.
>>>> + *
>>>> + */
>>>> + void clear();
>>>
>>> The "clear state" is an undefined concept. Consider a more specific
>>> description such as:
>>> /// reset to the after-default-construction state
>>>
>>> (and there is probably a more elegant way of saying that).
>>
>> How about "reset all data members to default state"? Otherwise I'll
>> use your suggestion.
>>
>>>> + * \note: internal structures are not cleaned-up beforehand.
>>>> + * caller must explicitly clear() beforehand if he wants that
>>>> + */
>>>> + bool parseInit(const String &s);
>>>
>>> It is wrong that *Init() does not initialize. If this is not your fault
>>> and you do not want to fix it, then mark with XXX.
>>
>> That's a direct derivative of the name it has now.
>> I'll rename it to parse.
>>
>>>> + /** set the max_age value
>>>> + *
>>>> + * \param max_age the new max age. Values <0 clear it.
>>>> + */
>>>> + void setMaxAge(int32_t max_age);
>>>> +
>>>> + /** set the s-maxage
>>>> + *
>>>> + * \param s_maxage the new max age. Values <0 clear it.
>>>> + */
>>>> + void setSMaxAge(int32_t s_maxage);
>>>
>>>
>>> It feels wrong to have many public fields and [only] these two methods.
>>> We should probably hide all fields and provide the corresponding name()
>>> and name(int32_t) accessors and setters to those fields that need them.
>>
>> Direct refactoring artifact.
>> However, setSMaxAge is not used anywhere, so I'll remove it. Are you
>> sure it's needed to set up a full getter/setter structure for what is
>> in essence a glorified struct? It really feels like over-engineering.
>>
>>>> === modified file 'src/HttpHeader.h'
>>>> --- src/HttpHeader.h 2011-06-23 08:31:56 +0000
>>>> +++ src/HttpHeader.h 2011-09-12 22:54:38 +0000
>>>> @@ -37,6 +37,7 @@
>>>> #include "HttpHeaderRange.h"
>>>> /* HttpHeader holds a HttpHeaderMask */
>>>> #include "HttpHeaderMask.h"
>>>> +#include "HttpHdrCc.h"
>>>>
>>>>
>>>
>>> Is this addition needed? Please use "class HttpHdrCc;" pre-declaration
>>> instead if possible.
>>
>> Removed the include. Class forward declaration was already in place.
>> This unveiled a few other needed include-places which were indirectly
>> included.
>>
>> Attached is iteration 2 of the patch, build- and run-tested (not ready
>> for merge, but for a second validation round).
>>
>> Thanks
>>
>>
>> --
>> /kinkie
>>
>
>
>
Received on Mon Sep 26 2011 - 23:01:20 MDT

This archive was generated by hypermail 2.2.0 : Tue Sep 27 2011 - 12:00:03 MDT