On 07/30/2013 08:03 PM, Alex Rousskov wrote:
> On 07/30/2013 10:04 AM, Amos Jeffries wrote:
>> On 31/07/2013 3:03 a.m., Tsantilas Christos wrote:
>>> 2) If configuration_includes_quoted_values is set to "on" (new style
>>> enabled) and token ends on a ( character, consider it as function name.
>>> If the token is "parameters" return FunctionParameters type else return
>>> FunctionNameUnknown type and parsing fails
>
>
>> This above logic still means that we are requiring all regex patterns to
>> be quoted strings whenever they contain () brackets around the final
>> segment of pattern.
>
> To be more accurate with regard to intent, the above implies that folks
> should not use new syntax (configuration_includes_quoted_values on) with
> regular expressions for now [until we add proper RE support].
>
> Quoting REs using regular strings is too painful and error prone. It is
> not recommended.
>
>
>> => If we are going to make any regex require special quoting, then I
>> think it worthwhile being consistent and making them all require quoting.
>> Are we agreed that making regex "-quoted is a good idea?
>
> We need special syntax for RE IMO. I like what Perl does because it is
> m/simple/ and m_at_flexible@, but perhaps there is a better way.
>
>
>> The '\' escape is already used internally by regex and layering multiple
>> levels of \-escaping gets nasty quite quickly.
This is true...
Maybe we can limit escaping to a set of characters.
For example if single quotes used, escape only the ' character and if
double quotes used allow escaping " % and $ characters
>
> Agreed.
>
>
>
>>> 3) If configuration_includes_quoted_values is set to "off" (new style
>>> enabled) and token ends on a (, if the token is "parameters" return
>>> FunctionParameters type, else do not end the token on '(' but to the
>>> next whitespace character.
>>> For example for the string "test(test) more" will return as token the
>>> "test(test)"
>>
>> config set to "off" is legacy parser. Typo in your description?
Yes!
>>
>> If quoted-values is OFF. I would be expecting to get the token...
>> "test(test) ... notice the missing end-quote since start-quote should
>> have been ignored and treated as part of the single word token.
>
> I agree with Amos here. Legacy parsers should continue function as
> before quoted strings changes if possible. We do not even need to (and
> perhaps should not) support parameters() syntax in a legacy parser. Is
> it possible to give folks true legacy behavior?
In my example the quotes should not exist. What I want to say in my
example is that for the following line:
test(test) more
The NextToken will return test(test).
However Amos refers to an other case. For the following line:
"Simple Tokens"
we may want to retrieve the token "Simple
Do we have any example where this is required? (Not for regex, for regex
we have an exception...)
>
>
>>> For the new style if someone want to use '(' character on a regex for
>>> example, he should use quotes:
>>> 'test(.*/)'
>>> "test(.*/)"
>
> Double quotes do not work well because REs use backslashes of their own.
> It becomes too messy as Amos noted above. I think we should not support
> REs in a new parser until we add Perl-like RE quoting (or better).
OK. This is something we can do it.
We can add a method ConfigParser::NextWord which will return a raw word
which will consist by any non whitespace character. Whitespaces can be
escaped.
Or use an On/Off global variable which enables/disables for the next
token this behaviour....
>
>
>>> If someone does not want interpret macros he should use single quotes:
>>> 'test(.*)\/$'
>
>
>> I was going to suggest removing the single-quoted strings support as
>> well to avoid letting people quote randomly with either. But with the
>> regex problems I think we should leave that in as the method to prevent
>> \-unescaping and document that regex patterns be single-quoted strings
>> while other tokens be double-quoted.
>
> I am OK with leaving single-quoted string support in provided we agree
> on how backslashes are handled inside single quotes. If they are ignored
> (as all other special characters), then please note that REs (and other
> single-quoted tokens) will not be able to contain single quotes. If they
> are not ignored, we have a problem with REs as with double quotes...
>
If we limit the escaping to some special chars? Can this work?
>
> Cheers,
>
> Alex.
>
>
Received on Tue Jul 30 2013 - 17:46:57 MDT
This archive was generated by hypermail 2.2.0 : Wed Jul 31 2013 - 12:00:07 MDT