
Re: PATCH: [perl #58182] partial, "The Unicode Bug". Add unicode semantics for \s, \w

From: karl williamson
Date: May 19, 2010 22:08
Subject: Re: PATCH: [perl #58182] partial, "The Unicode Bug". Add unicode semantics for \s, \w
Message ID: 4BF4C3D5.5070409@khwilliamson.com

Curtis Jewell wrote:
> On Wed, 19 May 2010 22:51 +0100, "Paul LeoNerd Evans"
> <leonerd@leonerd.org.uk> wrote:
>> On Tue, May 11, 2010 at 12:54:01PM -0600, karl williamson wrote:
>>> These commits also add regex modifiers /u (unicode), /l (locale), and /t
>>> (traditional).  /a is not part of this patch.  I have made up the term
>>> "Matching mode" to describe this.  I'm open to a better term, if you can
>>> think of one.
>> It may perhaps be far too late to reconsider, but I'm not sure I like
>> these notations. These are three mutually-exclusive settings along one
>> axis, they are not three independent settings on three different axes,
>> such as /l vs /g.
>>
>> Would it not make more sense to group them up under a single /u flag,
>> something of the following:
>>
>>  m/Unicode on/u
>>  m/Unicode off/u0
>>  m/Unicode if locale says/ul
>>  m/Unicode traditionally/ut
> 
> We do have the assumption that capital letters oppose their lowercase
> counterparts, as far as I can tell, so that the first two would be
> 
> m/Unicode on/u
> m/Unicode off/U
> 
> (I'm making the assumption we're adding a /U with that /u.)
> 
> The question is, are the other two on an axis where we can say "/l
> applies only if /u, and /uL would be the equivalent of the proposed /t
> option?"
> 
> (i.e. is locale/traditional a two state, rather than locale/something
> else/traditional being 3-state?)

It is tri-state, with each value excluding the other two, and maybe a 
fourth value will be added to make it quad-state.
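
To illustrate the tri-state point, here is a minimal sketch in plain Perl that treats the three proposed modifiers as one setting with mutually exclusive values, rather than as three independent flags. The helper `matching_mode` and the `%MODE_NAME` table are hypothetical names made up for this example; they are not part of the patch, and the regex compiler would do this check internally rather than in Perl code.

```perl
use strict;
use warnings;

# Hypothetical table: the three modifiers proposed in this thread,
# all of which select a value for the same "matching mode" setting.
my %MODE_NAME = (
    u => 'unicode',
    l => 'locale',
    t => 'traditional',
);

# Hypothetical helper: given a string of modifier letters (e.g. "gu"),
# return the name of the selected matching mode, or undef if none of
# the mode letters is present. Dies if two different modes are given,
# because each value excludes the other two.
sub matching_mode {
    my ($flags) = @_;
    my %seen;
    my @modes = grep { exists $MODE_NAME{$_} && !$seen{$_}++ }
                split //, $flags;
    die "modifiers /$modes[0] and /$modes[1] are mutually exclusive\n"
        if @modes > 1;
    return @modes ? $MODE_NAME{ $modes[0] } : undef;
}

print matching_mode('gu'), "\n";    # prints "unicode"

eval { matching_mode('ul') };
print $@;    # prints "modifiers /u and /l are mutually exclusive"
```

Other modifiers such as /g or /i pass through untouched because they sit on different axes; only the mode letters compete with one another. A fourth mode would be one more entry in the table.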
