Tatsuhiko Miyagawa wrote:
> I was looking at perl5110delta and surprised (and a bit upset) to see
> the \d \w \s changes mentioned:
>
> I toyed with a small piece of code, and it seems it's not working as
> specified in the delta anyway:
> http://gist.github.com/200900
>
> So apparently the delta is not correct, or delta is trying to specify
> what *will* be changed but not done yet?
>
Yes, the delta does not describe the current behavior. It gives the
current plan, though, so that should be what eventually happens.
> Anyway, I have tons of scripts that rely on \d matching Japanese
> numbers and \s matches with full-width space etc. Being able to have a
> pragma to enable/disable the new behavior would be very nice. (I
> understand I can start rewriting those \d to like \p{IsDigit} to be
> forward compatible, though)
>
Note that the 'Is' is optional, so \p{Digit} works too. The chart in the
delta gives the mappings for \s and \w as well. If you can accept a
vertical tab in \s, then \p{Space} is shorter.
There are plans for a pragma for other unicode incompatibilities, and a
git branch that includes the beginnings of one: "use legacy". I had
thought that these changes could be controlled by a pragma, and I hope
that it is this one.
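As a rough illustration of the \p{...} spellings mentioned above, here is
a minimal sketch. The helper names is_unicode_digits and is_unicode_space
are hypothetical, and the sample characters (full-width digits U+FF11 to
U+FF13, ideographic space U+3000) are only assumed stand-ins for the kind
of Japanese input described in the quoted message.

```perl
use strict;
use warnings;

# Hypothetical helpers (not taken from the delta): explicit Unicode
# properties keep matching non-ASCII characters whatever \d and \s do.
sub is_unicode_digits { return $_[0] =~ /\A\p{Digit}+\z/ ? 1 : 0 }  # 'Is' prefix omitted
sub is_unicode_space  { return $_[0] =~ /\A\p{Space}+\z/ ? 1 : 0 }  # also accepts vertical tab

my $fullwidth_123 = "\x{FF11}\x{FF12}\x{FF13}";  # full-width digits 1, 2, 3
my $ideographic   = "\x{3000}";                  # full-width (ideographic) space

print "full-width digits match \\p{Digit}\n"  if is_unicode_digits($fullwidth_123);
print "full-width space matches \\p{Space}\n" if is_unicode_space($ideographic);
print "vertical tab matches \\p{Space}\n"     if is_unicode_space("\x0B");
```

Spelling the property out this way should not depend on how the \d and \s
question is finally settled, which is the point of the rewrite.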
> On Thu, Oct 1, 2009 at 6:09 PM, karl williamson <public@khwilliamson.com> wrote:
>> demerphq wrote:
>>> 2009/9/30 karl williamson <public@khwilliamson.com>:
>>>> I had thought in our discussion last year that we had determined that
>>>> these should match only in the ASCII range. And so, I thought that when Yves
>>>> flipped the switch on the \p{Posix} matches, that these would change as
>>>> well, but that isn't the case:
>>>> perl -E "say chr(0x2028) =~ /\s/"
>>>> 1
>>>>
>>>> in blead.
>>> I'm inclined to say it just slipped me by. I'll poke it with a stick
>>> when I get a chance.
>>>
>>>> If I'm wrong about the agreement, I would like to start another
>>>> discussion, and my initial position is that they should only match in the
>>>> ASCII range.
>>> Agreed.
>> Just to be precise about it, I neglected to mention that my statement was
>> meant only to apply in the absence of a "use locale", and whatever the base
>> C library routines do on an EBCDIC system. I wasn't advocating changing the
>> behavior under those circumstances.
>>
>
>