Re: [RFC] Regular expression character classes and unicode.
From: demerphq
Date: November 27, 2008 04:30
Subject: Re: [RFC] Regular expression character classes and unicode.
Message ID: 9b18b3110811270430t1dc0e6c2rbd5ee32899bad338@mail.gmail.com
2008/11/27 karl williamson <public@khwilliamson.com>:
> Rafael Garcia-Suarez wrote:
>>
>> 2008/11/11 karl williamson <public@khwilliamson.com>:
>>>
>>> But there are problems with pragmas. As I've discovered, the charnames
>>> pragma goes away in an eval. That this should happen was not obvious to
>>> me, and I suspect not so to the average Perl programmer. It doesn't DWIM,
>>> and I'm not convinced it is the right thing to do. It means that
>>> complementing the default from release to release can cause programs to
>>> have to add pragma calls to their evals.
>>
>> To me it's a bug. Actually that might be caused by charnames storing
>> an arrayref in %^H. But I don't reproduce it with 5.10 (the script
>> below works). Do you have some test code ?
>>
>> use strict;
>> use warnings;
>> use 5.01;
>> use charnames 'greek';
>> say "\N{sigma} is Greek sigma";
>> eval { say "\N{sigma} is Greek sigma"; };
>> eval ' say "\N{sigma} is Greek sigma"; ';
>>
>>
> I can't get it to fail now either, except in what prompted my original
> statement:
> use charnames ':full';
> use Test::More tests => 2;
> like('A', '/\N{U+0041}/');
> like('A', '/\N{LATIN CAPITAL LETTER A}/');
>
> The first succeeds and the second fails. The response was I should be using
> qr// instead of single quotes. But it is odd that one works and one fails.

It is a consequence of the charnames pragma being lexically scoped.
Since you passed in a string, the pattern will be compiled in the
lexical scope of the Test::More/Test::Builder framework where the
pragma is apparently not in effect.

The \N{U+0041} style notation on the other hand doesn't require
charnames to be in effect, as it is just a fancy way to write \x{41}.

When passed in as a qr// object the compilation will have already been
done, in the scope of the pragma, which produces a compiled form that
no longer needs charnames. Thus when it is executed in
Test::More/Test::Builder it works as expected.
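
To illustrate the difference between the two forms, here is a minimal sketch. Hypothetical::Matcher is an invented stand-in for the test framework, not anything from Test::More or Test::Builder; it is only assumed to compile string patterns in its own scope, where charnames has not been loaded. Whether the string form succeeds depends on the perl version (as far as I know, later perls load charnames on demand), so the sketch makes no claim about that result.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the test framework: this block is compiled
# before, and outside the scope of, the "use charnames" further down.
{
    package Hypothetical::Matcher;

    sub matches {
        my ($string, $pattern) = @_;
        # A qr// object arrives already compiled; a plain string is
        # compiled here, in this package's lexical scope.
        my $re = ref($pattern) eq 'Regexp'
               ? $pattern
               : eval { qr/$pattern/ };
        return defined($re) && $string =~ $re ? 1 : 0;
    }
}

use charnames ':full';

# Compiled here, where charnames is in effect, so the name is resolved
# before the pattern ever reaches the other package.
my $precompiled = qr/\N{LATIN CAPITAL LETTER A}/;
my $as_string   = '\N{LATIN CAPITAL LETTER A}';

print "qr//:   ", Hypothetical::Matcher::matches('A', $precompiled), "\n";
print "string: ", Hypothetical::Matcher::matches('A', $as_string), "\n";
```

The qr// case does not depend on what is loaded in the package that finally runs the match, which is why passing qr// to like() is the more robust choice.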

I don't know why charnames isn't always enabled, or at least loaded on
demand. It might be worth digging in the archives to find out why it
wasn't done that way originally (probably backwards compatibility), but
it doesn't make a lot of sense to make it an error when it could
easily be converted to load on demand.

I think the new mode you are working on could always enable charnames
as well, and that would sort out that problem. :-)

> Earlier, before I converted to using Test, I was doing an eval myself and
> getting a failure message that came from these lines in regcomp.c:
> vFAIL2("Constant(\\N{%s}) unknown: "
> "(possibly a missing \"use charnames ...\")",
> when in fact I did have the appropriate "use charnames" in my program. That
> is why I surmised the problem was an eval.
I'd have to see the code to explain that one.
>
> I also know that in 5.8, I had to put the appropriate "use" statement inside
> my evals to get \K in re's (it was in a cpan module at the time) to work.
>
> So I don't know. I guess there are other factors involved.
>
> But, for my pragma enabling the latin1 semantics behavior, I was planning to
> use a bit in $^H, just like "use bytes" does. The documentation says an
> eval gets a copy of this scalar, pushed at its beginning. It says the same
> about %^H. Your comment above leads me to think this isn't how it works.

I think Rafael meant that %^H isn't a true hash. Internally it's
implemented differently, I think as some sort of linked list. It is
expected to have few values, and they aren't allowed to be complex:
strings or integers only, IIRC.

As for $^H, if there are hint bits available to be used then what you
described sounds fine. The purpose of %^H is to allow user-defined
pragmata and lexically scoped storage of more complex data than
bits. It's not unusual to find pointers to complex data structures
stored in it as integers (possibly stringified, not sure).
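
For reference, a user-defined pragma built on %^H can be sketched roughly as follows, in the style of the example in perlpragma. The package name Hypothetical::Pragma and its hint key are made up for illustration, and the sketch assumes perl 5.10 or later, where caller() exposes the hints hash as element 10 of its return list. Only a plain integer is stored, in line with the restriction described above.

```perl
use strict;
use warnings;

{
    package Hypothetical::Pragma;   # invented name, for illustration only

    # Called at compile time by "use"/"no"; stores a simple integer
    # in the hints hash of the scope currently being compiled.
    sub import   { $^H{'Hypothetical::Pragma/enabled'} = 1 }
    sub unimport { $^H{'Hypothetical::Pragma/enabled'} = 0 }

    # At run time, read back the hints that were in effect where the
    # caller's code was compiled.
    sub in_effect {
        my $level    = shift || 0;
        my $hinthash = (caller($level))[10];
        return $hinthash->{'Hypothetical::Pragma/enabled'} ? 1 : 0;
    }
}

# The package lives in this file, so tell "use" it is already loaded.
BEGIN { $INC{'Hypothetical/Pragma.pm'} = __FILE__ }

my $outside = Hypothetical::Pragma::in_effect();
my ($inside, $in_eval);
{
    use Hypothetical::Pragma;
    $inside  = Hypothetical::Pragma::in_effect();
    # A string eval is compiled with the hints of the enclosing scope.
    $in_eval = eval 'Hypothetical::Pragma::in_effect()';
}

print "outside=$outside inside=$inside in_eval=$in_eval\n";
```

A pragma that only needs an on/off switch could use a bit in $^H instead, as "use bytes" does, provided a hint bit is free.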
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"