
Re: [perl #56444] (5.12 blocker) delayed interpolation of \N{...}charnames escapes in regexes in perl 5.9.x and later causes breakage

From:
karl williamson
Date:
January 27, 2010 19:25
Message ID:
4B610347.2040904@khwilliamson.com
demerphq wrote:
> 2010/1/28 Dave Mitchell <davem@iabyn.com>:
>> On Wed, Jan 27, 2010 at 08:39:55PM +0100, demerphq wrote:
>>> If you are comfortable with that as far as i recall its actually
>>> pretty simple. There is a list of characters that are to be expanded
>>> to literals or passed through. Anyway, we can/should be able to find
>>> the commit that introduced this and more or less reverse it. In
>>> theory. ;-)
>> Except that the list of characters no longer exists, and the \N (not
>> newline) feature has also been added in the meantime, which complicates
>> matters: \N, \N{0,...} have to be passed through, \N{U+...} \N{FOO} not
>> passed through. 

Actually, the \N{U+...} form should be passed through as well.
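
To make the cases concrete, here is a rough sketch of the split as I
read it from this thread: bare \N, \N{quantifier} and \N{U+...} get
passed through, while \N{FOO} gets resolved to a character. It is
written in Python purely for illustration and is not the toke.c code;
the function name and the two labels are made up for the example.

```python
import re

# Hypothetical labels for this sketch; they are not names used in toke.c.
PASS_THROUGH = "pass through"   # leave the escape for the regex compiler
RESOLVE_NAME = "resolve name"   # look the name up and substitute the character

# {n}, {n,} or {n,m}: the braces are a quantifier on the not-newline \N.
_QUANTIFIER = re.compile(r"\d+(,\d*)?\Z")
# U+ followed by hex digits: an explicit code point.
_CODE_POINT = re.compile(r"U\+[0-9A-Fa-f]+\Z")


def classify_backslash_N(text):
    """Classify one \\N escape as it appears in regex source text.

    `text` must start with a backslash followed by N. Anything after a
    bare \\N or after the closing brace is ignored.
    """
    if not text.startswith("\\N"):
        raise ValueError("not a \\N escape")
    rest = text[2:]
    if not rest.startswith("{"):
        return PASS_THROUGH          # bare \N means "not a newline"
    end = rest.find("}")
    if end == -1:
        raise ValueError("unterminated \\N{...}")
    body = rest[1:end]
    if _QUANTIFIER.match(body):
        return PASS_THROUGH          # \N{0,3} is \N with a quantifier
    if _CODE_POINT.match(body):
        return PASS_THROUGH          # \N{U+263A}, per the note above
    return RESOLVE_NAME              # \N{FOO} is a named character
```

For example, classify_backslash_N("\\N{3}") is treated as a quantified
not-newline, and classify_backslash_N("\\N{GREEK SMALL LETTER ALPHA}")
as a name to resolve. Whether the real code should treat a malformed
body (say, an empty \N{}) as a name or an error is not something this
sketch tries to settle.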

>> Just ripping out the old commit is no longer an option.
>> (I started looking at this bug a week ago, but got side-tracked; but not
>> before I'd looked enough to realise that it was all very horrible.)
> 
> Compared to the other choices?
> 
>

I coded the \N stuff in toke.c, including the not-newline changes (which 
aren't too hard).  But I haven't done any testing, because I gave up 
after a little while of trying to find the commit to partially reverse, 
the one that governs whether this code is called for regexes.  I don't 
understand the structure of the code well enough to do that quickly; 
I'm sure someone out there can find it quickly.

My guess is that this also makes a bunch of code in regcomp.c 
superfluous, as I bet toke.c originally didn't pass through things like 
\x, encoding changes, etc.  In fact, apparently the comments didn't get 
changed when the commit in question was made (if they had, I probably 
could have found it myself), and they still say what used to get 
processed by toke.c.  I presume there is code in regcomp.c (or somewhere 
else) that should be reverted to deal with interpolated variables.
