demerphq wrote:
> 2010/1/28 Dave Mitchell <davem@iabyn.com>:
>> On Wed, Jan 27, 2010 at 08:39:55PM +0100, demerphq wrote:
>>> If you are comfortable with that, as far as I recall it's actually
>>> pretty simple. There is a list of characters that are to be expanded
>>> to literals or passed through. Anyway, we can/should be able to find
>>> the commit that introduced this and more or less reverse it. In
>>> theory. ;-)
>> Except that the list of characters no longer exists, and the \N (not
>> newline) feature has also been added in the meantime, which complicates
>> matters: \N, \N{0,...} have to be passed through, \N{U+...} \N{FOO} not
>> passed through.
Actually, the \N{U+...} form should be passed through as well.
>> Just ripping out the old commit is no longer an option.
>> (I started looking at this bug a week ago, but got side-tracked; but not
>> before I'd looked enough to realise that it was all very horrible.)
>
> Compared to the other choices?
>
>
I coded the \N stuff in toke.c, including the not-newline changes (which
aren't too hard). But I haven't done any testing, because I gave up
after a short while trying to find the commit, the one to somewhat
reverse, that calls this code for regexes. I don't understand the
structure of the code well enough to do that quickly; I'm sure someone
out there can find it quickly.
My guess is that this also makes a bunch of code in regcomp.c
superfluous, as I bet toke.c didn't pass through things like \x,
changing the encoding, etc. In fact, the comments apparently weren't
updated when the commit in question went in (if they had been, I
probably could have found it myself), and they still describe what
toke.c used to process.
I presume there is code in regcomp.c (or somewhere else) that should be
reverted to deal with interpolated variables.
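
To make the \N pass-through rules from this thread concrete, here is a
rough sketch in C. It is not the toke.c code: the names n_action,
looks_like_quantifier and classify_backslash_N are made up for
illustration, and treating an unterminated brace as pass-through is my
assumption. It only encodes the rules as stated above: bare \N, the
\N{0,...} quantifier forms and \N{U+...} are passed through, while
\N{FOO} is expanded.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical names for illustration; none of these exist in toke.c. */
typedef enum {
    N_PASS_THROUGH,  /* leave the text alone for the regex compiler */
    N_EXPAND         /* tokenizer replaces it with the named character */
} n_action;

/* Does the text between the braces look like a quantifier: n  n,  n,m ? */
static int looks_like_quantifier(const char *p, const char *end)
{
    if (p == end || !isdigit((unsigned char)*p))
        return 0;
    while (p < end && isdigit((unsigned char)*p))
        p++;
    if (p < end && *p == ',') {
        p++;
        while (p < end && isdigit((unsigned char)*p))
            p++;
    }
    return p == end;
}

/* s points at the character just after "\N". */
n_action classify_backslash_N(const char *s)
{
    const char *close;

    assert(s != NULL);
    if (*s != '{')
        return N_PASS_THROUGH;      /* bare \N: not-newline */
    close = strchr(s, '}');
    if (close == NULL)
        return N_PASS_THROUGH;      /* assumption: let regcomp complain */
    s++;                            /* step past the '{' */
    if (looks_like_quantifier(s, close))
        return N_PASS_THROUGH;      /* \N{0,...}: quantified not-newline */
    if (close - s >= 2 && s[0] == 'U' && s[1] == '+')
        return N_PASS_THROUGH;      /* \N{U+...}: code point form */
    return N_EXPAND;                /* \N{FOO}: named character */
}
```

The classification looks only at the text up to the first closing
brace, so whatever follows in the pattern does not affect the result.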