demerphq wrote:
> 2008/12/27 karl williamson <public@khwilliamson.com>:
>> Rafael Garcia-Suarez wrote:
>>> 2008/12/26 karl williamson <public@khwilliamson.com>:
>>>> Attached is a patch for this. The problem is that in this subroutine p
>>>> may or may not be in utf8, and the flag do_utf8 indicates which. The
>>>> code calls various functions passing both p and do_utf8, and these
>>>> work. But to_utf8_fold() expects its argument to always be in utf8,
>>>> and this caused the problem. Also, the av's are stored as utf8, so the
>>>> memEQ would not work correctly on a non-utf8 p even though no error
>>>> message would be generated.
>>>>
>>>> The patch creates a copy of p in utf8, if necessary, and uses that even
>>>> when calling the functions that accept the do_utf8 flag, as they create
>>>> temporaries, convert to utf8, and then throw the conversion away. It is
>>>> more efficient to do the conversion once in the caller and pass that to
>>>> each routine.
>>>>
>>>> I'm not sure what to do about a test case.
>>>>
>>>> "\xc0" =~ qr/[\x{1f4}\xc0]/;
>>>>
>>>> doesn't show the problem, but
>>>>
>>>> use Test::More tests => 1;
>>>> like("\xc0", qr/[\x{1f4}\xc0]/i, 'get malformed utf8');
>>>>
>>>> does. And it looks like none of the existing re tests use Test.
>>> Then there is probably a problem in Test::More itself ?
>>>
>>> (Is there a bug number for this?)
>>>
>>> I've tested the patch, but I would feel more comfortable with a test
>>> case. (or with a comment from Yves)
>>>
>>>
>> No bug number. Should I create one?
>>
>> I suspect that it isn't a bug in Test::More, but that it calls things
>> somehow differently, which is kind of scary in itself that it perturbs the
>> environment. Maybe a certain class of tests shouldn't be done using Test. I
>> don't know.
>>
>> If we don't hear from Yves in the meantime, I'll look tomorrow to see how to
>> reproduce it without using Test.
>
> Is this a problem with casefolding unicode characters in a charclass?
>
> I have to admit that on reading this I don't have much to add. And my
> windows box is offline these days due to a hardware failure so if I'm
> going to debug it I'll have to learn gdb finally. Which could take a
> while :-)
>
> Yves
>

This is turning into several threads. I'll separate out the casefolding charclass into a separate one.

Duh! The reason I didn't get a malformed message without Test::More is because I forgot to turn on warnings.

Attached is another patch, to add a test case.
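
For anyone following along, here is a rough, self-contained sketch of the idea behind the original patch: convert a non-utf8 p to utf8 once in the caller, then compare against data that is stored as utf8. This is not the patch itself and not the core's API; latin1_to_utf8() and matches_stored() are hypothetical names made up for illustration, and the sketch assumes the non-utf8 bytes are Latin-1.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not Perl's API): encode a Latin-1 buffer as UTF-8.
 * Returns a malloc'd buffer the caller must free, or NULL on failure.
 * *out_len receives the encoded length in bytes. */
static unsigned char *latin1_to_utf8(const unsigned char *s, size_t len,
                                     size_t *out_len)
{
    unsigned char *buf, *d;
    size_t i;

    assert(out_len != NULL);
    buf = malloc(2 * len + 1);   /* each Latin-1 byte needs at most 2 bytes */
    if (!buf)
        return NULL;
    d = buf;
    for (i = 0; i < len; i++) {
        if (s[i] < 0x80) {
            *d++ = s[i];                                /* ASCII: unchanged */
        } else {
            *d++ = (unsigned char)(0xC0 | (s[i] >> 6));   /* lead byte */
            *d++ = (unsigned char)(0x80 | (s[i] & 0x3F)); /* continuation */
        }
    }
    *d = '\0';
    *out_len = (size_t)(d - buf);
    return buf;
}

/* Hypothetical stand-in for the comparison step: `stored` is always UTF-8,
 * while `p` is UTF-8 only when is_utf8 is true. The conversion happens once
 * here, and the UTF-8 form is what gets compared. Returns 1 on a match. */
static int matches_stored(const unsigned char *p, size_t len, int is_utf8,
                          const unsigned char *stored, size_t stored_len)
{
    unsigned char *tmp = NULL;
    const unsigned char *u = p;
    size_t ulen = len;
    int ok;

    if (!is_utf8) {
        tmp = latin1_to_utf8(p, len, &ulen);
        if (!tmp)
            return 0;
        u = tmp;
    }
    ok = (ulen == stored_len && memcmp(u, stored, stored_len) == 0);
    free(tmp);                   /* free(NULL) is a no-op */
    return ok;
}
```

With the single byte 0xC0 as a non-utf8 p and the stored utf8 bytes 0xC3 0x80, a raw byte comparison fails silently, whereas matches_stored(p, 1, 0, stored, 2) reports a match. That is the silent memEQ mismatch described in the first message.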