2008/12/26 karl williamson <public@khwilliamson.com>:
> Attached is a patch for this.  The problem is that in this subroutine p may
> or may not be in utf8, and the flag do_utf8 indicates which.  The code
> calls various functions passing both p and do_utf8, and these work.  But
> to_utf8_fold() expects its argument to always be in utf8, and this caused
> the problem.  Also the av's are stored as utf8, so the memEQ would not work
> correctly on a non-utf8 p even though no error message would be generated.
>
> The patch creates a copy of p in utf8, if necessary, and uses that even when
> calling the functions that accept the do_utf8 flag, as they create
> temporaries, convert to utf8, and then throw the conversion away.  It is
> more efficient to do the conversion once in the caller and pass that to each
> routine.
>
> I'm not sure what to do about a test case.
>
> "\xc0" =~ qr/[\x{1f4}\xc0]/;
>
> doesn't show the problem, but
>
> use Test::More tests => 1;
> like("\xc0", qr/[\x{1f4}\xc0]/i, 'get malformed utf8');
>
> does.  And it looks like none of the existing re tests use Test.

Then there is probably a problem in Test::More itself? (Is there a bug
number for this?)

I've tested the patch, but I would feel more comfortable with a test case
(or with a comment from Yves).
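
For readers following the quoted description, here is a minimal standalone C sketch of the "convert once in the caller" idea. It is not the patch and does not use perl's internal API: latin1_to_utf8() and matches_stored() are hypothetical names, the 64-byte temporary is an arbitrary choice, and a non-utf8 p is assumed to be Latin-1. It only illustrates why a byte comparison against data stored as utf8 fails for a raw "\xc0" and succeeds after conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a perl core function): encode a Latin-1 buffer
 * as UTF-8. `dst` must have room for 2 * len bytes. Returns bytes written. */
static size_t latin1_to_utf8(const unsigned char *src, size_t len,
                             unsigned char *dst)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (src[i] < 0x80) {
            dst[n++] = src[i];                      /* ASCII: unchanged */
        } else {
            dst[n++] = (unsigned char)(0xC0 | (src[i] >> 6));
            dst[n++] = (unsigned char)(0x80 | (src[i] & 0x3F));
        }
    }
    return n;
}

/* Hypothetical matcher: `stored` is always UTF-8 (like the av entries in
 * the quoted description); `p` is UTF-8 only when `do_utf8` is nonzero.
 * The conversion happens once here, and the UTF-8 copy is what gets
 * compared (and what would be passed on to any further routines). */
static int matches_stored(const unsigned char *p, size_t len, int do_utf8,
                          const unsigned char *stored, size_t stored_len)
{
    unsigned char tmp[64];          /* arbitrary size for this sketch */
    const unsigned char *u = p;
    size_t ulen = len;

    if (!do_utf8) {
        assert(len * 2 <= sizeof tmp);
        ulen = latin1_to_utf8(p, len, tmp);
        u = tmp;
    }
    return ulen == stored_len && memcmp(u, stored, ulen) == 0;
}
```

With the single byte 0xC0 as p and do_utf8 false, the utf8 copy is the two bytes 0xC3 0x80, which is what a utf8-stored U+00C0 looks like, so the comparison matches. Comparing the raw byte directly against the stored bytes would not.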