perl.perl5.porters | Postings from December 2008

Re: PATCH Fix malformed utf8 in regexec.c only shows with Test::More

karl williamson
December 26, 2008 15:23
Rafael Garcia-Suarez wrote:
> 2008/12/26 karl williamson:
>> Attached is a patch for this.  The problem is that in this subroutine p may
>> or may not be in utf8, and the flag do_utf8 indicates which.  The code
>> calls various functions passing both p and do_utf8, and these work.  But
>> to_utf8_fold() expects its argument to always be in utf8, and this caused
>> the problem.  Also, the av's are stored as utf8, so the memEQ would not work
>> correctly on a non-utf8 p even though no error message would be generated.
>> The patch creates a copy of p in utf8, if necessary, and uses that even when
>> calling the functions that accept the do_utf8 flag, as they create
>> temporaries, convert to utf8, and then throw the conversion away.  It is
>> more efficient to do the conversion once in the caller and pass that to each
>> routine.
>> I'm not sure what to do about a test case.
>> "\xc0" =~ qr/[\x{1f4}\xc0]/;
>> doesn't show the problem, but
>> use Test::More tests => 1;
>> like("\xc0", qr/[\x{1f4}\xc0]/i, 'get malformed utf8');
>> does.  And it looks like none of the existing re tests use Test.
> Then there is probably a problem in Test::More itself ?
> (Is there a bug number for this?)
> I've tested the patch, but I would feel more comfortable with a test
> case. (or with a comment from Yves)
No bug number.  Should I create one?

I suspect that it isn't a bug in Test::More, but that it calls things
somehow differently.  That it perturbs the environment is kind of scary
in itself.  Maybe a certain class of tests shouldn't be done using
Test.  I don't know.

If we don't hear from Yves in the meantime, I'll look tomorrow to see 
how to reproduce it without using Test.
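
To illustrate the approach described in the quoted patch summary (convert p to utf8 once in the caller, then compare against the utf8-stored entries), here is a minimal standalone sketch in C.  It is not the code from the patch and does not use Perl's internal API: latin1_to_utf8() and matches_stored() are hypothetical names invented for this example, and plain memcmp() stands in for memEQ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not Perl's API): encode a Latin-1 byte string as
 * UTF-8.  Returns a malloc'd, NUL-terminated buffer that the caller must
 * free, and stores the encoded length in *out_len.  Returns NULL if the
 * allocation fails. */
unsigned char *latin1_to_utf8(const unsigned char *s, size_t len,
                              size_t *out_len)
{
    /* Each Latin-1 byte expands to at most two UTF-8 bytes. */
    unsigned char *buf = malloc(len * 2 + 1);
    size_t n = 0;

    if (buf == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        if (s[i] < 0x80) {
            buf[n++] = s[i];                                  /* ASCII: unchanged */
        } else {
            buf[n++] = (unsigned char)(0xC0 | (s[i] >> 6));   /* lead byte */
            buf[n++] = (unsigned char)(0x80 | (s[i] & 0x3F)); /* continuation */
        }
    }
    buf[n] = '\0';
    *out_len = n;
    return buf;
}

/* Hypothetical comparison of p against an entry that is always stored as
 * UTF-8.  If p is not already UTF-8 it is converted once, here in the
 * caller, and the converted copy is what gets compared.
 * Returns 1 on a match, 0 on no match, -1 on allocation failure. */
int matches_stored(const unsigned char *p, size_t p_len, int p_is_utf8,
                   const unsigned char *stored, size_t stored_len)
{
    const unsigned char *cmp = p;
    size_t cmp_len = p_len;
    unsigned char *tmp = NULL;
    int equal;

    if (!p_is_utf8) {
        tmp = latin1_to_utf8(p, p_len, &cmp_len);
        if (tmp == NULL)
            return -1;
        cmp = tmp;
    }
    equal = (cmp_len == stored_len && memcmp(cmp, stored, stored_len) == 0);
    free(tmp);  /* free(NULL) is a no-op */
    return equal;
}
```

With the byte 0xC0 from the test case, a raw comparison against the utf8 form of U+00C0 (the two bytes 0xC3 0x80) cannot match, while matches_stored() with p_is_utf8 set to 0 converts first and does match.  In real code the converted copy would be reused for every routine that needs utf8 input rather than rebuilt per call.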
