Re: PATCH Fix malformed utf8 in regexec.c

From: Rafael Garcia-Suarez
Date: December 26, 2008 14:42
Subject: Re: PATCH Fix malformed utf8 in regexec.c
Message ID: b77c1dce0812261442s5ebedf21k1a413d4762e151cc@mail.gmail.com

2008/12/26 karl williamson <public@khwilliamson.com>:
> Attached is a patch for this.  The problem is that in this subroutine p may
> or may not be in utf8, and the flag do_utf8 indicates which.  The code
> calls various functions passing both p and do_utf8, and these work.  But
> to_utf8_fold() expects its argument to always be in utf8, and this caused
> the problem.  Also the av's are stored as utf8, so the memEQ would not work
> correctly on a non-utf8 p even though no error message would be generated.
>
> The patch creates a copy of p in utf8, if necessary, and uses that even when
> calling the functions that accept the do_utf8 flag, as they create
> temporaries, convert to utf8, and then throw the conversion away.  It is
> more efficient to do the conversion once in the caller and pass that to each
> routine.
>
> I'm not sure what to do about a test case.
>
> "\xc0" =~ qr/[\x{1f4}\xc0]/;
>
> doesn't show the problem, but
>
> use Test::More tests => 1;
> like("\xc0", qr/[\x{1f4}\xc0]/i, 'get malformed utf8');
>
> does.  And it looks like none of the existing re tests use Test.

Then there is probably a problem in Test::More itself?

(Is there a bug number for this?)

I've tested the patch, but I would feel more comfortable with a test
case (or with a comment from Yves).
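
For illustration only, here is a minimal sketch of the "convert once in the
caller" idea described in the quoted message. It is not the patch itself and
does not use Perl's internal API: the names latin1_to_utf8,
equals_stored_utf8 and MAX_LEN are hypothetical, and the sketch assumes the
non-utf8 form of p is Latin-1 bytes. It shows why a byte comparison against
a utf8-stored string only works after p has been converted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN 8  /* hypothetical cap on the input length for this sketch */

/* Hypothetical helper (not Perl's API): encode `len` Latin-1 bytes from
 * `src` as UTF-8 into `dst`, which must have room for 2 * len bytes.
 * Returns the number of bytes written. */
size_t latin1_to_utf8(const unsigned char *src, size_t len,
                      unsigned char *dst)
{
    size_t n = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            dst[n++] = c;                                  /* ASCII: 1 byte */
        } else {
            dst[n++] = (unsigned char)(0xC0 | (c >> 6));   /* lead byte */
            dst[n++] = (unsigned char)(0x80 | (c & 0x3F)); /* continuation */
        }
    }
    return n;
}

/* Compare p against a string that is always stored as UTF-8.
 * p_is_utf8 plays the role of the do_utf8 flag: when it is 0, p is
 * converted once here and the UTF-8 copy is used for the comparison. */
int equals_stored_utf8(const unsigned char *p, size_t len, int p_is_utf8,
                       const unsigned char *stored, size_t stored_len)
{
    unsigned char buf[2 * MAX_LEN];
    const unsigned char *u = p;
    size_t ulen = len;

    if (!p_is_utf8) {
        assert(len <= MAX_LEN);
        ulen = latin1_to_utf8(p, len, buf);
        u = buf;
    }
    return ulen == stored_len && memcmp(u, stored, ulen) == 0;
}
```

With the single byte 0xC0 as a non-utf8 p and the stored form 0xC3 0x80, a
direct byte comparison fails, while the comparison made after the one-time
conversion succeeds.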
