
Re: [PATCH] Re: [perl #70998] Warning: Malformed UTF-8 character in substitution operation

From: demerphq
Date: November 2, 2010 03:48
Subject: Re: [PATCH] Re: [perl #70998] Warning: Malformed UTF-8 character in substitution operation
Message ID: AANLkTin0mehyBp=yzEehoOBqVwQ3_iM1FkL3+AT8O2Ak@mail.gmail.com

2009/12/27 Father Chrysostomos <sprout@cpan.org>:
>
> On Dec 21, 2009, at 2:46 AM, demerphq wrote:
>
>> 2009/12/20 Father Chrysostomos <sprout@cpan.org>:
>>>
>>> This bug (which was caused by change 28373/07be1b8) can be reduced to:
>>>
>>> qq{\x{30ab}} =~ /\xab|\xa9/;
>>>
>>> Malformed UTF-8 character (unexpected continuation byte 0xab, with no
>>> preceding start byte) in pattern match (m//) at
>>> /Users/sprout/Perl/5.8.7-regressions/70998.d/70998 copy line 2.
>>>
>>> 30ab is e3 82 ab in UTF-8.
>>>
>>> The trie optimisation in S_find_byclass (added by change 28373/07be1b8)
>>> searches in e3 82 ab for a char matching [\xab\xa9], and sets 2 (\xab) as
>>> its starting position. The code it passes control to then stumbles across
>>> this ‘lone’ \xab.
>>>
>>> The attached patch fixes this.
>>
>>
>> Is this a problem/tested in blead?
>>
>> I have a feeling that when I disabled certain trie functionality I
>> "fixed" this one.
>>
>> Before this gets applied I'd like to know more.
>>
>
> I hope this is a sufficient answer:
>
> $ perl5.11.3 -e 'qq{\x{30ab}} =~ /\xab|\xa9/'
> Malformed UTF-8 character (unexpected continuation byte 0xab, with no
> preceding start byte) in pattern match (m//) at -e line 1.
>
> Whether mine is the best way to fix this I cannot say.
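
To make the byte-level analysis quoted above concrete, here is a minimal
sketch. It is only an illustration and does not touch the regex engine's
internals: it encodes U+30AB to UTF-8 and shows that a search which looks
at raw bytes, ignoring character boundaries, lands on offset 2, the
continuation byte. The variable names are made up for this example.

```perl
use strict;
use warnings;

# U+30AB (KATAKANA LETTER KA) as a character string.
my $char = "\x{30ab}";

# Work on a copy and encode it in place to UTF-8 octets.
my $octets = $char;
utf8::encode($octets);

# Show the encoded bytes.
my $hex = join ' ', map { sprintf('%02x', ord($_)) } split //, $octets;
print "UTF-8 bytes: $hex\n";                  # e3 82 ab

# A byte-wise scan for 0xab or 0xa9 hits offset 2, which is a
# continuation byte and not the start of a character.
my $offset = $octets =~ /[\xab\xa9]/ ? $-[0] : -1;
print "first byte-level hit at offset $offset\n";   # 2

# A character-wise match on the original string finds nothing,
# because U+30AB is neither U+00AB nor U+00A9.
my $matched = $char =~ /[\x{ab}\x{a9}]/ ? 1 : 0;
print "character-level match: $matched\n";    # 0
```

If a start-position search works like the byte-wise scan, any code that
then tries to decode a character at offset 2 sees a lone \xab, which is
the condition the warning describes.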

Thanks. I decided to fix it a different way, but I applied your test
code. Patches pushed to blead:

commit d085b4908fc15b9d48cec72b473eec9d0870015b
Author: Yves Orton <demerphq@gmail.com>
Date:   Tue Nov 2 11:29:18 2010 +0100

    Fix RT-70998: qq{\x{30ab}} =~ /\xab|\xa9/ produces warnings

commit aca53033b83659a859fd8408e90d33b842414c39
Author: Father Chrysostomos <sprout@cpan.org>
Date:   Tue Nov 2 11:28:33 2010 +0100

    Add test for rt-70998: qq{\x{30ab}} =~ /\xab|\xa9/ produces warnings
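
For anyone wanting to check a given perl by hand, a regression check for
this case could look roughly like the sketch below. This is not the test
from the commit above, only an assumption about one reasonable shape for
it: it traps warnings with a local $SIG{__WARN__} handler and expects
none from the match.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Collect any warnings raised while the match runs.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    # The reduced case from the ticket: a UTF-8 string whose last
    # byte (0xab) equals one of the alternatives in the pattern.
    my $matched = qq{\x{30ab}} =~ /\xab|\xa9/;
    ok(!$matched, 'U+30AB does not match /\xab|\xa9/');
}

is(scalar @warnings, 0, 'no "Malformed UTF-8" warning from the match')
    or diag(@warnings);
```

On a perl that still has the bug, the second test should fail and diag()
should print the "Malformed UTF-8 character" message.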

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
