
Re: [perl #57234] Malformed UTF-8 after Encode::decode (utf8, and regex $2, $3

From: Ben Bullock
Date: September 19, 2010 21:20
Subject: Re: [perl #57234] Malformed UTF-8 after Encode::decode (utf8, and regex $2, $3
Message ID: AANLkTi=AySsqXJDdXTPxxuv0eV5z2QzcAm1pbtvGG_vs@mail.gmail.com

I'm pretty sure I filed a very much simpler example of this bug after
that one (it was more than two years ago).

I don't think there was anything wrong with the utf8 etc.; that is
just smoke-blowing.

On 20 September 2010 05:48, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:
> On Tue Jul 29 19:46:08 2008, BKB wrote:
>> This is a very much simplified version of the script which tripped the
>> bug (five lines). I've also simplified the regex drastically until it
>> trips the bug. Shortening the regex from this makes it print "OK" but as
>> it stands the "Malformed UTF-8 character (fatal)" message appears.
>
> Thank you for your report.
>
> You have ‘use utf8’ in your script, which signals to perl that your
> source code is in UTF-8.
>
> But then you have a string containing the octets 95 B6, which is not
> valid UTF-8. This results in an invalid scalar, so perl croaks. This
> behaviour is correct.
>
> You do not need ‘use utf8’ if you are just *using* Unicode strings.
>
>
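
For readers following the disagreement, here is a minimal sketch of the one claim in the quoted reply that can be checked mechanically: whether the octets 95 B6, taken on their own, decode as strict UTF-8. The helper name is_valid_utf8 is hypothetical and not from the thread. The sketch does not reproduce the reported bug, and it says nothing about whether those two octets stood alone in the original script or were part of a longer sequence.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not from the thread): returns 1 if $octets
# decode as strict UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($octets) = @_;
    my $copy = $octets;    # decode() may modify its input when a CHECK value is given
    # With FB_CROAK, decode() dies on malformed input; eval traps that.
    return eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

# 0x95 is a continuation byte, so it cannot begin a UTF-8 sequence.
printf "95 B6:    %s\n", is_valid_utf8("\x95\xB6")     ? 'valid' : 'malformed';

# For comparison, E3 81 82 is the UTF-8 encoding of U+3042.
printf "E3 81 82: %s\n", is_valid_utf8("\xE3\x81\x82") ? 'valid' : 'malformed';
```

The script deliberately omits 'use utf8' and writes the octets as \x escapes, so the result does not depend on how the source file itself is encoded.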
