perl.perl5.porters | Postings from September 2010

Re: [perl #57234] Malformed UTF-8 after Encode::decode (utf8, and regex $2, $3

Ben Bullock
September 19, 2010 21:20
I'm pretty sure I filed a very much simpler example of this bug after
that one (it was more than two years ago).

I don't think there was anything wrong with the utf8 etc.; that is
just smoke-blowing.

On 20 September 2010 05:48, Father Chrysostomos via RT wrote:
> On Tue Jul 29 19:46:08 2008, BKB wrote:
>> This is a very much simplified version of the script which tripped the
>> bug (five lines). I've also simplified the regex drastically until it
>> trips the bug. Shortening the regex from this makes it print "OK" but as
>> it stands the "Malformed UTF-8 character (fatal)" message appears.
> Thank you for your report.
> You have ‘use utf8’ in your script, which signals to perl that your
> source code is in UTF-8.
> But then you have a string containing the octets 95 B6, which is not
> valid UTF-8. This results in an invalid scalar, so perl croaks. This
> behaviour is correct.
> You do not need ‘use utf8’ if you are just *using* Unicode strings.
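
The claim quoted above, that the octets 95 B6 are not valid UTF-8 on their own, can be checked independently of Perl. The following is only a minimal sketch in Python; the helper name is_valid_utf8 is made up for illustration, and the check says nothing about whether those octets were really the cause of the reported error, which is the point under dispute in this thread.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# 0x95 and 0xB6 both have the bit pattern 10xxxxxx, so each is a
# continuation byte; neither can begin a UTF-8 sequence.
suspect = bytes([0x95, 0xB6])
print(is_valid_utf8(suspect))

# For comparison: a correctly encoded two-byte character (U+00E9).
print(is_valid_utf8(b"\xc3\xa9"))
```

The first call reports the pair as invalid because a continuation byte appears with no lead byte before it; the second shows a well-formed two-byte sequence for contrast.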
