
[perl #107326] perl's unicode conversion fails when iconv succeeds

From: Father Chrysostomos via RT
Date: December 30, 2011 11:00
Subject: [perl #107326] perl's unicode conversion fails when iconv succeeds
Message ID: rt-3.6.HEAD-14510-1325271623-1300.107326-15-0@perl.org

On Fri Dec 30 10:41:46 2011, LAWalsh wrote:
> 
> This is a bug report for perl from perl-diddler@tlinx.org,
> generated with the help of perlbug 1.39 running under perl 5.12.3.
> 
> 
> -----------------------------------------------------------------
> [Please describe your issue here]
> 
> Was looking at ways to do upper/lower case compare, and bumped into
> piconv as being a 'drop in replacement for "iconv"'.  So I decided to try
> it thinking it would be a 'hoot' if it was faster.
> 
> Rather than faster, it choked at the beginning of my 98M test file
> (i.e. I truncated the file to the first few lines, 672 bytes), which
> reproduces the problem just fine .. Très sad...
> 

You’re right:

$ piconv5.15.6 -f utf16 -t utf-8 /Users/sprout/Downloads/test.in
UTF-16:Unrecognised BOM d at /usr/local/lib/perl5/5.15.6/darwin-thread-multi-2level/Encode.pm line 196, <$ifh> line 2.

The file begins with <FF><FE>.

If I specify utf-16le explicitly, it converts the first line correctly,
but the output quickly switches to Chinese characters, which means the
decoding is off by one byte.  If I specify utf-16be explicitly, even the
first line comes out as Chinese characters.
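
To illustrate the symptoms described above, here is a minimal sketch in Python (chosen only for illustration; it is not how piconv or Encode work internally). The sample text and the helper name detect_utf16_byte_order are assumptions, since the actual test file is not reproduced in this message. It shows that a leading <FF><FE> marks little-endian UTF-16, and that decoding ASCII-range text with the wrong byte order, or starting one byte late, yields code points in the CJK ranges.

```python
import codecs

# Hypothetical sample text; the real test.in is not shown in the report.
text = "Hello\r\nWorld\r\n"
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")  # begins with FF FE


def detect_utf16_byte_order(raw: bytes) -> str:
    """Return the codec name implied by a leading UTF-16 BOM."""
    if raw.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"
    raise ValueError("no UTF-16 BOM found")


order = detect_utf16_byte_order(data)
body = data[2:]  # payload after the two BOM bytes

# Decoding with the byte order named by the BOM round-trips the text.
correct = body.decode(order)

# Wrong byte order: every ASCII character c becomes code point c * 256.
wrong_endian = body.decode("utf-16-be")

# Off by one byte: skip one leading byte, trim one trailing byte so the
# length stays even, then decode as little-endian.
shifted = body[1:-1].decode("utf-16-le")

print(order)
print(correct == text)
print([hex(ord(c)) for c in wrong_endian[:5]])
print([hex(ord(c)) for c in shifted[:5]])
```

In both failure cases the low byte of each 16-bit unit is zero and the high byte is an ASCII value, which is why ordinary Latin text lands largely in the CJK blocks.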

This is part of the Encode distribution, for which CPAN is upstream, so
I’m forwarding this to the CPAN ticket.

-- 

Father Chrysostomos


---
via perlbug:  queue: perl5 status: new
https://rt.perl.org:443/rt3/Ticket/Display.html?id=107326

