perl.perl5.porters | Postings from October 2003

[perl #24077] 5.8.1 Unicode: CR added to \n (in Windows) is 1-byte despite UCS-2LE

From: Phill Wolf
Date: October 1, 2003 21:02
Subject: [perl #24077] 5.8.1 Unicode: CR added to \n (in Windows) is 1-byte despite UCS-2LE
Message ID: rt-24077-65501.16.7253294538655@rt.perl.org
# New Ticket Created by  Phill Wolf 
# Please include the string:  [perl #24077]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=24077 >


When writing a "Unicode" (little-endian UCS-2) text file on Windows, Perl
corrupts the byte stream: the carriage return it adds before each \n is
written as 1 byte rather than 2.

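 require v5.8.1;
 use charnames ':full';   # enables \N{BYTE ORDER MARK}
 # Write through the UCS-2LE encoding layer; on Windows the default
 # text-mode CRLF translation is also in effect.
 open(FH, ">:encoding(UCS-2LE)", "malformed.txt")
     or die "Cannot open malformed.txt: $!";
 print FH "\N{BYTE ORDER MARK}";
 print FH "a\n";
 print FH "b\n";
 close(FH);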

Debug shows the following bytes in the file:

 FE FF 61 00 0D 0A 00 62-00 0D 0A 00   ..a....b....

Note that neither 0D is followed by a 00 byte; in UCS-2LE a carriage
return should be the two bytes 0D 00.
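
For comparison, here is a minimal sketch of the byte sequence one would
expect for the two text lines (the BOM is left out). It assumes the CRLF
translation is applied to characters first and the result is then encoded
as UCS-2LE. The helper name expected_ucs2le_hex is hypothetical and not
part of the report; it uses only the core Encode module, not a PerlIO layer.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (an assumption for illustration): turn each "\n"
# into CR LF at the character level, encode the whole string as UCS-2LE,
# and return the bytes as space-separated hex.
sub expected_ucs2le_hex {
    my ($text) = @_;
    (my $crlf = $text) =~ s/\n/\r\n/g;       # newline translation on characters
    my $bytes = encode('UCS-2LE', $crlf);    # every character becomes 2 bytes
    return join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;
}

print expected_ucs2le_hex("a\nb\n"), "\n";
# prints: 61 00 0D 00 0A 00 62 00 0D 00 0A 00
```

In this output every 0D is followed by 00, which is what the dump above
is missing.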




