Re: Using slurp to read in a utf16 file
From: Joseph Brenner
Date: April 27, 2020 01:05
Subject: Re: Using slurp to read in a utf16 file
Message ID: CAFfgvXUDB72sZf0GoneukE=8iupkYe=q++EPgP+iG0bPRAmO=w@mail.gmail.com

To expand on the point a bit, doing exactly the same spurt/slurp works
with "utf8", but doing it with "utf16" fails to read the text back in:
{
    my $unichar_str =   # ሀⶀ䷼ꪪⲤⲎ
        "\x[1200]\x[2D80]\x[4DFC]\x[AAAA]\x[2CA4]\x[2C8E]";
    my $file = "/tmp/stuff_in_utf8.txt";
    my $fh = $file.IO.open( :w, :enc("utf8") );
    spurt $fh, $unichar_str;
    my $contents = slurp( $file, :enc("utf8") );
    my $huh = $contents.gist;
    say "contents: $contents";
    say "length: ", $contents.chars;
}
{
    my $unichar_str =   # ሀⶀ䷼ꪪⲤⲎ
        "\x[1200]\x[2D80]\x[4DFC]\x[AAAA]\x[2CA4]\x[2C8E]";
    my $file = "/tmp/stuff_in_utf16.txt";
    my $fh = $file.IO.open( :w, :enc("utf16") );
    spurt $fh, $unichar_str;
    my $contents = slurp( $file, :enc("utf16") );
    my $huh = $contents.gist;
    say "contents: $contents";          # contents:
    say "length: ", $contents.chars;    # 0
}
The output:
contents: ሀⶀ䷼ꪪⲤⲎ
length: 6
contents:
length: 0
The file definitely has something in it, though:
wc /tmp/stuff_in_utf16.txt
0 1 14 /tmp/stuff_in_utf16.txt
cat /tmp/stuff_in_utf16.txt
\377\376^@^R\200-\374M\252\252\244,\216,
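
As a sanity check on those bytes, here is a minimal shell sketch (not something from the original run) that writes the same 14 bytes to a scratch file and dumps them as hex. The scratch path from mktemp and the variable names are assumptions made for illustration. Reading the cat output as octal escapes and control characters, the bytes look like a little-endian byte-order mark (FF FE) followed by the six code points from the test string, two bytes each with the low byte first, which would mean the write side is fine and only the read side is failing:

```shell
# Scratch file; the path comes from mktemp, not from the original message.
tmp=$(mktemp)

# The 14 bytes from the cat output above, written as octal escapes:
# \377\376 is the BOM, ^@ is \000, ^R is \022, "-" is \055, "M" is \115, "," is \054.
printf '\377\376\000\022\200\055\374\115\252\252\244\054\216\054' > "$tmp"

# Byte count: 2 (BOM) + 6 code units * 2 bytes each.
bytes=$(wc -c < "$tmp" | tr -d ' ')

# Hex dump with whitespace stripped, so the value is easy to compare.
hex=$(od -An -tx1 "$tmp" | tr -d ' \n')

echo "bytes: $bytes"
echo "hex:   $hex"

rm -f "$tmp"
```

Swapping each byte pair after the leading fffe gives 1200, 2d80, 4dfc, aaaa, 2ca4, 2c8e, which matches the \x[...] escapes in the test string.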
On 4/26/20, Joseph Brenner <doomvox@gmail.com> wrote:
> Looking at the documentation for slurp, it looks as though there's a
> convenient "enc" option you can use if you're not reading utf8 files.
> So I thought this would work:
>
> my $contents = slurp $file, enc => "utf16";
>
> It's not doing what I expected... Raku acts like there's nothing in
> $contents.
>
> Here's the test code I've been using:
>
> # ሀⶀ䷼ꪪⲤⲎ
> my $unichar_str =
> "\x[1200]\x[2D80]\x[4DFC]\x[AAAA]\x[2CA4]\x[2C8E]";
>
> my $file = "/home/doom/tmp/stuff_in_utf16.txt";
> my $fh = $file.IO.open( :w, :enc("utf16") );
> spurt $fh, $unichar_str;
>
> # read entire file as utf16 Str
> my $contents = slurp $file, enc => "utf16";
> my $huh = $contents.gist;
> say "contents: $contents"; # contents:
> say $contents.elems; # 1
>