
[perl #119239] started out as doc clarification needed in 'eval...but...

From: Linda Walsh via RT
Date: August 16, 2013 07:13
Subject: [perl #119239] started out as doc clarification needed in 'eval...but...
Message ID: rt-3.6.HEAD-2552-1376637193-138.119239-15-0@perl.org

On Thu Aug 15 23:24:06 2013, ikegami@adaelis.com wrote:
> On Thu, Aug 15, 2013 at 9:38 PM, Linda Walsh via RT <
> perlbug-followup@perl.org> wrote:
> 
> > On Thu Aug 15 13:32:51 2013, ikegami@adaelis.com wrote:
> > > On Thu, Aug 15, 2013 at 3:43 PM, Linda Walsh via RT <
> > > perlbug-followup@perl.org> wrote:
> > >
> > > > Points) A) - "(to myself as much as anyone)" - use utf8 only
> > > > applies to source code not content strings, so my wonderings why
> > > > the Japanese INU(dog) YA(night) SHA(dividing point) came out "ok"
> > >
> > >
> > > $ perl -Mutf8 -E'say "\x{72AC}\x{591C}\x{53C9}"'
> > > Wide character in say at -e line 1.
> > > 犬夜叉
> >
> > ----
> > Urg...um... so the original example that I had that
> > printed "
> > $string="“犬夜叉”";
> >
> 
> You didn't use C<< use utf8; >> which means your code couldn't possibly
> have contained
> 
>     $string="“犬夜叉”";  # "\x{201C}\x{72AC}\x{591C}\x{53C9}\x{201D}"
> 
> It actually contains
> 
>     $string="�����";  # "\xE2\x80\x9C\xE7\x8A\xAC\xE5\xA4\x9C\xE5\x8F\x89\xE2\x80\x9D"
> 
> You might have saved the program in UTF-8, but you told Perl it was
> iso-8859-1 (by not using C<< use utf8; >>).
-----
You didn't read the rest of the note...

If that were true:
> Hmmm... no utf8 warnings on any of those.
> But if perl had taken it as a byte-string, the
>
> = e7 8a ac e5 a4 9c e5 8f 89 <<--- why wouldn't those
> have been taken as latin1 (as the source wasn't listed
> as utf8), and been "encoded" internally to
> their UTF-8 encodings?
> I.e. "E7" = 0xc3 0xa7, "8A" = 0xc2 0x8a;
> ----
----
If perl had taken that input as latin1, then why wouldn't I have seen
the wide character warning on output?
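
The two readings of those nine bytes can be compared side by side. What
follows is only a sketch: max_ord is a helper made up for this illustration,
and the bytes are written as escapes so the result does not depend on whether
"use utf8" is in effect. It suggests that a byte string never holds a
character above 0xFF, which is the condition for the "Wide character" warning,
and that the Latin-1 to UTF-8 re-encoding asked about would have to be
requested explicitly.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The nine bytes from the quoted question, as escapes (file stays ASCII).
my $bytes = "\xE7\x8A\xAC\xE5\xA4\x9C\xE5\x8F\x89";

# Hypothetical helper: highest code point found in a string.
sub max_ord {
    my ($s) = @_;
    my $max = 0;
    for my $o (map { ord } split //, $s) {
        $max = $o if $o > $max;
    }
    return $max;
}

# Reading 1: one character per byte (what a literal holds without "use utf8").
printf "bytes: length=%d max=0x%X\n", length($bytes), max_ord($bytes);

# Reading 2: decode the bytes as UTF-8 into three wide characters.
my $chars = decode('UTF-8', $bytes);
printf "chars: length=%d max=0x%X\n", length($chars), max_ord($chars);

# Treating each byte as a Latin-1 character and encoding it to UTF-8
# only happens when asked for; here e7 becomes c3 a7 and 8a becomes c2 8a.
my $twice = encode('UTF-8', $bytes);
printf "re-encoded: %s\n", join ' ', map { sprintf '%02x', ord } split //, $twice;
```

Only the second reading contains code points above 0xFF, so only it can
trigger the warning when printed to a handle that has no encoding layer.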

OTOH, if I add "use utf8", to this program:

#!/usr/bin/perl
use 5.6.16;
use utf8;
#use P;
use warnings;
my $name= [qw (犬夜叉)];
my $band={band => "Queensrÿche"};
printf "string=%s, len=%s\n", $name->[0], length($name->[0]);
printf "band=%s\n", $band->{band};
----
I get corrupted output:
/tmp/s.pl
Wide character in printf at /tmp/s.pl line 8.
string=犬夜叉, len=3
band=Queensr

             Ishtar:law/bin/lib> more s.pl

----
Isn't this sort of the opposite of what one would expect?

I guess I should file this under another bug, as this isn't
really the doc bug about eval scoping...
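
The corrupted band= line can probably be reproduced in isolation. The sketch
below assumes the cause is the missing encoding layer on the output handle;
bytes_written is a hypothetical helper (not from this thread) that prints to
an in-memory file so the resulting bytes can be inspected, and the characters
are written as escapes to keep the file pure ASCII.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory file opened with
# the given mode and return the bytes that were written, as hex.
sub bytes_written {
    my ($mode, $text) = @_;
    my $buf = '';
    open(my $fh, $mode, \$buf) or die "open failed: $!";
    print {$fh} $text;
    close($fh) or die "close failed: $!";
    return join ' ', map { sprintf '%02x', ord } split //, $buf;
}

# U+00FF is the y-with-diaeresis in the band name.
my $y = "\x{FF}";

# No encoding layer: the character fits in one byte, so the single
# byte ff is written, which is not valid UTF-8 on its own.
print "raw:   ", bytes_written('>', $y), "\n";

# With an explicit UTF-8 layer the same character is written as c3 bf.
print "utf-8: ", bytes_written('>:encoding(UTF-8)', $y), "\n";

# Wide characters go through the layer as UTF-8 as well, without
# the "Wide character" warning.
print "inu:   ", bytes_written('>:encoding(UTF-8)', "\x{72AC}"), "\n";
```

If that assumption holds, adding binmode(STDOUT, ':encoding(UTF-8)') to
s.pl should make both printf lines come out intact.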







---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=119239
