
[perl #119239] started out as doc clarification needed in 'eval...but...

From:
Linda Walsh via RT
Date:
August 16, 2013 01:38
Subject:
[perl #119239] started out as doc clarification needed in 'eval...but...
Message ID:
rt-3.6.HEAD-2552-1376617089-17.119239-15-0@perl.org
On Thu Aug 15 13:32:51 2013, ikegami@adaelis.com wrote:
> On Thu, Aug 15, 2013 at 3:43 PM, Linda Walsh via RT <
> perlbug-followup@perl.org> wrote:
> 
> > Points) A) - "(to myself as much as anyone)" - use utf8 only applies to
> > source code not content strings, so my wonderings why the Japanese
> > INU(dog) YA(night) SHA(dividing point) came out "ok"
> 
> 
> $ perl -Mutf8 -E'say "\x{72AC}\x{591C}\x{53C9}"'
> Wide character in say at -e line 1.
> 犬夜叉

----
Urg...um... so the original example that I had, which printed
$string="“犬夜叉”";
didn't flag those as wide characters, nor did it flag it as an
error that the source contained UTF-8 without a "use utf8;".
The UTF-8 in the source was interpreted as a byte string and output
as a byte string, thus no warning from perl.
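
To check that reading, here is a small sketch (it assumes the file is
saved as UTF-8; the variable names are only for illustration). The same
literal is a 9-byte string outside the scope of "use utf8;" and a
3-character string inside it:

```perl
use strict;
use warnings;

# No "use utf8" in effect here: the literal is kept as its 9 raw octets.
my $bytes = "犬夜叉";

# "use utf8" is lexically scoped, so it only affects this block:
# the same literal is decoded into 3 characters.
my $chars = do { use utf8; "犬夜叉" };

printf "without utf8: length %d\n", length $bytes;
printf "with utf8:    length %d\n", length $chars;
```

Printing $bytes to an STDOUT without an encoding layer sends the octets
out unchanged, which would explain why the terminal shows the right
glyphs and perl has nothing to warn about.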

So if I used hex escapes it would trigger the warning, but I don't need
the

   "use utf8;"

that Ricardo included, which in this example is a red herring?
(That's a bit confusing)....

But:
perl -e 'use P;

my $name= [qw (犬夜叉)];
my $band={band => "Queensrÿche"};

P "string=%s", $name;
P "band=%s", $band;

{ use feature "say";
  say "string=%s", $name;
  say "band=%s", $band;
}
'
string=["犬夜叉"]
band={band=>"Queensrÿche"}
string=%sARRAY(0x1d8fa68)
band=%sHASH(0x1daf018)
---
Hmmm... no utf8 warnings on any of those.
But if perl had taken it as a byte string, i.e. the bytes

e7 8a ac e5 a4 9c e5 8f 89

why wouldn't those
have been taken as latin1 (as the source wasn't declared
as utf8), and been encoded internally to
their UTF-8 encodings?
I.e. E7 = 0xc3 0xa7, 8a = 0xc2 0x8a;
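
The Latin-1 reading can be reproduced by hand. This is only a sketch,
assuming the core Encode module; it encodes the three octets of the first
kanji as if each one were a Latin-1 character:

```perl
use strict;
use warnings;
use Encode qw(encode);

# The three octets that UTF-8 uses for U+72AC, held as a plain byte string.
my $octets = "\xE7\x8A\xAC";

# encode() reads each octet as a Latin-1 code point and writes it out as UTF-8.
my $double = encode('UTF-8', $octets);

my $hex = join ' ', map { sprintf '%02x', ord } split //, $double;
print "$hex\n";    # c3 a7 c2 8a c2 ac
```

As far as I can tell, that conversion only happens when something asks
for it, e.g. an :encoding(UTF-8) layer on the output handle. With no
layer on STDOUT the original octets go out untouched.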
----
But if I try to use 'ÿ' in an identifier as in 
changing 'band' to bandÿ I get:

Unrecognized character \xC3; marked by <-- HERE after my $band<-- HERE
near column 9 at -e line 4.

Isn't that inconsistent?  The program source seems to be
taken as UTF8 encoded if it is in a string, but not ...
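
My guess at the difference: a string literal accepts arbitrary bytes, but
without "use utf8;" an identifier has to be ASCII. A minimal sketch of the
same declaration with the pragma on (file assumed to be saved as UTF-8):

```perl
use strict;
use warnings;
use utf8;                              # source is decoded as UTF-8 from here on

binmode(STDOUT, ':encoding(UTF-8)');   # encode characters again on output

# With the pragma in effect, "ÿ" is a letter and is allowed in a variable name.
my $bandÿ = { band => "Queensrÿche" };

printf "%s has %d characters\n", $bandÿ->{band}, length $bandÿ->{band};
```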

What do I get if I read from <package::DATA>?  Would the UTF8
encoded strings be read as UTF8 ... I'm guessing not?
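
As I understand perldata, DATA is read bytewise unless the __DATA__ token
is in the scope of "use utf8;" or a layer is set with binmode. The sketch
below uses an in-memory filehandle as a stand-in for DATA (an assumption
made to keep the example in one piece) to show the two cases:

```perl
use strict;
use warnings;

# UTF-8 octets of the three kanji plus a newline, standing in for a DATA line.
my $octets = "\xE7\x8A\xAC\xE5\xA4\x9C\xE5\x8F\x89\n";

# Plain handle: the line comes back as 9 bytes.
open(my $raw, '<', \$octets) or die "open failed: $!";
chomp(my $as_bytes = <$raw>);

# Same data read through an encoding layer: 3 characters.
open(my $dec, '<:encoding(UTF-8)', \$octets) or die "open failed: $!";
chomp(my $as_chars = <$dec>);

printf "raw: %d, decoded: %d\n", length $as_bytes, length $as_chars;
```

For the real handle the equivalent would be binmode(DATA, ':encoding(UTF-8)')
before the first read.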

----------------------

Note -- the original lack of clarity regarding the scope of what is
affected by eval "stuff here" or 'stuff here' still exists...

Though the utf8 stuff doesn't seem entirely consistent now that you
mention it..

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org:443/rt3/Ticket/Display.html?id=119239
