Front page | perl.perl5.porters |
Postings from April 2012
Re: unicode question
From: Brian Fraser
Date: April 25, 2012 23:49
Subject: Re: unicode question
Message ID: CA+nL+nahM0XnNvLowRY=SKwTX12HesSWPvJOi4N7mi980kEJJw@mail.gmail.com
On Thu, Apr 26, 2012 at 1:47 AM, Eric Brine <ikegami@adaelis.com> wrote:
> On Wed, Apr 25, 2012 at 11:50 PM, Brian Fraser <fraserbn@gmail.com> wrote:
>
>>
>> On Wed, Apr 25, 2012 at 10:12 PM, Linda W <perl-diddler@tlinx.org> wrote:
>>
>>> Then the statement in paragraph 1 about perl having fundamental
>>> asymmetric problems no longer applies?
>>>
>>
>> In 5.14+ under unicode_strings, that's mostly right. It's all mostly
>> treated as UTF-8, syscalls aside. See "The Unicode Bug" and "When Unicode
>> Does not Happen"
>>
>
> No, the asymmetry is still 100% there. If you pass bytes to an op which
> expects characters, the bytes will effectively be treated as iso-8859-1. If
> you pass non-bytes to print, they will be encoded using utf8.
>
>
Syscalls and I/O aside (I mistakenly omitted the latter in my previous mail
-- apologies), I think this proves you wrong:
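
    use Devel::Peek;

    my $x = "\xdf";          # U+00DF, LATIN SMALL LETTER SHARP S
    utf8::downgrade($x);     # force the non-UTF-8 (byte) internal form

    {
        no feature 'unicode_strings';
        Dump uc $x;          # byte semantics: the string stays "\xdf"
    }
    {
        use feature 'unicode_strings';
        Dump uc $x;          # character semantics: the string becomes "SS"
    }
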
The whole point of unicode_strings (and unicode_eval) is making ops work on
characters transparently, regardless of the internal encoding.
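
To make "regardless of the internal encoding" concrete, here is a minimal sketch that avoids Devel::Peek and compares the two internal forms directly. The variable names ($bytes, $wide) are only illustrative, and it assumes a perl that has the unicode_strings feature (5.12 or later):

```perl
use strict;
use warnings;
use feature 'unicode_strings';

# The same one-character string (U+00DF), stored two different ways.
my $bytes = "\xdf";
my $wide  = "\xdf";
utf8::downgrade($bytes);    # internal form: one byte, UTF8 flag off
utf8::upgrade($wide);       # internal form: UTF-8, UTF8 flag on

# The internal representations really do differ ...
printf "flag on bytes copy: %d\n", utf8::is_utf8($bytes) ? 1 : 0;
printf "flag on wide copy:  %d\n", utf8::is_utf8($wide)  ? 1 : 0;

# ... but under unicode_strings, uc treats both as the character U+00DF.
printf "uc(bytes) eq uc(wide): %s\n",
    uc($bytes) eq uc($wide) ? "yes" : "no";
```

With `no feature 'unicode_strings'` in place of the `use` line, I would expect the last comparison to fail, because the downgraded copy falls back to byte semantics while the upgraded one does not.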