develooper Front page | perl.perl5.porters | Postings from August 2013

[perl #119499] $! returned with UTF-8 flag under UTF-8 locales only under 5.19.2+

From:
Victor Efimov via RT
Date:
August 28, 2013 17:04
Subject:
[perl #119499] $! returned with UTF-8 flag under UTF-8 locales only under 5.19.2+
Message ID:
rt-3.6.HEAD-1873-1377709460-802.119499-15-0@perl.org
It seems this is a result of the fix for
https://rt.perl.org/rt3/Ticket/Display.html?id=112208.

However, I think the fix is wrong.

1) It breaks old code that:

a) tries to decode $! using Encode::decode and
I18N::Langinfo::langinfo(I18N::Langinfo::CODESET())

b) prints error messages to the screen as-is (without "binmode STDOUT
:encoding")

2) Sometimes it returns a binary string (under non-UTF-8 locales, or when
the message is ASCII-only), and sometimes a character string (when the
locale is UTF-8).

It's hard to distinguish one from the other. A possible solution is
utf8::is_utf8(), but the use of utf8::is_utf8 is advertised as dangerous.

Another solution is to use Encode::decode_utf8 when the locale is UTF-8 (but
not Encode::decode("UTF-8"...) ).

The problem is that this method's documentation is wrong; several people
have reported this:

https://rt.cpan.org/Public/Bug/Display.html?id=87267
https://rt.cpan.org/Public/Bug/Display.html?id=61671
https://github.com/dankogai/p5-encode/pull/11
https://github.com/dankogai/p5-encode/pull/10


3) It's not documented in perllocale, perlunicode, or perlvar.

4) It's not clear how it works in the case of Latin-1 characters in a
UTF-8 locale.
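
To illustrate points 1a and 2, here is a minimal sketch of what calling
code would need in order to get a character string in both cases. The
helper name errno_text is hypothetical, and it relies on utf8::is_utf8
despite the caveat above, because that flag is the only visible difference
between the two kinds of return value. The fallback for an unknown codeset
(returning the bytes untouched) is an assumption made for the example.

```perl
use strict;
use warnings;
use Encode ();
use Errno qw(EACCES);
use I18N::Langinfo qw(langinfo CODESET);

# Hypothetical helper: return the text of an error message as a
# character string, whichever form perl handed it to us in.
sub errno_text {
    my ($msg, $codeset) = @_;

    # Already a character string (what 5.19.2+ returns under UTF-8 locales).
    return $msg if utf8::is_utf8($msg);

    # Otherwise treat it as bytes in the locale's codeset, which is what
    # older code did unconditionally.
    my $enc = Encode::find_encoding($codeset)
        or return $msg;    # assumption: unknown codeset, leave bytes as-is
    return $enc->decode($msg);
}

$! = EACCES;
my $text = errno_text("$!", langinfo(CODESET));
```

Old code that skips the utf8::is_utf8 check and always decodes would end up
decoding an already decoded string under 5.19.2+ and a UTF-8 locale.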

On Wed Aug 28 01:52:13 2013, vsespb wrote:
> $! is returned as a character string under 5.19.2+ and UTF-8 locales, but
> as a binary string under single-byte encoding locales.
> 
> I believe this is useless and just makes it harder to decode the $! value
> properly.
> 
> Also, I am not sure whether it will be possible to decode it when a
> language with Latin-1-only characters is set.
> 
> LANG=ru_RU LANGUAGE=ru_RU:ru LC_ALL=ru_RU.utf8 perl -MPOSIX -MDevel::Peek
> -e '$!=EACCES; Dump "$!"'
> 
> SV = PV(0x144dd80) at 0x14702a0
>   REFCNT = 1
>   FLAGS = (PADTMP,POK,pPOK,UTF8)
>   PV = 0x1468e30
> "\320\236\321\202\320\272\320\260\320\267\320\260\320\275\320\276 \320\262
> \320\264\320\276\321\201\321\202\321\203\320\277\320\265"\0 [UTF8
> "\x{41e}\x{442}\x{43a}\x{430}\x{437}\x{430}\x{43d}\x{43e} \x{432}
> \x{434}\x{43e}\x{441}\x{442}\x{443}\x{43f}\x{435}"]
>   CUR = 34
>   LEN = 40
> 
> 
> LANG=ru_RU LANGUAGE=ru_RU:ru LC_ALL=ru_RU.CP1251 LC_MESSAGES=ru_RU.CP1251
> perl -MPOSIX -MDevel::Peek -e '$!=EACCES; Dump "$!"'
> 
> SV = PV(0x1db8d80) at 0x1ddf7e0
>   REFCNT = 1
>   FLAGS = (PADTMP,POK,pPOK)
>   PV = 0x1f680d0 "\316\362\352\340\347\340\355\356 \342
> \344\356\361\362\363\357\345"\0
>   CUR = 18
>   LEN = 24




---
via perlbug:  queue: perl5 status: new
https://rt.perl.org:443/rt3/Ticket/Display.html?id=119499
