
[perl #118197] perl doesn't cope with non-ASCII decimal separators

Nicholas Clark
May 27, 2013 13:13
# New Ticket Created by  Nicholas Clark 
# Please include the string:  [perl #118197]
# in the subject line of all future correspondence about this issue. 
# <URL: >

Perl doesn't cope with non-ASCII decimal separators.

LC_ALL=ps_AF.utf8 ./perl -Ilib -MPOSIX=:locale_h -C63 -MDevel::Peek -e 'setlocale(LC_ALL, "ps_AF.utf8"); use locale; Dump (sprintf "%g\n", 3.14)'
SV = PV(0xa1af6f8) at 0xa1c85f8
  REFCNT = 1
  PV = 0xa1cc668 "3\331\25314\n"\0
  CUR = 6
  LEN = 12

The decimal separator in the "Pashto locale for Afghanistan" is U+066B
(ARABIC DECIMAL SEPARATOR), which is the two bytes \331\253 in UTF-8.

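So that's Devel::Peek showing that we have mojibake: the PV holds the raw
UTF-8 bytes of the separator, but the dump shows no UTF8 annotation, so perl
treats them as two separate characters.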

It isn't clear *how* to fix this. It might need another -C flag to say
"believe that the strings in the locales are in UTF-8".

If this approach makes sense, we probably should do something similar for the
strings returned by strerror().

Strangely fa_IR.utf8 now seems to use '.' as the decimal separator.
About a decade ago we identified it as being an interesting troublesome
test case as at that time it was using a multibyte separator. (Presumably
U+066B, but I don't know for sure.)

Despite this bug, all tests pass in the locale ps_AF.utf8.

Nicholas Clark