Re: [perl #127288] [PATCH] I18N::Langinfo doesn't set UTF8 flag
From: Karl Williamson
Date: January 27, 2016 06:24
Subject: Re: [perl #127288] [PATCH] I18N::Langinfo doesn't set UTF8 flag
Message ID: 56A8626A.5070300@khwilliamson.com

On 01/16/2016 05:15 AM, Niko Tyni (via RT) wrote:
> # New Ticket Created by Niko Tyni
> # Please include the string: [perl #127288]
> # in the subject line of all future correspondence about this issue.
> # <URL: https://rt.perl.org/Ticket/Display.html?id=127288 >
>
>
> This is a bug report for perl from Niko Tyni <ntyni@debian.org>,
> generated with the help of perlbug 1.40 running under perl 5.23.7.
>
> I18N::Langinfo::langinfo() can return UTF-8 strings but doesn't set
> the UTF8 flag on them.
>
> LC_ALL=fr_FR.UTF-8 perl -MDevel::Peek -MI18N::Langinfo=langinfo,MON_12 -e 'Dump langinfo(MON_12())'
> SV = PV(0x1173b20) at 0x1172f30
> REFCNT = 1
> FLAGS = (TEMP,POK,pPOK)
> PV = 0x118e960 "d\303\251cembre"\0
> CUR = 9
> LEN = 11
>
> The attached somewhat clumsy set of two patches fixes this, but perhaps
> it's time to make something like __is_cur_LC_category_utf8() more visible?
>
> (This was prompted by Time-Format test suite starting to fail on Perl
> 5.22 in non-English locales, presumably since commit 9717af6d049902fc8
> which fixed POSIX::strftime() in a similar way. The Time-Format test
> suite compares POSIX::strftime() and I18N::Langinfo results. See
> https://bugs.debian.org/811104 )

I'm not sure about the best way to go about this. I'm still reluctant
to make is_cur_LC_category_utf8() publicly accessible, but maybe it's
been out there long enough to do so. And I can't think of any way to do
this in XS code without exposing something that's not documented.

(In pure perl, within the scope of 'use locale', you can do fc("\xdf") and
see if the result is 'ss' or not, which it only will be if perl thinks
the LC_CTYPE locale is UTF-8.)
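
As a rough illustration of that pure-perl check, here is a minimal sketch. The helper name ctype_is_utf8() is made up for this example, and it assumes a perl recent enough to provide fc() and to handle UTF-8 locales under 'use locale'. It only tells you about LC_CTYPE, not about any other category.

```perl
use strict;
use warnings;
use feature 'fc';                  # fc() needs perl 5.16 or later
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper (not part of any module): true if perl thinks the
# current LC_CTYPE locale is UTF-8.
sub ctype_is_utf8 {
    use locale;                    # make fc() follow the LC_CTYPE locale
    # U+00DF folds to 'ss' only under Unicode rules, which perl applies
    # when it believes the LC_CTYPE locale is UTF-8.
    return fc("\xdf") eq 'ss' ? 1 : 0;
}

# Report on whatever locale perl picked up from the environment.
my $name = setlocale(LC_CTYPE);
$name = '(unknown)' unless defined $name;
printf "LC_CTYPE=%s utf8=%d\n", $name, ctype_is_utf8();
```

Run it with different LC_ALL or LC_CTYPE settings in the environment to compare the results.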

So if someone has an opinion about this, chime in.

Here are a couple of problems with your patch.

It is possible for the various locale categories to be in different
locales. For example, LC_TIME could be in a UTF-8 locale, while
LC_NUMERIC is not. Thus a fully accurate program would need to know the
category of the field being requested, and call
is_cur_LC_category_utf8() with that category. Most fields that
nl_langinfo(3) returns on my Linux box are LC_TIME, but not all. So your
patch could be wrong if a different field, like the radix character, is
being requested. Since the fields are not standardized, I don't know
how to make it completely accurate, but one could know the few
exceptions and assume everything else is LC_TIME. There is a UTF-8
locale on the dromedary machine that has a radix character that is only
expressible in UTF-8. And there are plenty of currency signs in
LC_MONETARY that are UTF-8 as well.
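
To make the "know the few exceptions and assume everything else is LC_TIME" idea concrete, here is a pure-perl sketch; it is not the patch and not how the XS code does it. The exception table is an assumption based on what glibc does, and the helper names (category_for_item, category_is_utf8, langinfo_decoded) are invented for the example. Since the internal function isn't reachable from perl space, the sketch temporarily switches LC_CTYPE to the category's locale and asks for CODESET instead.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL LC_CTYPE LC_TIME LC_NUMERIC
             LC_MONETARY LC_MESSAGES);
use I18N::Langinfo qw(langinfo CODESET RADIXCHAR THOUSEP CRNCYSTR
                      YESEXPR NOEXPR MON_12);

# Assumed exceptions (glibc-style); every other item is taken to be LC_TIME.
my %CATEGORY_FOR = (
    CODESET()   => LC_CTYPE(),
    RADIXCHAR() => LC_NUMERIC(),
    THOUSEP()   => LC_NUMERIC(),
    CRNCYSTR()  => LC_MONETARY(),
    YESEXPR()   => LC_MESSAGES(),
    NOEXPR()    => LC_MESSAGES(),
);

sub category_for_item {
    my ($item) = @_;
    return exists $CATEGORY_FOR{$item} ? $CATEGORY_FOR{$item} : LC_TIME();
}

# True if the locale in effect for $category reports a UTF-8 codeset.
sub category_is_utf8 {
    my ($category) = @_;
    my $wanted = setlocale($category);     # query only
    my $saved  = setlocale(LC_CTYPE());
    return 0 unless defined $wanted && defined $saved;
    defined setlocale(LC_CTYPE(), $wanted) or return 0;
    my $codeset = langinfo(CODESET());
    setlocale(LC_CTYPE(), $saved);         # restore the original LC_CTYPE
    return $codeset =~ /\AUTF-?8\z/i ? 1 : 0;
}

# langinfo() that decodes the result when the item's category is UTF-8.
sub langinfo_decoded {
    my ($item) = @_;
    my $value = langinfo($item);
    return $value if utf8::is_utf8($value);    # already a character string
    return $value unless category_is_utf8(category_for_item($item));
    my $copy = $value;
    return utf8::decode($copy) ? $copy : $value;
}

my $month = langinfo_decoded(MON_12());
printf "MON_12 has %d characters\n", length $month;
```

With LC_ALL=fr_FR.UTF-8 and a langinfo() that returns undecoded bytes, the character count should drop from the 9 bytes in the Dump output to 8 characters.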

For the .t file, looking in Configure for whether setlocale, POSIX,
etc., are available or not is not the full story. In the 5.23 series I
have tried to standardize all finding of available locales on the
functions in t/loc_tools.pl. One of them is locales_enabled(), which
knows about more of the things that affect whether locale handling is
available than any of the tests that I converted to use it did. Since
your test is in /ext, it can freely use core tools, and it's simpler to
use this anyway.

>
> ---
> Flags:
> category=library
> severity=low
> Type=Patch
> PatchStatus=HasPatch
> module=I18N::Langinfo
> ---
> Site configuration information for perl 5.23.7:
>
> Configured by niko at Sat Jan 16 13:32:56 EET 2016.
>
> Summary of my perl5 (revision 5 version 23 subversion 7) configuration:
> Local Commit: 67271c83612f3ab129c8326d07ca55104a2f23f8
> Ancestor: dff8a39d0194aa70bc091208b6a62bffe2148f0a
> Platform:
> osname=linux, osvers=4.3.0-1-amd64, archname=x86_64-linux
> uname='linux estella 4.3.0-1-amd64 #1 smp debian 4.3.3-2 (2015-12-17) x86_64 gnulinux '
> config_args='-Dusedevel -des'
> hint=previous, useposix=true, d_sigaction=define
> useithreads=undef, usemultiplicity=undef
> use64bitint=define, use64bitall=define, uselongdouble=undef
> usemymalloc=n, bincompat5005=undef
> Compiler:
> cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2',
> optimize='-O2',
> cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
> ccversion='', gccversion='5.3.1 20151219', gccosandvers=''
> intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678, doublekind=3
> d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16, longdblkind=3
> ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
> alignbytes=8, prototype=define
> Linker and Libraries:
> ld='cc', ldflags =' -fstack-protector-strong -L/usr/local/lib'
> libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/5/include-fixed /usr/include/x86_64-linux-gnu /usr/lib
> libs=-lpthread -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
> perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
> libc=libc-2.21.so, so=so, useshrplib=false, libperl=libperl.a
> gnulibc_version='2.21'
> Dynamic Linking:
> dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
> cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'
>
> Locally applied patches:
> b8ea0feef314dd5432298db82d9f2b8afed1442f
> 67271c83612f3ab129c8326d07ca55104a2f23f8
>
> ---
> @INC for perl 5.23.7:
> lib
> /usr/local/lib/perl5/site_perl/5.23.7/x86_64-linux
> /usr/local/lib/perl5/site_perl/5.23.7
> /usr/local/lib/perl5/5.23.7/x86_64-linux
> /usr/local/lib/perl5/5.23.7
> .
>
> ---
> Environment for perl 5.23.7:
> HOME=/home/niko
> LANG=en_US.UTF-8
> LANGUAGE (unset)
> LC_CTYPE=fi_FI.UTF-8
> LD_LIBRARY_PATH (unset)
> LOGDIR (unset)
> PATH=/home/niko/bin:/home/niko/bin:/home/niko/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/sbin:/usr/sbin:/sbin:/usr/sbin
> PERL_BADLANG (unset)
> SHELL=/bin/zsh
>