develooper Front page | perl.perl5.porters | Postings from July 2013

Re: [perl #118795] locale changes in 5.19.1 break LC_NUMERIC handling

Karl Williamson
July 9, 2013 04:18
On 07/08/2013 04:03 PM, John Peacock wrote:
> On 07/08/2013 09:20 AM, Leon Timmermans wrote:
>> perllocale does, though «By default, Perl ignores the current locale.
>> The use locale pragma tells Perl to use the current locale for some
>> operations.»
> That sentence doesn't mention setlocale at all, which is a completely
> different topic from respecting the ENV locale.  Indeed, the example code
> in perllocale for setlocale() itself does not include 'use locale'.
> Although, to be fair, the documentation for LC_NUMERIC does.

perllocale also says, "If you want a Perl application to process and 
present your data according to a particular locale, the application code 
should include the use locale pragma".

>> The old behavior is crazy, especially the way the stringification
>> sticks to the number even after the locale is out of scope. This has
>> been discussed on this list previously.
> I am not arguing that the old behaviour was completely correct.  I am
> arguing that POSIX::setlocale() has been changed without warning. Either
> make POSIX enable 'use locale' globally or load it automatically when
> locale_h is imported (or better yet only just prior to
> setlocale/localeconv being called).
> I do not think that adding documentation to POSIX is more than a thin
> layer of tissue paper over the real issue.  setlocale() has operated
> with the old (albeit broken) behaviour for 13 years; merely adding
> documentation is not sufficient, IMNSHO.  I would even consider it
> proper to emit a fatal error if setlocale() is called outside of a 'use
> locale' block.  Silent changes in behaviour are not useful to anyone.
> John

First, let me make sure there is no misunderstanding here.

The actual behavior of setlocale() has not changed.  There were no code 
changes done with regard to this involving setlocale at all.  You can 
still call setlocale() anywhere in your program, within or outside the 
scope of a 'use locale', and the underlying locale that the program 
operates under is immediately changed to correspond.  For example, any 
POSIX functions called afterwards that interact with the OS will return 
results based on the changed locale.
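
A minimal sketch of that point, assuming at least one of the candidate locale names below is installed (the names are hypothetical guesses, and the helper subs are illustrative only). It calls setlocale() with no 'use locale' in scope and then asks POSIX::localeconv() which decimal point the underlying locale uses:

```perl
use strict;
use warnings;
use POSIX qw(setlocale localeconv LC_NUMERIC);

# Hypothetical candidate names; which ones exist depends on the system.
my @candidates = ('de_DE.UTF-8', 'de_DE.utf8', 'de_DE', 'fr_FR.UTF-8');

# Try each candidate until setlocale() accepts one; undef if none does.
sub switch_numeric_locale {
    for my $name (@_) {
        my $got = setlocale(LC_NUMERIC, $name);
        return $got if defined $got;
    }
    return;
}

# Ask the C library for the decimal point of the underlying locale.
sub current_decimal_point {
    my $conv = localeconv();
    return defined $conv->{decimal_point} ? $conv->{decimal_point} : '';
}

my $before = current_decimal_point();
my $active = switch_numeric_locale(@candidates);
my $after  = current_decimal_point();

print "decimal point before setlocale(): '$before'\n";
if (defined $active) {
    # Note that there is no 'use locale' anywhere in this file.
    print "decimal point after switching to $active: '$after'\n";
} else {
    print "no candidate locale is installed, so nothing changed\n";
}
```

If none of the candidates is available the script only reports the starting value, so it should be safe to run anywhere.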

What did change is that Perl-space code no longer pays attention to the 
LC_NUMERIC category outside 'use locale'.  This is the way it has always 
worked, AFAIK, for LC_COLLATE and, mostly, LC_CTYPE, and for some uses 
of LC_NUMERIC.  LC_CTYPE was fixed to completely work this way in 5.14, 
without any complaints.

It is also the way it has to work, or else one gets creepy bugs at a 
distance.  LC_NUMERIC now works like the other categories.  It is no 
longer the outlier causing bugs.  People do not expect their 'gt' and 
'sort' ops to work differently, nor do they expect their \w to match 
differently just because some module that gets loaded has done a 
setlocale.  This is the only sane method of operation.  It has long been 
settled that lexical scoping is a Good Thing™.
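
To illustrate the lexical scoping, here is a small sketch. It assumes a perl with this change (5.19.1 or later) and that one of the hypothetical locale names below is installed; the sub names are made up for the example. The same number is formatted by one sub with no 'use locale' in scope and by another that has it:

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_NUMERIC);

# Hypothetical candidate names; which ones exist depends on the system.
my @candidates = ('de_DE.UTF-8', 'de_DE.utf8', 'de_DE', 'fr_FR.UTF-8');

# Switch the underlying LC_NUMERIC, as some loaded module might do.
my $active;
for my $name (@candidates) {
    $active = setlocale(LC_NUMERIC, $name);
    last if defined $active;
}

# Returns a fresh number on every call, so no cached string form is reused.
sub half { return $_[0] / 2 }

sub format_outside {
    # No 'use locale' in this scope: LC_NUMERIC is not consulted.
    return sprintf('%.1f', half($_[0]));
}

sub format_inside {
    use locale;    # lexically scoped: affects only this sub body
    return sprintf('%.1f', half($_[0]));
}

printf "underlying LC_NUMERIC: %s\n",
    defined $active ? $active : '(unchanged)';
print "outside 'use locale': ", format_outside(3), "\n";
print "inside  'use locale': ", format_inside(3),  "\n";
```

On a perl with the change, only format_inside() should pick up the locale's decimal point; code that wants the locale's formatting opts in by adding 'use locale' to the scope that needs it.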

Nor was this done without warning to programmers.  Near the very 
beginning of perl5191delta, under "Incompatible Changes", it says

"Locale decimal point character no longer leaks outside of "use locale" 

"This is actually a bug fix, but some code has come to rely on the bug
being present, so this change is listed here.  The current locale that
the program is running under is not supposed to be visible to Perl code
except within the scope of a "use locale".  However, until now under
certain circumstances, the character used for a decimal point (often a
comma) leaked outside the scope.  If your code is affected by this
change, simply add a "use locale"."

But perhaps you meant that there is no run-time warning or error about 
calling setlocale outside the scope of 'use locale'.  Such a warning 
would only be appropriate if setlocale were called with LC_NUMERIC, as 
there is no change in any behavior here for the other categories. 
Further, some uses of LC_NUMERIC prior to this change did obey scoping 
rules, so the warning would be wrong in some cases even here.  Also, it 
is legitimate to call setlocale with no intention of ever doing 'use 
locale' if one is just using POSIX functions, though that is unlikely.
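
As a sketch of that last case (setlocale plus POSIX functions only, never 'use locale'), the following parses a string with POSIX::strtod(), which perllocale describes as following the underlying LC_NUMERIC. The locale names and the helper sub are assumptions made for the illustration:

```perl
use strict;
use warnings;
use POSIX qw(setlocale strtod LC_NUMERIC);

# Hypothetical candidate names; which ones exist depends on the system.
my @candidates = ('de_DE.UTF-8', 'de_DE.utf8', 'de_DE', 'fr_FR.UTF-8');

my $active;
for my $name (@candidates) {
    $active = setlocale(LC_NUMERIC, $name);
    last if defined $active;
}

# Thin wrapper around the C library parser. In list context strtod()
# returns the parsed value and the number of characters left unparsed.
sub parse_with_c_library {
    my ($text) = @_;
    my ($value, $unparsed) = strtod($text);
    return ($value, $unparsed);
}

if (defined $active) {
    # No 'use locale' here: only the POSIX call looks at the locale.
    my ($value, $unparsed) = parse_with_c_library('1,5');
    print "under $active, strtod('1,5') left $unparsed character(s) unparsed\n";
    print "Perl-space stringification of the result: $value\n";
} else {
    print "no candidate locale is installed; skipping the demonstration\n";
}
```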

So, it's not that the behavior of the system has changed monolithically 
around setlocale and 'use locale'.  It's that things now behave more 
consistently across the possible parameters to setlocale().
