perl.perl5.porters | Postings from January 2015

RFC: How should locale work?

Karl Williamson
January 29, 2015 19:24
No one has responded to my follow-up in

so I think I should explicitly ask for feedback.

The ticket is that the behavior of POSIX::localeconv() has changed in 
v5.21.  The behavior is now consistent with all the (non-deprecated) 
POSIX functions, whereas before things were inconsistent.

But the root cause of the problem is different.

All C programs have an underlying locale; you can't get away from that.
Since Perl is written in C, it has one too.  The underlying locale of a
C program is the "C" locale unless changed by a setlocale() call.
Therefore, people who are writing close to the metal using the POSIX
functions will tend to assume that the same holds in their Perl program.
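As a rough illustration of that starting state, here is a minimal C sketch.  The function names are mine and purely illustrative; the only library facts relied on are that setlocale() with a NULL locale argument queries without changing anything, and that a C program begins as if setlocale(LC_ALL, "C") had been called.

```c
#include <locale.h>
#include <string.h>

/* Query the name of the current LC_ALL locale without changing it.
 * A NULL second argument turns setlocale() into a pure query. */
const char *locale_name_now(void)
{
    return setlocale(LC_ALL, NULL);
}

/* Return 1 if the program is still in the default "C" locale, else 0. */
int is_in_c_locale(void)
{
    const char *name = locale_name_now();
    return name != NULL && strcmp(name, "C") == 0;
}
```

Called first thing in main(), is_in_c_locale() should return 1 whatever LANG or LC_ALL say, because nothing has consulted the environment yet.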

However, one of the very first things Perl does at startup is to do a 
setlocale() to the locale given by the external environment variables in 
effect (such as LANG) at that time.  Thus someone writing close to the 
metal gets different behavior than if it were just a plain C program.
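In C terms, the startup step amounts to something like the sketch below.  This is an assumption-laden illustration, not Perl's actual startup code: the function names are hypothetical, and it only shows the standard effect of setlocale(LC_ALL, "") on what localeconv() reports.

```c
#include <locale.h>
#include <stdio.h>

/* Adopt whatever locale the environment names (LC_ALL, then the
 * individual LC_* variables, then LANG).  Returns the new locale name,
 * or NULL if the request cannot be honored, in which case the locale
 * is left unchanged. */
const char *adopt_environment_locale(void)
{
    return setlocale(LC_ALL, "");
}

/* Radix character localeconv() reports for the locale in effect now. */
const char *radix_now(void)
{
    return localeconv()->decimal_point;
}

/* Illustrative helper: print the radix before and after the switch. */
void show_radix_before_and_after(void)
{
    printf("before: '%s'\n", radix_now());
    if (adopt_environment_locale() == NULL)
        puts("environment locale unavailable; locale unchanged");
    printf("after:  '%s'\n", radix_now());
}
```

Before the switch the radix is always ".".  Afterwards it depends on the environment, for example "," under many European locales, and that is the difference a caller of localeconv() sees when the switch has already happened behind their back.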

I looked at the few modules that so far have been adversely affected by
this localeconv() change.  I couldn't understand the mind-set of their
authors until it dawned on me while reading this ticket.

They are expecting the locale to be C until an explicit setlocale has 
been done, and then they are expecting the locale to be that one.

The difference between the behavior they expect and what they get is
due solely to Perl's start-up switching the locale to that of the
environment instead of remaining in C.

Note that this matters only to someone using the POSIX functions.  The 
underlying locale is not exposed to pure Perl code, except through the 
POSIX module, or within the scope of the 'locale' pragma (or if a regex 
is compiled with /l).

The only way to accommodate these authors' expectations that I can see 
would be for an OP to be generated when entering a region where locale 
is honored.  That is, if a locale->import() is done, the parser would 
generate a new OP.  I suspect this is impossible or very hard to do, and
that is why the current design of setting the locale at startup was
chosen.  And, except for the POSIX functions, it works.

So I'm not sure what to do.  It's late in the 5.22 cycle to be making 
changes.  There have been only a few reports of breakage, and we now 
have consistent behavior of all the POSIX functions, even if it doesn't 
meet the expectations of some CPAN authors.  I certainly need to revise
perllocale.pod to highlight this.

But if we could find a way to meet these not-unreasonable expectations,
we could back out the localeconv change (and any others an audit turns
up) for 5.22, and then make the change early in 5.23.  But I actually
don't think it's possible.

Feedback welcome