
POSIX::foo() ignore UTF8ness

From: Karl Williamson
Date: July 15, 2013 22:14
Subject: POSIX::foo() ignore UTF8ness
Message ID: 51E47438.9020605@khwilliamson.com

Most, if not all, of the POSIX functions that take a string scalar as
input treat that scalar as bytes, ignoring the UTF8 flag.  This is
clearly wrong.  But what to do?

One possibility raised on IRC is to downgrade to bytes first.  If the
downgrade fails, return FALSE and raise a 'wide character' warning.
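
As a rough sketch of what "downgrade to bytes" means at the C level, the
function below turns UTF-8 into one byte per code point and reports
failure when a code point above 0xFF is present.  The name
downgrade_utf8 is hypothetical, and the code assumes an ASCII platform;
it is an illustration, not how perl itself does the downgrade.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rewrite UTF-8 input as one byte per code point.
 * Returns 1 on success and stores the byte count in *outlen.
 * Returns 0 if a code point is above 0xFF or the input is malformed;
 * a caller would then warn and return false. */
int downgrade_utf8(const unsigned char *in, size_t len,
                   unsigned char *out, size_t *outlen)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL && outlen != NULL);
    while (i < len) {
        unsigned char b = in[i];

        if (b < 0x80) {
            /* ASCII is identical in both representations. */
            out[n++] = b;
            i += 1;
        } else if ((b == 0xC2 || b == 0xC3) && i + 1 < len
                   && (in[i + 1] & 0xC0) == 0x80) {
            /* U+0080..U+00FF are encoded as C2 80 .. C3 BF. */
            out[n++] = (unsigned char)(((b & 0x03) << 6)
                                       | (in[i + 1] & 0x3F));
            i += 2;
        } else {
            /* Anything else is above 0xFF (or malformed). */
            return 0;
        }
    }
    *outlen = n;
    return 1;
}
```

With this, the two bytes C3 A9 (U+00E9) downgrade to the single byte E9,
while C4 80 (U+0100) cannot be downgraded.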

Another possibility is to call the libc wide character equivalent
function instead, passing it the code point that the UTF-8 character
encodes.  I believe this is more in keeping with DWIM.

For example, POSIX::isalnum("\x{100}") could fail and warn, or it could 
return the result of libc iswalnum(0x100).
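
A minimal sketch of the second behaviour, in C: decode one UTF-8
character to its code point and hand that to iswalnum().  The helper
names decode_utf8 and isalnum_by_code_point are hypothetical.  The
sketch assumes that wide character values are Unicode code points
(true where __STDC_ISO_10646__ is defined, but not guaranteed by the C
standard), it does not reject overlong encodings, and what iswalnum()
answers for a given code point depends on the current LC_CTYPE locale.

```c
#include <assert.h>
#include <stddef.h>
#include <wctype.h>

/* Hypothetical helper: decode the UTF-8 sequence starting at s.
 * Returns the number of bytes consumed, or 0 if malformed. */
size_t decode_utf8(const unsigned char *s, size_t len, unsigned long *cp)
{
    size_t need, i;
    unsigned long v;

    assert(s != NULL && cp != NULL);
    if (len == 0)
        return 0;
    if (s[0] < 0x80)                { v = s[0];        need = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { v = s[0] & 0x1F; need = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { v = s[0] & 0x0F; need = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { v = s[0] & 0x07; need = 4; }
    else
        return 0;
    if (need > len)
        return 0;
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;               /* not a continuation byte */
        v = (v << 6) | (s[i] & 0x3F);
    }
    *cp = v;
    return need;
}

/* Classify the first character of s by code point instead of by byte.
 * Assumption: wint_t values are Unicode code points on this platform. */
int isalnum_by_code_point(const unsigned char *s, size_t len)
{
    unsigned long cp;

    if (decode_utf8(s, len, &cp) == 0)
        return 0;
    return iswalnum((wint_t)cp) != 0;
}
```

For "\x{100}", stored as the bytes C4 80, decode_utf8 yields 0x100, so
the call made is iswalnum(0x100).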

If we did the latter, Configure would have to be changed to probe for
these functions.  I think that if one is present, all the others of the
given class would be too.  Where they are not present, we would fall
back to "failing and warning".
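
The fallback could be sketched like this, assuming Configure grew a
probe that defines a symbol when the wide character functions exist.
The symbol HAS_ISWALNUM, the function name posix_isalnum_cp, and the
warning text are all hypothetical; nothing here reflects an existing
probe.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#ifdef HAS_ISWALNUM             /* hypothetical Configure symbol */
#include <wctype.h>
#endif

/* Hypothetical: classify a single code point, using the wide
 * character function when the probe found it, else fail and warn. */
int posix_isalnum_cp(unsigned long cp)
{
    if (cp <= 0xFF)
        return isalnum((int)cp) != 0;   /* fits in a byte */
#ifdef HAS_ISWALNUM
    return iswalnum((wint_t)cp) != 0;
#else
    /* No wide character support: fail and warn. */
    fprintf(stderr, "Wide character in isalnum\n");
    return 0;
#endif
}
```

Built without HAS_ISWALNUM defined, posix_isalnum_cp(0x100) prints the
warning and returns 0.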
