
[perl #71740] 5.11.3 breaks Unicode::Normalize

From: Claus Färber
Date: December 30, 2009 04:38
Subject: [perl #71740] 5.11.3 breaks Unicode::Normalize
Message ID: rt-3.6.HEAD-1505-1262126114-358.71740-75-0@perl.org
# New Ticket Created by  Claus Färber 
# Please include the string:  [perl #71740]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=71740 >


Hi!

perl-5.11.3 breaks Unicode::Normalize. For example, Unicode::Normalize::NFKC("\x{2000}") returns "\x{20}\x{5}" instead of "\x{20}". (Other code points suffer from the same problem.)
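
A quick way to check a given perl for this, as a minimal sketch (it uses only the NFKC function named above; the output formatting is just for illustration):

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFKC);

# Normalize U+2000 (EN QUAD) and list the resulting code points.
my $got = NFKC("\x{2000}");
print join(' ', map { sprintf 'U+%04X', ord($_) } split //, $got), "\n";
```

A perl without the problem should print only U+0020; per the report above, 5.11.3 would additionally show U+0005.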

lib/unicore/mktables generates lib/unicore/Decomposition.pl, which contains the following (the trailing comment is supposed to indicate the number of code points in the range 2002..2006):
...
| return << 'EOF';
...
| 2000            2002
| 2001            2003
| 2002    2006    <compat> 0020 # [5]
...

cpan/Unicode-Normalize/mkheader then fails to parse this correctly: it treats the "5" in the trailing comment "# [5]" as an additional code point, U+0005.
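
To illustrate the kind of misparse described above, here is a rough sketch. It is NOT the actual mkheader code: parse_mapping_naive and parse_mapping_fixed are hypothetical helpers, and the tab-separated field layout is an assumption. The naive version collects every run of hex digits in the mapping field, so the comment leaks into the result; the second version strips a trailing "# ..." comment first.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parsing step; not taken from mkheader.
# Assumes fields are tab-separated: first code point, last code point, mapping.
sub parse_mapping_naive {
    my ($line) = @_;
    my ($lo, $hi, $map) = split /\t/, $line, 3;
    $map =~ s/<[^>]*>//;    # drop a tag such as <compat>
    # Every run of hex digits is taken as a code point, comments included.
    return map { hex($_) } $map =~ /([0-9A-Fa-f]+)/g;
}

# Same parser, but with a trailing "# ..." comment removed beforehand.
sub parse_mapping_fixed {
    my ($line) = @_;
    $line =~ s/\s*#.*//;
    return parse_mapping_naive($line);
}

my $line = "2002\t2006\t<compat> 0020 # [5]";
printf "naive: %s\n",
    join ' ', map { sprintf 'U+%04X', $_ } parse_mapping_naive($line);
printf "fixed: %s\n",
    join ' ', map { sprintf 'U+%04X', $_ } parse_mapping_fixed($line);
```

With the sample line, the naive parser yields U+0020 U+0005, matching the symptom reported here, while the comment-stripping variant yields only U+0020. Whether the comment should be stripped on the consumer side or not emitted in the first place is exactly the open question.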

I would have written a patch already, but I'm unsure whether this should be fixed in perl or in Unicode::Normalize.

Claus

NB: I've already filed a bug report against Unicode::Normalize at <http://rt.cpan.org/Public/Bug/Display.html?id=53197>.



