Enumerating all canonically equivalent strings

From: BobH
Date: June 20, 2011 16:22
Subject: Enumerating all canonically equivalent strings
Message ID: 20110620215113.16343.qmail@lists-nntp.develooper.com

Is there a standard module or function that, given a Combining
Character Sequence (or, more generally, an arbitrary Unicode text
string), will generate a list of all canonically equivalent strings?

For example, given the character U+1EAD (LATIN SMALL LETTER A WITH
CIRCUMFLEX AND DOT BELOW), I'd like to get back a list of all of these
canonically equivalent sequences:

0061 0302 0323
0061 0323 0302
00E2 0323
1EA1 0302
1EAD

(I don't particularly care whether the interface is in terms of arrays
of Unicode scalar values (USVs) or of UTF-encoded strings.)
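
To make the request concrete, here is a minimal sketch of the kind of
enumeration I mean. It is written in Python using only the standard
unicodedata module, purely for illustration: it is not the
Unicode::MakeEquivalents code and not a Perl solution. The function names
(build_reverse_map, canonical_equivalents, as_hex) are made up for this
example. The idea is to start from the NFD form and repeatedly (a) swap
adjacent combining marks that have different non-zero combining classes
and (b) replace a character or adjacent pair with any character whose
canonical decomposition is exactly that sequence, keeping only candidates
whose NFD form matches the starting one. As far as I know, Hangul
syllables are not covered by this table lookup because they decompose
algorithmically, and the search is exhaustive, so it is only practical
for short strings.

```python
import sys
import unicodedata
from collections import defaultdict


def build_reverse_map():
    """Map each canonical decomposition (tuple of chars) to the chars having it."""
    rev = defaultdict(list)
    for cp in range(sys.maxunicode + 1):
        decomp = unicodedata.decomposition(chr(cp))
        # Entries starting with "<" are compatibility mappings, not canonical ones.
        if decomp and not decomp.startswith("<"):
            key = tuple(chr(int(h, 16)) for h in decomp.split())
            rev[key].append(chr(cp))
    return rev


_REVERSE = build_reverse_map()


def canonical_equivalents(text):
    """Return a sorted list of strings canonically equivalent to text (a sketch)."""
    start = unicodedata.normalize("NFD", text)
    seen = {start}
    todo = [start]
    while todo:
        s = todo.pop()
        candidates = []
        for i in range(len(s)):
            # (a) Swap adjacent marks with different non-zero combining classes.
            if i + 1 < len(s):
                a = unicodedata.combining(s[i])
                b = unicodedata.combining(s[i + 1])
                if a and b and a != b:
                    candidates.append(s[:i] + s[i + 1] + s[i] + s[i + 2:])
            # (b) Replace a 1- or 2-character run with a char that decomposes to it.
            for n in (1, 2):
                if i + n <= len(s):
                    for c in _REVERSE.get(tuple(s[i:i + n]), ()):
                        candidates.append(s[:i] + c + s[i + n:])
        for cand in candidates:
            # Safety check: keep only strings whose NFD form matches the start.
            if cand not in seen and unicodedata.normalize("NFD", cand) == start:
                seen.add(cand)
                todo.append(cand)
    return sorted(seen)


def as_hex(s):
    """Format a string as space-separated four-digit (or longer) hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in s)


if __name__ == "__main__":
    for s in canonical_equivalents("\u1ead"):
        print(as_hex(s))
```

Run on U+1EAD, this should print the five sequences listed above, one per
line.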

Some years ago I wrote such a module for my own use (I called it
Unicode::MakeEquivalents). I am now wondering whether a standard
solution to this problem already exists (so I can abandon my own code),
or whether I should pursue adding this functionality to CPAN somewhere.

Suggestions?

Bob
