perl.unicode | Postings from June 2011

Enumerating all canonically equivalent strings

June 20, 2011 16:22
Does there exist a standard module or function that, given a Combining 
Character Sequence (or, more generally, an arbitrary Unicode text 
string), will generate a list of all canonically equivalent strings?

For example, if given the character U+1EAD, I'd like to get back a list 
of all these canonically equivalent sequences:

0061 0302 0323
0061 0323 0302
00E2 0323
1EA1 0302

(I don't particularly care whether the interface is in terms of arrays 
of Unicode scalar values (USVs) or UTF-8 strings.)
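
To make the request concrete, here is a minimal sketch of one way such an enumeration could work. It is written in Python rather than Perl, only because Python's standard `unicodedata` module exposes the needed properties directly, and it is not taken from any existing module. The function name `canonical_equivalents` and the helper `_build_reverse_map` are hypothetical. The idea is to start from the NFD form and repeatedly apply two equivalence-preserving steps: swap adjacent combining marks that have different non-zero combining classes, and replace a character or adjacent pair with any character whose one-step canonical decomposition matches it. Assumptions: Hangul syllables (which decompose algorithmically) are not handled, and the search is exponential in the number of combining marks, so it suits only short sequences.

```python
import unicodedata
from collections import defaultdict


def _build_reverse_map():
    """Map each one-step canonical decomposition to the characters that have it."""
    reverse = defaultdict(list)
    for cp in range(0x110000):
        decomp = unicodedata.decomposition(chr(cp))
        # Compatibility mappings carry a tag such as "<compat>"; skip those.
        if decomp and not decomp.startswith("<"):
            key = "".join(chr(int(h, 16)) for h in decomp.split())
            reverse[key].append(chr(cp))
    return reverse


_REVERSE = _build_reverse_map()


def canonical_equivalents(text):
    """Return the set of strings found to be canonically equivalent to text."""
    start = unicodedata.normalize("NFD", text)
    seen = {start}
    todo = [start]
    while todo:
        s = todo.pop()
        neighbours = []
        for i in range(len(s)):
            # Singleton mappings, e.g. U+212B -> U+00C5.
            for c in _REVERSE.get(s[i], ()):
                neighbours.append(s[:i] + c + s[i + 1:])
            if i + 1 < len(s):
                # Pair mappings, e.g. U+0061 U+0302 -> U+00E2.
                for c in _REVERSE.get(s[i:i + 2], ()):
                    neighbours.append(s[:i] + c + s[i + 2:])
                # Adjacent marks with different non-zero combining
                # classes may be swapped without changing equivalence.
                a = unicodedata.combining(s[i])
                b = unicodedata.combining(s[i + 1])
                if a and b and a != b:
                    neighbours.append(s[:i] + s[i + 1] + s[i] + s[i + 2:])
        for n in neighbours:
            if n not in seen:
                seen.add(n)
                todo.append(n)
    return seen


# Print each equivalent of U+1EAD as space-separated hex code points.
for s in sorted(canonical_equivalents("\u1EAD")):
    print(" ".join(f"{ord(ch):04X}" for ch in s))
```

For U+1EAD this should produce the four sequences listed above, plus the precomposed character itself. Returning a set of strings is just one choice; the same search could as easily return lists of code points.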

Some years ago I created such a module for my own use (I called it 
Unicode::MakeEquivalents), and am now wondering whether there exists a 
standard solution to this problem (so I can abandon my own stuff), or 
whether I should pursue adding this functionality to CPAN somewhere.


Bob