Aren't we overcomplicating this, or have I misunderstood something?

From http://www.unicode.org/reports/tr44/#Matching_Rules

    Character Names

    Unicode character names constitute a special case. Formally, they
    are values of the Name property. While each Unicode character name
    for an assigned character is guaranteed to be unique, names are
    assigned in such a way that the presence or absence of spaces
    cannot be used to distinguish them. Furthermore, implementations
    sometimes create identifiers from Unicode character names by
    inserting underscores for spaces. For best results in comparing
    Unicode character names, use loose matching rule UAX44-LM2.

    *UAX44-LM2.* Ignore case, whitespace, underscore ('_'), and all
    medial hyphens except the hyphen in U+1180 HANGUL JUNGSEONG O-E.

      * "zero-width space" is equivalent to "ZERO WIDTH SPACE" or
        "zerowidthspace"
      * "character -a" is /not/ equivalent to "character a"

So the code in mktables needs to create names that have had the spaces, underscores, and medial hyphens removed (except as noted), with the result then uppercased. When processing \N{ whatever }, all we have to do is follow the above rules to generate a normalized name.

I don't know where in the perl C code \N{} is processed, but I hope it's not too difficult to handle this there; certainly it could be written in Perl very easily.

John
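
As a rough illustration of the normalization described above, here is a minimal Perl sketch. The function name loose_name is hypothetical, not anything in mktables or the perl core. It also assumes that a "medial" hyphen is one with a letter or digit immediately on both sides, which is one reading of the quoted rule, and it only recognizes the U+1180 exception when the input contains no other hyphens.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a character name to a loose-matching key
# in the spirit of UAX44-LM2. This is a sketch, not the mktables code.
sub loose_name {
    my ($name) = @_;
    my $n = uc $name;                       # ignore case

    # Special case: U+1180 HANGUL JUNGSEONG O-E keeps its hyphen so that
    # it stays distinct from U+116C HANGUL JUNGSEONG OE.
    (my $squeezed = $n) =~ s/[\s_]+//g;
    return $squeezed if $squeezed eq 'HANGULJUNGSEONGO-E';

    # Assumption: a medial hyphen has a letter or digit on both sides.
    # Remove those first, while spaces and underscores still mark
    # word boundaries, so that "CHARACTER -A" keeps its hyphen.
    $n =~ s/(?<=[A-Z0-9])-(?=[A-Z0-9])//g;

    # Then drop whitespace and underscores.
    $n =~ s/[\s_]+//g;
    return $n;
}

# The examples from the quoted rule:
for my $pair (
    [ 'zero-width space', 'ZERO WIDTH SPACE' ],
    [ 'zero-width space', 'zerowidthspace' ],
    [ 'character -a',     'character a' ],
) {
    my ($x, $y) = @$pair;
    printf "%-18s vs %-18s : %s\n", $x, $y,
        loose_name($x) eq loose_name($y) ? 'same' : 'different';
}
```

The same function would be applied both to the names stored by mktables and to whatever appears inside \N{...}, so that the lookup becomes a plain string comparison on the normalized keys.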