
Re: RFC: space vs. time vs. functionality in \N{name} loose matching

From: John Imrie
Date: August 1, 2010 03:14
Subject: Re: RFC: space vs. time vs. functionality in \N{name} loose matching
Message ID: 4C55489A.2030607@virginmedia.com

On 07/30/10 21:17, karl williamson wrote:
> The problem is that we have to look-up in both directions.  viacode()
> takes a code point number and returns the official Unicode name.  We
> want that official name to have the correct spaces and hyphens.  We
> don't want it to be "ZEROWIDTHSPACE", for example.  The only
> reasonable way to do this is to have the official name stored
> correctly.  That means we have to have a table with all the correct
> official names. There's no getting around that.
>
How about storing the loose-matching name against the code point instead
of the official name? This gives loose matching via \N{}. viacode() then
becomes a two-step process: first find the loose name, then use that as
the key to look up the official name.
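
A minimal sketch of that two-step lookup, to make the idea concrete. The
sub names, the sample data, and the simplified folding rule (uppercase,
then drop spaces, underscores and hyphens) are all assumptions made for
illustration; none of this is the actual charnames implementation.

```perl
use strict;
use warnings;

# Simplified folding (an assumption): uppercase, then drop spaces,
# underscores and hyphens. The real loose-matching rule has exceptions.
sub loose_name {
    my ($name) = @_;
    my $loose = uc $name;
    $loose =~ tr/ _\-//d;
    return $loose;
}

# Hypothetical sample data; a real table would be generated from
# the Unicode character database.
my @sample = (
    [ 0x200B, 'ZERO WIDTH SPACE' ],
    [ 0x00E9, 'LATIN SMALL LETTER E WITH ACUTE' ],
);

my ( %code_by_loose, %loose_by_code, %official_by_loose );
for my $entry (@sample) {
    my ( $code, $official ) = @$entry;
    my $loose = loose_name($official);
    $code_by_loose{$loose}     = $code;       # used by \N{} style lookups
    $loose_by_code{$code}      = $loose;      # step one of viacode
    $official_by_loose{$loose} = $official;   # step two of viacode
}

# Name to code point: fold the input and look it up directly.
sub loose_vianame {
    my ($name) = @_;
    return $code_by_loose{ loose_name($name) };
}

# Code point to official name: loose name first, official name second.
sub two_step_viacode {
    my ($code) = @_;
    my $loose = $loose_by_code{$code};
    return defined $loose ? $official_by_loose{$loose} : undef;
}

printf "%04X\n", loose_vianame('zero-width space');
print two_step_viacode(0x200B), "\n";
```

The name-to-code-point direction needs a single hash lookup; only the
rarer reverse direction pays for the second one.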

As has been said elsewhere, viacode() is not used that often, and the
code-to-official-name mapping could be cached once used. As a number of
the names can be programmatically reconstituted from the loose name,
e.g. CJK COMPATIBILITY IDEOGRAPH-2F801, calls for these could be
intercepted and memory saved by not having them in the table.
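
Roughly what I have in mind for the caching and the interception, as a
sketch only. The sub names and the stand-in table are hypothetical, and
the range list holds a single sample block (U+2F800..U+2FA1D) rather than
every range whose names can be derived from the code point.

```perl
use strict;
use warnings;

# Ranges whose names are "PREFIX-HEX". One sample range only.
my @algorithmic = (
    [ 0x2F800, 0x2FA1D, 'CJK COMPATIBILITY IDEOGRAPH' ],
);

# Stand-in for the expensive table lookup (hypothetical data).
my %official_by_code = ( 0x200B => 'ZERO WIDTH SPACE' );
my $table_lookups = 0;

sub slow_table_lookup {
    my ($code) = @_;
    $table_lookups++;
    return $official_by_code{$code};
}

my %name_cache;

# Code point to official name: intercept derivable names, cache the rest.
sub cached_viacode {
    my ($code) = @_;
    for my $range (@algorithmic) {
        my ( $first, $last, $prefix ) = @$range;
        return sprintf( '%s-%04X', $prefix, $code )
            if $code >= $first && $code <= $last;
    }
    $name_cache{$code} = slow_table_lookup($code)
        unless exists $name_cache{$code};
    return $name_cache{$code};
}

# Loose name to code point for derivable names, with no table entry.
sub algorithmic_vianame {
    my ($loose) = @_;
    for my $range (@algorithmic) {
        my ( $first, $last, $prefix ) = @$range;
        ( my $loose_prefix = $prefix ) =~ tr/ //d;
        next unless $loose =~ /\A\Q$loose_prefix\E([0-9A-F]{4,6})\z/;
        my $code = hex $1;
        return $code if $code >= $first && $code <= $last;
    }
    return;
}

print cached_viacode(0x2F801), "\n";
print cached_viacode(0x200B),  "\n";
```

Names in the sample range never touch the table or the cache, so they
cost no memory beyond the one range entry.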

John

