
Re: RFC: space vs. time vs. functionality in \N{name} loose matching

From: karl williamson
Date: August 1, 2010 09:04
Subject: Re: RFC: space vs. time vs. functionality in \N{name} loose matching
Message ID: 4C559B14.3030804@khwilliamson.com

John Imrie wrote:
> On 07/30/10 21:17, karl williamson wrote:
>> The problem is that we have to look up in both directions.  viacode()
>> takes a code point number and returns the official Unicode name.  We
>> want that official name to have the correct spaces and hyphens.  We
>> don't want it to be "ZEROWIDTHSPACE", for example.  The only
>> reasonable way to do this is to have the official name stored
>> correctly.  That means we have to have a table with all the correct
>> official names. There's no getting around that.
>>
> How about storing the loose matching name against the code point instead
> of the official name? This gives loose matching via \N{}. Then make
> viacode() a two-step process: first find the loose name, and then use
> that as a key to look up the official name.
> 

I don't think I follow this.
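
A minimal sketch of the two lookup directions under discussion, written in Python rather than Perl and not taken from the actual charnames code. The three-entry table, the `loose_key()` helper and the simplified rule it applies (drop spaces, hyphens and underscores, then uppercase) are assumptions made for illustration; the real loose-matching rules have more special cases. The function names merely echo viacode() and its name-to-code counterpart.

```python
# Hypothetical three-entry table; the real tables hold far more names.
OFFICIAL_NAMES = {
    0x0041: "LATIN CAPITAL LETTER A",
    0x200B: "ZERO WIDTH SPACE",
    0x2010: "HYPHEN",
}


def loose_key(name):
    """Simplified loose key: drop spaces, hyphens, underscores; uppercase."""
    return "".join(ch for ch in name if ch not in " -_").upper()


# Name -> code point direction, keyed by the loose form of each name.
LOOSE_TO_CODE = {loose_key(name): cp for cp, name in OFFICIAL_NAMES.items()}


def vianame(name):
    """Loose lookup: any spelling with the same loose key finds the code point."""
    return LOOSE_TO_CODE.get(loose_key(name))


def viacode(cp):
    """Code point -> official name, with its spaces and hyphens intact."""
    return OFFICIAL_NAMES.get(cp)


print(hex(vianame("zero-width space")))
print(viacode(0x200B))
```

In this sketch the loose key "ZEROWIDTHSPACE" cannot be turned back into "ZERO WIDTH SPACE" without consulting the stored official name, so the official names have to be kept somewhere whichever key the other direction uses.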

> As has been said elsewhere, viacode() is not used that often, and the
> code-to-official-name mapping could be cached once used. As a number of
> the names can be programmatically reconstituted from the loose name, e.g.
> CJK COMPATIBILITY IDEOGRAPH-2F801, calls for these could be intercepted
> and memory saved by not having them in the table.
> 

Maybe you're saying we would have two tables but save space by leaving
the programmatically determinable names out of them.  But I've already
removed all the programmatically determinable names from the tables.
The statistics I gave are for these pared-down tables.  They're still huge.
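
To illustrate what "programmatically determinable" means here, a minimal Python sketch (not the Perl implementation) that intercepts such names before consulting any stored table. The two ranges and the one-entry stored table are assumptions chosen for illustration; the exact range boundaries depend on the Unicode version and should be checked against it.

```python
import re

# Illustrative ranges only; boundaries vary with the Unicode version.
ALGORITHMIC_RANGES = [
    (0x3400, 0x4DB5, "CJK UNIFIED IDEOGRAPH-"),
    (0x2F800, 0x2FA1D, "CJK COMPATIBILITY IDEOGRAPH-"),
]

# Hypothetical stand-in for the pared-down table of stored names.
STORED_NAMES = {0x200B: "ZERO WIDTH SPACE"}
STORED_CODES = {name: cp for cp, name in STORED_NAMES.items()}


def viacode(cp):
    """Build the name from the code point when it lies in an algorithmic range."""
    for first, last, prefix in ALGORITHMIC_RANGES:
        if first <= cp <= last:
            return f"{prefix}{cp:04X}"
    return STORED_NAMES.get(cp)


def vianame(name):
    """Parse the hex suffix of an algorithmic name; otherwise use the table."""
    for first, last, prefix in ALGORITHMIC_RANGES:
        if name.startswith(prefix):
            suffix = name[len(prefix):]
            if not re.fullmatch(r"[0-9A-F]{4,6}", suffix):
                return None
            cp = int(suffix, 16)
            return cp if first <= cp <= last else None
    return STORED_CODES.get(name)


print(viacode(0x2F801))
print(hex(vianame("CJK COMPATIBILITY IDEOGRAPH-2F801")))
```

Names handled this way need no table entries in either direction, which is the saving already reflected in the pared-down tables.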


> John
> 

