perl.perl6.users | Postings from September 2020

Re: "ICU - International Components for Unicode"

From: Samantha McVey
Date: September 27, 2020 11:08
Subject: Re: "ICU - International Components for Unicode"
Message ID: 14828831.auFDDPTKBn@formerly-linux-5702

MoarVM uses its own database built from the UCD (Unicode Character
Database). One nice thing is that this is probably faster than calling into
ICU to look up information for each codepoint in a long string. Secondly,
MoarVM implements its own text data structures, so the nice features ICU
offers for working with text would be difficult to use.

In my opinion, it could make sense to use ICU for things like localized
collation (sorting). It could also make sense to use ICU to look up Unicode
properties that have nothing to do with grapheme segmentation or casing.
This would be a lot of work, and if something like it were implemented it
would probably happen as part of a larger rethinking of how we use Unicode.
Everything is complicated, though, by the fact that we support many complex
regular expressions over different Unicode properties. I guess I'd start by
benchmarking the speed of ICU and comparing it to the current
implementation.
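
As a rough sketch of the kind of benchmark I mean (in Python rather than
against MoarVM itself, so treat it only as an illustration): time a
per-codepoint property lookup over a long string and report nanoseconds per
codepoint. Here Python's bundled unicodedata tables stand in for an
in-process UCD database; the function names and the sample string are made
up for the example.

```python
import timeit
import unicodedata


def lookup_builtin(text):
    # Per-codepoint General_Category lookup against Python's bundled UCD
    # tables. This is only a stand-in for an in-process database such as
    # MoarVM's; it is not MoarVM code.
    return [unicodedata.category(ch) for ch in text]


def ns_per_codepoint(lookup, text, repeat=5, number=20):
    # Best-of-`repeat` timing of `number` full passes over `text`,
    # normalised to nanoseconds per codepoint (len() counts codepoints).
    best = min(timeit.repeat(lambda: lookup(text), repeat=repeat, number=number))
    return best / number / len(text) * 1e9


# Hypothetical sample: ASCII, Latin-1, CJK and an astral-plane codepoint.
sample = "Raku \u00e9\u4e2d\U0001F600 " * 1000
print(f"builtin tables: {ns_per_codepoint(lookup_builtin, sample):.1f} ns per codepoint")
```

An ICU-backed lookup (for example through a binding such as PyICU) could be
passed to ns_per_codepoint in the same way, which would give a like-for-like
number to compare against. None is shown here.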
