
Re: Another optimization question: bsearch()

From: David Nicol
Date: March 14, 2010 17:10
Subject: Re: Another optimization question: bsearch()
Message ID: 934f64a21003141709m6b37b597k286630adbc57241f@mail.gmail.com

>> I particularly liked the concept of doing Unicode classes as DFAs on the
>> octets of the UTF-8 representations of the code points. That might be a
>> win for us.
>
> But notice that he only implements two Unicode properties, gc and sc. This
> is likely much faster than our mechanism, but I think it takes significantly
> more space.

Handwaving the effort involved to produce something usable, I wonder
if a DFA on nybbles, bit pairs, or even bits instead of octets would
be useful. The common cases would compress together, and the
transition tables would be much smaller: 16, 4, or 2 entries per
state instead of 256. Using flexible offsets, they might even be
sharable.

The trees would be bigger but have fewer branches at each stage.
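
To make the trade-off concrete, here is a rough Python sketch, not
anything from the perl source. It builds an acyclic DFA for a set of
code points over the UTF-8 bytes split into symbols of 1, 2, 4, or 8
bits, and merges identical states so that common tails are shared. The
names build_dfa, matches, and dense_cells and the sample digit class
are made up for illustration, and "cells" assumes a dense table with
one entry per possible symbol in every state.

```python
def symbols(data, bits):
    """Split each byte into 8 // bits symbols, most significant first."""
    mask = (1 << bits) - 1
    for byte in data:
        for shift in range(8 - bits, -1, -bits):
            yield (byte >> shift) & mask


def build_dfa(code_points, bits):
    """Return (start, accept, table) for a non-empty set of code points.

    table maps a state id to {symbol: next state id}. Surrogates are
    not supported because Python refuses to encode them as UTF-8.
    """
    assert bits in (1, 2, 4, 8)
    # Build a trie first. UTF-8 is prefix-free, so the accepting
    # states are exactly the leaves (empty dicts).
    root = {}
    for cp in code_points:
        node = root
        for sym in symbols(chr(cp).encode("utf-8"), bits):
            node = node.setdefault(sym, {})
    # Merge identical subtrees bottom-up. This is where shared tails
    # ("the common cases") collapse into a single state.
    ids = {}

    def intern(node):
        sig = tuple(sorted((s, intern(child)) for s, child in node.items()))
        return ids.setdefault(sig, len(ids))

    start = intern(root)
    table = {sid: dict(sig) for sig, sid in ids.items()}
    return start, ids[()], table


def matches(dfa, cp, bits):
    """Run the DFA over the UTF-8 encoding of one code point."""
    start, accept, table = dfa
    state = start
    for sym in symbols(chr(cp).encode("utf-8"), bits):
        state = table[state].get(sym)
        if state is None:
            return False
    return state == accept


def dense_cells(dfa, bits):
    """Size of a dense transition table: states times symbols."""
    return len(dfa[2]) * (1 << bits)


# Hypothetical sample class: ASCII, Arabic-Indic, and Devanagari digits.
digits = (list(range(0x30, 0x3A)) + list(range(0x660, 0x66A))
          + list(range(0x966, 0x970)))
for bits in (8, 4, 2, 1):
    dfa = build_dfa(digits, bits)
    print(bits, "bits:", len(dfa[2]), "states,",
          dense_cells(dfa, bits), "dense cells")
```

For this small class the nybble machine ends up with more states than
the octet one but a smaller dense table, which is the shape of the
trade-off described above. Whether that holds for the full Unicode
property tables would have to be measured.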
