develooper Front page | perl.perl5.porters | Postings from January 2009

Re: Even faster Unicode character counting

Thread Previous | Thread Next
David Nicol
January 28, 2009 12:33
Re: Even faster Unicode character counting
Message ID:
With this in place, the same tests as an unpatched blead pass, which
is, all of them except the one that passes 38 tests even though it
thinks its supposed to only do 37 tests.

Acceleration is supposed to be provided by skipping bytes less than
128 eight at a time, by examining them as a U64.  When there is a high
bit, three tests are performed to determine which of the eight bytes
has the high bit set and normal counting proceeds from that point,
instead of starting over at s=alignment-point regardless of which byte
had the high bit.

The benefit is:
   (1) faster passing over runs of bytes all less than or equal to
127, which all have lengths of 1 in the lookup table.

The costs are
   (1) one extra comparison each "walking" go-around to check if we
are at a 64-bit alignment
   (2) extra examinations in the "running" loop will consume extra
ticks when the string is mostly characters greater than 127
   (3) unknown performance of bitmasking a U64 on a 32-bit architecture

cost #1 could be mitigated by unrolling the "walking" test into an
8-wide Duff's device, but hopefully I can restrain myself.

Thread Previous | Thread Next Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About