Front page | perl.perl6.internals |
Postings from January 2002
Re: on parrot strings
Thread Previous
|
Thread Next
From:
Jarkko Hietaniemi
Date:
January 18, 2002 14:11
Subject:
Re: on parrot strings
Message ID:
20020119001106.T25478@alpha.hut.fi
On Fri, Jan 18, 2002 at 01:40:26PM -0800, Steve Fink wrote:
> On Fri, Jan 18, 2002 at 10:08:40PM +0200, Jarkko Hietaniemi wrote:
> > ints, or 176 bytes. Searching for membership in an inversion list is
> > O(N log N) (binary search). "Encoding the whole range" is a non-issue
> > bordering on a joke: two ints, or 8 bytes.
>
> [Clarification from a noncombatant] You meant O(log N).
>
> I like the inversion list idea. But its speed is proportional to the
> toothiness of the character class, and while I have good intuition for
> what that means in 7-bit US-ASCII, I have no idea how bad it gets for
> other languages. "Vowels"? "Capital letters"? Would anyone ever want
> to select all Chinese characters with a particular radical?
>
> That's just lookup. We should also consider other character class
> operations: union, subtraction, intersection. They're pretty
Complement of an inversion list is neat: insert 0 at the beginning
(and append max+1), unless there already is one, in which case delete
the 0 (and shift the list and delete the max+1). Again, O(N).
(One could of course have a bit for a 'negative character class',
but that would in turn complicate the computations.)
--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen
Thread Previous
|
Thread Next