Re: on parrot strings
From:
Nicholas Clark
Date:
January 18, 2002 15:48
Subject:
Re: on parrot strings
Message ID:
20020118234016.GE540@Bagpuss.unfortu.net
On Fri, Jan 18, 2002 at 05:24:00PM +0200, Jarkko Hietaniemi wrote:
> > As for character encodings, we're forcing everything to UTF-32 in
> > regular expressions. No exceptions. If you use a string in a regex,
> > it'll be transcoded. I honestly can't think of a better way to
> > guarantee efficient string indexing.
>
> I'm fine with that. The bloat is of course a shame, but as long as
> that's not a real problem for someone, let's not worry about it too
> much.
Forcing everything to UTF-32 in the API?
Or just forcing everything to UTF-32 until perl 6.0 is released, as trying
to do UTF-8 (and UTF-16 ...) regexps now is premature optimisation?
To me it seems that making UTF-32 do everything correctly, so that the real
world can use it while encoding-optimised versions are written, is better than
having a snazzy 4-encoding autoswitcher that is wrong and therefore not
releasable to the world.
But I don't know how the internals of all these things work, so I
may well be wrong on any technical detail.
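
For illustration, here is a minimal Python sketch of the indexing trade-off
quoted above. It is not Parrot's code; the function names utf8_char_at and
utf32_char_at are hypothetical, and the input is assumed to be well-formed.
Fetching the Nth character from UTF-8 means walking every character before
it, while in UTF-32 it is a fixed offset.

```python
def _utf8_len(data: bytes, pos: int) -> int:
    """Length in bytes of the UTF-8 sequence starting at pos (assumes valid input)."""
    if pos >= len(data):
        raise IndexError("character index out of range")
    lead = data[pos]
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("invalid UTF-8 lead byte")


def utf8_char_at(data: bytes, index: int) -> str:
    """Return character number `index` by scanning from the start: O(index)."""
    if index < 0:
        raise IndexError("character index out of range")
    pos = 0
    for _ in range(index):
        # Each earlier character must be stepped over one at a time.
        pos += _utf8_len(data, pos)
    length = _utf8_len(data, pos)
    return data[pos:pos + length].decode("utf-8")


def utf32_char_at(data: bytes, index: int) -> str:
    """Return character number `index` from little-endian UTF-32: O(1)."""
    start = index * 4
    if index < 0 or start + 4 > len(data):
        raise IndexError("character index out of range")
    # Every character occupies exactly four bytes, so the offset is direct.
    return chr(int.from_bytes(data[start:start + 4], "little"))


text = "a\u00e9\u20ac\U0001F600z"        # 1-, 2-, 3-, 4- and 1-byte characters in UTF-8
utf8_bytes = text.encode("utf-8")        # 11 bytes
utf32_bytes = text.encode("utf-32-le")   # 20 bytes, no byte-order mark
```

The same five characters take 11 bytes in UTF-8 and 20 in UTF-32, which is
the bloat mentioned above; what it buys is the constant-time lookup.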
Nicholas Clark
--
ENOCHOCOLATE http://www.ccl4.org/~nick/CV.html