Re: Large string patch

From: Tim Bunce
Date: January 1, 2002 05:22
Subject: Re: Large string patch
Message ID: 20020101114731.J75939@dansat.data-plan.com

On Mon, Dec 31, 2001 at 06:53:29AM -1000, David & Lisa Jacobs wrote:
> From: "Dan Sugalski" <dan@sidhe.org>
> > >Agreed.  I'll probably have the encoding structure provide the
> > >terminating bytes.  As a side note don't we also have to split UTF-16
> > >into UTF-16BE and UTF-16LE (big endian and little endian)?
> >
> > I think UTF-16 can be a single encoding. The little/big endian issue can
> > be dealt with by an I/O filter.
> 
> Will an IO filter have an opportunity to inject itself when we mmap a file?
> It was because you said you wanted this capability that I thought we were
> maintaining the serialized forms of unicode encodings.  Otherwise, I would
> be highly tempted to convert the internal representation of all unicode
> strings into an array of 4-byte ints (allows for much faster processing).

The claim of much faster processing is an assumption that may not always,
or even often, be true, especially given the impact of 4-byte characters on
CPU data caches.

Tim.
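
As a rough illustration of the cache concern raised above, the following
Python sketch compares how many bytes the same text occupies in UTF-8,
UTF-16 and a 4-byte-per-character form (UTF-32), and how many characters
would then fit in one cache line. It is not Parrot code and it measures
size only, not speed. The function names are hypothetical, and the 64-byte
cache line is an assumption that varies between CPUs.

```python
# Assumed cache line size in bytes; common on many CPUs but not universal.
CACHE_LINE_BYTES = 64


def encoded_sizes(text):
    """Return the number of bytes `text` occupies in each encoding."""
    # The "-le" codecs are used so that no byte-order mark is prepended.
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


def chars_per_cache_line(text, encoding):
    """Estimate how many characters of `text` fit in one cache line.

    Uses the average bytes per character of this particular text,
    so the result depends on the sample chosen.
    """
    if not text:
        raise ValueError("text must not be empty")
    return CACHE_LINE_BYTES * len(text) // len(text.encode(encoding))


if __name__ == "__main__":
    sample = "hello world"  # hypothetical, pure-ASCII sample
    print(encoded_sizes(sample))
    for enc in ("utf-8", "utf-16-le", "utf-32-le"):
        print(enc, chars_per_cache_line(sample, enc))
```

For pure-ASCII text the 4-byte form is four times the size of UTF-8, so
each cache line holds a quarter as many characters. Whether fixed-width
indexing outweighs that extra memory traffic depends on the workload and
would need to be measured.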
