develooper Front page | perl.perl5.porters | Postings from February 2007

Re: Future Perl development

Mark Overmeer
February 7, 2007 14:47
* Nicholas Clark [070207 16:24]:
> On Wed, Feb 07, 2007 at 01:09:23PM +0100, Mark Overmeer wrote:
>> * Nicholas Clark [070207 11:52]:
>>> The *best* solution might well be fixed 7/8/16/32, using the smallest
>>> that fits.

>> And for 7/8bit you would like to keep track of the character-set used
>> in the string, such that you can automatically convert to Unicode when
>> needed.
> It's simpler to always convert to Unicode on the way in, and to $whatever on
> the way out. After all, (as I understand it) one of the features of Unicode
> is that it is a superset of all existing encodings. Hence why some of its
> choices for what gets distinct code points can seem rather cranky.

But it is not always nice to lose the character set used.  For instance,
when you write an e-mail daemon which automatically adds a disclaimer
under a message, you would like to preserve the character set as
defined in the header of the message.

My suggestion was a direct response to your fixed 7/8/16/32 idea.  Don't
forget that ASCII/8-bit processing is much cheaper than handling UTF-8.
It's a waste to translate everything to UTF-8 and back again, all the time.
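
A minimal sketch of the disclaimer case, written in Python for brevity.  It
assumes the daemon has already extracted the body bytes and the charset
name from the message header; the function name and the UTF-8 fallback
policy are illustrative assumptions, not an existing API.  The point is
that the body bytes are left untouched whenever the disclaimer fits in
the declared charset:

```python
from typing import Tuple


def append_disclaimer(body: bytes, charset: str, disclaimer: str) -> Tuple[bytes, str]:
    """Append `disclaimer` to `body`, keeping the declared charset if possible.

    Returns the new body bytes and the charset that now describes them.
    """
    try:
        # Cheap path: only the disclaimer is encoded; the original body
        # is never decoded or re-encoded.
        return body + disclaimer.encode(charset), charset
    except UnicodeEncodeError:
        # The disclaimer has characters the declared charset cannot
        # represent, so (as an assumed policy) transcode everything to UTF-8.
        text = body.decode(charset) + disclaimer
        return text.encode("utf-8"), "utf-8"


body = "Grüße aus Arnhem\n".encode("iso-8859-1")
new_body, new_charset = append_disclaimer(body, "iso-8859-1", "-- \nDisclaimer text\n")
```

Here new_charset stays "iso-8859-1" and new_body starts with the original
bytes; only a disclaimer such as "5 €" would force the UTF-8 fallback.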

>> And filenames defined inside your program to the charset used on
>> a particular file-system.  And... implicit conversions where we require
> IIRC Jarkko has again looked at that recently, and most operating systems
> have no sane API to find out what is being used on a particular mounted
> filing system.

I know.  On Unix systems, you really have to combine information from both
mount and /etc/mtab to find the deviations from the defaults, and those
defaults must be known beforehand.  That can be quite expensive.  Only the
filename length is usually available via a syscall.  I am working on a
platform abstraction layer which goes far beyond Path::Class and which
takes this knowledge into account when available.
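
To illustrate the asymmetry (again in Python, as a rough sketch): the
maximum filename length is one pathconf call away, while a charset hint
has to be dug out of the mount options.  The helper names are
hypothetical, and treating the Linux "iocharset" mount option (as used
by vfat) as the filename charset is an assumption that will not hold on
every file system or platform:

```python
import os
from typing import Dict, Optional


def name_max(path: str) -> int:
    """Maximum filename length on the file system holding `path` (Unix only)."""
    return os.pathconf(path, "PC_NAME_MAX")


def charset_hints(mtab_text: str) -> Dict[str, Optional[str]]:
    """Map each mount point in mtab-style text to its iocharset option, if any.

    Expected fields per line: device, mount point, fs type, options, dump, pass.
    Escaped characters in mount points (such as \\040 for a space) are not decoded.
    """
    hints: Dict[str, Optional[str]] = {}
    for line in mtab_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        hints[mount_point] = None  # None: fall back to a default known beforehand
        for opt in options:
            if opt.startswith("iocharset="):
                hints[mount_point] = opt.split("=", 1)[1]
    return hints


sample = (
    "/dev/sda1 / ext3 rw,errors=remount-ro 0 0\n"
    "/dev/sdb1 /mnt/usb vfat rw,iocharset=iso8859-1 0 0\n"
)
hints = charset_hints(sample)
```

For the sample text, only /mnt/usb yields a hint; for "/" the caller is
still left with a default which has to be known in advance.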

       Mark Overmeer MSc                                MARKOV Solutions