perl.perl5.porters | Postings from February 2007

Re: Future Perl development

Marvin Humphrey
February 7, 2007 07:21

On Feb 7, 2007, at 4:09 AM, Mark Overmeer wrote:

> And for 7/8bit you would like to keep track of the character-set used
> in the string, such that you can automatically convert to unicode when
> needed.  And filenames defined inside your program to the charset used
> on a particular file-system.  And... implicit conversions where we
> require explicit conversions now.

Wow, internalizing the Encode module.  What a beautiful thought.

> Hum... so each string needs an associated charset label (which also
> determines the number of bytes per character) and each string
> operation needs to be aware that operands may require conversion
> before use...  Maybe: if both encodings are different, then always
> convert both to U32.

I think you'd end up at worst case memory usage often enough that you
might as well default to 32 bits when reading in from filehandles, etc.,
but offer the option of compressing individual strings.
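
To make the scheme concrete, here is a rough sketch of a charset-labeled string that widens both operands to U32 when their labels differ, with per-string compression as an option. It is in Python rather than Perl for brevity, and `LabeledString`, `to_u32`, `compress` and `concat` are hypothetical names invented for illustration, not anything in perl or Encode. It also assumes BOM-less codecs, so that same-label byte concatenation is safe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledString:
    """Hypothetical string value: raw bytes plus a charset label."""
    data: bytes
    charset: str  # a codec name Python knows, e.g. "latin-1" or "utf-8"

    def to_u32(self) -> "LabeledString":
        # Decode with the labeled charset, then re-encode at 4 bytes per character.
        text = self.data.decode(self.charset)
        return LabeledString(text.encode("utf-32-le"), "utf-32-le")

    def compress(self, charset: str) -> "LabeledString":
        # Optional per-string compression; raises UnicodeEncodeError
        # if the target charset cannot represent the text.
        text = self.data.decode(self.charset)
        return LabeledString(text.encode(charset), charset)


def concat(a: LabeledString, b: LabeledString) -> LabeledString:
    # Same label: plain byte concatenation, no conversion needed.
    if a.charset == b.charset:
        return LabeledString(a.data + b.data, a.charset)
    # Different labels: always convert both operands to U32 first.
    return LabeledString(a.to_u32().data + b.to_u32().data, "utf-32-le")
```

Every string operation would need the same label check that `concat` does here, which is where most of the work would be.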

> Sounds like a lot of work, but rather straight forward for most of
> the way.

It's fun to think about, though I don't think any use at all of 32-bit
string chars would be realistic without a major version increment or a
fork.  While mind-bogglingly wasteful memory habits and implicit
conversion are both in the Perl spirit, that magnitude of spike in
memory usage would render Perl unsuitable for some percentage of the
applications it's currently used for.
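
For a rough sense of the size of that spike, the sketch below counts bytes for the same text under a few encodings. It is Python rather than Perl, and `encoded_sizes` is a hypothetical helper written for this illustration; it measures encoded payload only and says nothing about perl's per-scalar overhead.

```python
def encoded_sizes(text):
    """Return the byte count of text under a few encodings."""
    sizes = {}
    for codec in ("latin-1", "utf-8", "utf-32-le"):
        try:
            sizes[codec] = len(text.encode(codec))
        except UnicodeEncodeError:
            sizes[codec] = None  # this codec cannot represent the text
    return sizes


# Pure ASCII data is the worst case for 32-bit chars:
# 1 byte per character becomes 4.
ascii_sizes = encoded_sizes("x" * 1000)
```

For ASCII-heavy data, which is a lot of what Perl processes, the payload quadruples.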

But the savings in opportunity cost would be *vast*.  Having one  
string type -- and having it be fixed-length to boot -- nukes the  
Gordian knot.
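
One piece of what fixed-length buys is character indexing by plain arithmetic. The sketch below contrasts that with a variable-width scan; it is Python rather than Perl, both function names are hypothetical, and it is not how perl's internals actually locate characters.

```python
def char_at_u32(buf: bytes, i: int) -> str:
    # Fixed width: the i-th character always starts at byte 4 * i.
    return buf[4 * i : 4 * i + 4].decode("utf-32-le")


def char_at_utf8(buf: bytes, i: int) -> str:
    # Variable width: scan the buffer for character start bytes,
    # i.e. every byte that is not a continuation byte (10xxxxxx).
    starts = [pos for pos, byte in enumerate(buf) if byte & 0xC0 != 0x80]
    end = starts[i + 1] if i + 1 < len(starts) else len(buf)
    return buf[starts[i] : end].decode("utf-8")
```

The first is a constant-time slice; the second has to look at the bytes before it can find the character at all.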

So please, disagree with me.  :)

Marvin Humphrey
Rectangular Research