perl.perl5.porters | Postings from February 2007

Re: Future Perl development

Mark Overmeer
February 7, 2007 14:59
* Marvin Humphrey [070207 22:25]:
> On Feb 7, 2007, at 10:37 AM, Mark Overmeer wrote:
> Space occupied by the charset labels isn't my concern.  The scenario  
> I'm worried about is where somebody has calibrated the memory  
> consumption of a string-manipulating application to fit within
> available RAM, or is reasonably close to threshold by happenstance.
> Say someone reads in a string that occupies 300MB when encoded as  
> UTF-8.  Say it's mostly ASCII, but has a few code points above the  
> BMP thrown in -- musical symbols like the sixteenth note (U+1D161),  
> or what have you.  Ka-boom, now that string occupies more than a gig.
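
The size jump in the quoted scenario can be checked with a little
arithmetic.  What follows is only a rough sketch in Python, not anything
from the Perl core: encoded_sizes is a hypothetical helper, and the
sample string is scaled down from 300MB to about 1kB.

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf32_bytes) for the given string."""
    utf8_bytes = len(text.encode("utf-8"))
    # "utf-32-le" is used so that no byte order mark is counted.
    utf32_bytes = len(text.encode("utf-32-le"))
    return utf8_bytes, utf32_bytes


# Mostly ASCII, plus one code point above the BMP (U+1D161).
sample = "a" * 1000 + "\U0001D161"
utf8_bytes, utf32_bytes = encoded_sizes(sample)
print(utf8_bytes, utf32_bytes)  # 1004 4004
```

ASCII costs one byte per character in UTF-8 and four in a fixed 32-bit
form, so mostly-ASCII text grows by a factor of nearly four: 300MB
becomes roughly 1.2GB.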

Well, whether to normalize into 32-bit or UTF-8 is still to be decided;
I gave it more as an example.  You may even decide to store strings
depending on efficiency: if you see that it grows over 100k you use the
slower but smaller UTF-8, otherwise full 32-bit.  If the string is over
2M, you use Huffman- or gzip-compressed 32-bit...  Whole new areas of
optimization are possible when you add an "encoding/charset" field to
each string.  But my main target is to hide explicit recodings.
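
The size-dependent storage idea above could be sketched roughly as
follows.  This is an illustration in Python under stated assumptions,
not a proposal for the Perl internals: store, load, the label strings
and the two limit constants are hypothetical names, the thresholds are
read as counts of code points, and zlib stands in for "huffman or gzip".

```python
import zlib

SMALL_LIMIT = 100_000    # assumed meaning of "100k": code points
LARGE_LIMIT = 2_000_000  # assumed meaning of "2M": code points


def store(text):
    """Pick a representation by length; return (label, payload)."""
    length = len(text)
    if length > LARGE_LIMIT:
        # Very large: compressed 32-bit (zlib as a stand-in).
        return "utf32+zlib", zlib.compress(text.encode("utf-32-le"))
    if length > SMALL_LIMIT:
        # Large: the slower but smaller variable-width form.
        return "utf8", text.encode("utf-8")
    # Small: full 32-bit, fixed width per code point.
    return "utf32", text.encode("utf-32-le")


def load(label, payload):
    """Undo store(); the caller never recodes explicitly."""
    if label == "utf32+zlib":
        return zlib.decompress(payload).decode("utf-32-le")
    if label == "utf8":
        return payload.decode("utf-8")
    return payload.decode("utf-32-le")
```

The label plays the role of the "encoding/charset" field: code that
calls load gets the same string back whichever form was chosen.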

> Defaulting to 32-bit storage forces the programmer to deal with worst- 
> case scenarios right away.

No, it will make all programs slow right away.  Requesting system
resources is expensive.  And how would you protect the same guy from
allocating a 400MB string where he tuned for max 300MB?

> What I was getting at, though, was that a sudden, dramatic increase  
> in worst-case-scenario RAM requirements shouldn't be considered  
> backwards compatible.

5.12 does not need to be backwards compatible to this extent.

       Mark Overmeer MSc                                MARKOV Solutions                         
