* Nicholas Clark (nick@ccl4.org) [070207 16:24]:
> On Wed, Feb 07, 2007 at 01:09:23PM +0100, Mark Overmeer wrote:
>> * Nicholas Clark (nick@ccl4.org) [070207 11:52]:
>>> The *best* solution might well be fixed 7/8/16/32, using the smallest
>>> that fits.
>> And for 7/8bit you would like to keep track of the character-set used
>> in the string, such that you can automatically convert to unicode when
>> need.
>
> It's simpler to always convert to Unicode on the way in, and to $whatever on
> the way out. After all, (as I understand it) one of the features of Unicode
> is that it is a superset of all existing encodings. Hence why some of its
> choices for what gets distinct code points can seem rather cranky.

But it is not always nice to lose the character-set used.  For instance,
when you write an e-mail daemon which automatically adds a disclaimer
under a message, you would like to automatically preserve the
character-set as defined in the header of the message.

My suggestion was a direct response to your fixed 7/8/16/32 idea.  Don't
forget that ASCII/8bit processing is much cheaper than handling UTF8.
It's a waste to translate everything to UTF8 and back again, all the time.

> > And filenames defined inside your program to the charset used on
> > a particular file-system.  And... implicit conversions where we require
>
> IIRC Jarkko has again looked at that recently, and most operating systems
> have no sane API to find out what is being used on a particular mounted
> filing system.

I know.  For unixes, you really have to combine info from both mount
and /etc/mtab to get the deviation from the defaults... which must be
known beforehand.  That can be quite expensive.  Only the filename-length
is usually available via a syscall.

I am working on a platform abstraction layer which goes far beyond
Path::Class, and which does take this knowledge into account when
available.
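
To make the disclaimer example a bit more concrete, here is a rough sketch
of what I mean.  It is in Python rather than Perl, purely for illustration,
and it assumes a single-part text/plain message.  The function name
add_disclaimer is made up for this example.  The point is only that the
charset from the header is read once and reused on the way out, instead of
everything being forced to UTF8.

```python
from email import message_from_bytes, policy


def add_disclaimer(raw: bytes, disclaimer: str) -> bytes:
    """Append a disclaimer to a single-part text message, keeping its charset.

    Hypothetical helper for illustration; assumes a text/plain body.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    if msg.is_multipart():
        raise ValueError("this sketch only handles single-part messages")

    # Charset as declared in the Content-Type header; fall back to ASCII.
    charset = msg.get_content_charset() or "us-ascii"

    # get_content() decodes the body using the declared charset.
    body = msg.get_content()
    new_body = body.rstrip("\n") + "\n\n" + disclaimer + "\n"

    # Re-encode in the *original* charset, not UTF-8.  This raises
    # UnicodeEncodeError if the disclaimer does not fit that charset.
    msg.set_content(new_body, charset=charset)
    return msg.as_bytes()
```

A real daemon would also have to deal with multipart messages and with a
disclaimer that cannot be represented in the message's charset; the sketch
simply raises an error in both cases.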
--
               MarkOv

------------------------------------------------------------------------
       Mark Overmeer MSc                                MARKOV Solutions
       Mark@Overmeer.net                          solutions@overmeer.net
http://Mark.Overmeer.net                   http://solutions.overmeer.net