As we have seen in recent threads, we have been somewhat inconsistent in how we deal with strings. I believe I have a proposal which would allow us to bypass these problems while maintaining backwards compatibility. I believe this solution is also compatible with some other proposals, such as adding better support for case-modifying options and things like "use unicode semantics" for regexes.

My proposal is this:

---------------------

Make it such that the utf8 flag being on means that the string contains Unicode codepoints encoded as utf8.

When the utf8 flag is off, an additional field in the SV would be used to determine what type of string the data contains. (I guess this would be a pointer to some struct or an offset into a table.) If a string was not explicitly marked as something else, it would by default be assumed to be Latin-1 (null pointer or offset=0).

Two strings would only be legally concatenable if they were of the same type, or if there existed defined conversion routines from both types to Unicode. In the case of a string type mismatch, both would be upgraded to utf8 according to their type.

An exception to this rule would be a binary string type, which would be concatenable with anything, and which would never be modified nor cause anything else to be modified when concatenated with it.

We would provide something like bless to mark strings as being of a particular charset and encoding combination.

WRT Win32: all strings would be forced to Unicode* and the wide-character APIs would be used (possibly unless the string was of type ANSI or of type Binary, in which case the 8-bit APIs would be used).

---------------------

I'm not sure how this would impact XS. I think it would leave existing XS unchanged, and make new XS easier to write. But I'm open to being told I'm all wrong. :-)

Yves

* This would throw an error if the string was not of a type that can be converted to Unicode.
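
To make the concatenation rules above concrete, here is a rough model of them in Python rather than in the core's C. It is only a sketch under stated assumptions: the `Str` class, the `DECODERS` table, and the type names "binary" and "cp1252" are all invented for illustration, and what type the result takes when a binary string is involved is my guess, since the proposal only says that nothing gets modified.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical type tags. None stands in for the null pointer / offset=0
# default, which the proposal says means Latin-1.
LATIN1 = None
BINARY = "binary"

# Hypothetical table of "conversion routines to Unicode", keyed by type tag.
DECODERS = {
    LATIN1: lambda raw: raw.decode("latin-1"),
    "cp1252": lambda raw: raw.decode("cp1252"),
}


@dataclass
class Str:
    data: bytes                  # raw bytes as stored in the string buffer
    utf8: bool = False           # models the utf8 flag
    kind: Optional[str] = None   # models the extra field; ignored if utf8 is on


def is_binary(s: Str) -> bool:
    return not s.utf8 and s.kind == BINARY


def to_unicode(s: Str) -> str:
    """Decode a string according to its type, or fail if no routine exists."""
    if s.utf8:
        return s.data.decode("utf-8")
    decoder = DECODERS.get(s.kind)
    if decoder is None:
        raise TypeError(f"no conversion to Unicode for type {s.kind!r}")
    return decoder(s.data)


def concat(a: Str, b: Str) -> Str:
    # Binary is concatenable with anything and changes nothing: the bytes are
    # appended as-is. Assumption: the result keeps the other operand's type.
    if is_binary(a):
        return Str(a.data + b.data, b.utf8, b.kind)
    if is_binary(b):
        return Str(a.data + b.data, a.utf8, a.kind)
    # Same type on both sides: plain byte concatenation, type unchanged.
    if a.utf8 == b.utf8 and (a.utf8 or a.kind == b.kind):
        return Str(a.data + b.data, a.utf8, a.kind)
    # Type mismatch: upgrade both to utf8 according to their own type.
    text = to_unicode(a) + to_unicode(b)
    return Str(text.encode("utf-8"), utf8=True)
```

With this model, concatenating a default (Latin-1) string with a utf8-flagged one upgrades both to utf8, concatenating either with a binary string appends the raw bytes untouched, and a mismatch involving a type with no conversion routine raises an error.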

P.S. I saw the proposal for a UPV type. I'm at a loss to understand how it would do anything more than make the situation worse.

--
perl -Mre=debug -e "/just|another|perl|hacker/"