On Sun, Feb 12, 2012 at 5:02 PM, Father Chrysostomos via RT
<perlbug-followup@perl.org> wrote:
> On Mon Feb 06 07:19:37 2012, xdaveg@gmail.com wrote:
>> Then when something wants to use that string as a source of bytes,
>> should Perl (a) just dump out whatever bytes it uses internally for
>> its implementation? Or (b) should it convert the internal
>> representation to some standard representation? Or (c) should it blow
>> up?
>
> (a) is what Perl currently does, as Leon Timmerman said.
>
> By (b) I presume you mean to treat \xff as \xff regardless of how it is
> stored internally, which makes sense.

Sort of. What I meant is that (a) is "whatever we do" and (b) is "a
specific encoding". Those are likely to be similar, but one is vague
and mutable and the other is specific and fixed. Such a promise would
persist under the usual back-compatibility rules even if we changed the
internal representation in the future for some reason.

It could also mean that we could choose to give UTF-8 and not "utf8"
(i.e. the lax, internal encoding) -- and croak if we can't translate
from the internal representation to UTF-8.

For example, for a string with wide characters used as an in-memory
file, we could promise to translate from the internal encoding to UTF-8
when the handle is read. That would make it resemble a disk file
encoded in UTF-8, requiring the ":encoding(UTF-8)" layer and so on.

Thus a function that is passed a handle to read shouldn't know or care
whether it's an in-memory string or an on-disk file -- though the
*programmer* would need to know what encoding they expect to receive
given their particular application.

> An in-memory scalar could be considered a byte stream. Or it could just
> be considered a string of characters.

My bias is strongly that it should be a byte stream, which is why I'm
only considering how we choose to take a string of (wide) characters
and make it into a byte stream in some standard way: (a) "whatever",
(b) "a promise", or (c) "boom!"
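To make (b) concrete, here is a rough sketch of what that promise would
amount to if done by hand today. It is only an illustration, not a
proposed implementation: the sample string and variable names are made
up, and it assumes the core Encode module, using its strict "UTF-8"
encoding with FB_CROAK to stand in for "croak if we can't translate".

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical sample: a character string with one wide character
# (U+263A, a code point above 0xFF).
my $chars = "smile: \x{263A}\n";

# Option (b) by hand: convert to one specific, strict encoding.
# FB_CROAK makes encode() die if the string cannot be represented;
# LEAVE_SRC keeps encode() from modifying $chars in place.
my $bytes = Encode::encode(
    'UTF-8', $chars,
    Encode::FB_CROAK() | Encode::LEAVE_SRC(),
);

# $bytes now holds only bytes, so it can back an in-memory file that
# is read exactly like a UTF-8 encoded file on disk.
open my $fh, '<:encoding(UTF-8)', \$bytes
    or die "cannot open in-memory file: $!";
my $line = <$fh>;
close $fh;

# The reader gets the original characters back.
print $line eq $chars ? "round trip ok\n" : "mismatch\n";
printf "%d characters, %d bytes\n", length($chars), length($bytes);
```

Code that only sees $fh here cannot tell whether it came from a scalar
or from a disk file; the programmer still has to know that UTF-8 is the
encoding to ask for.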
-- David