On Mon, Jan 16, 2012 at 11:11 AM, Aristotle Pagaltzis <pagaltzis@gmx.de> wrote:
> Yes it is.
>
> If you have a byte string represented by a UTF8=1 scalar, then the PV
> buffer has a UTF8-encoded sequence of 8-bit integer values. To get the
> value of each string element, which here represents one byte each, into
> an actual byte, you have to do the equivalent of downgrading.
>
> You might as well actually downgrade.

In utf8.pm, downgrade is defined as «Converts in-place the internal
representation of the string from UTF-X to the equivalent octet sequence
in the native encoding (Latin-1 or EBCDIC)».

> So downgrading *is* the way to get bytes from a scalar that contains
> a byte string.

Huh?

> We desperately need at least two extra terms – one to disambiguate
> platonic characters from atomic string elements (which can be either
> platonic characters or platonic bytes), and one to disambiguate bytes
> as in the underlying PV buffer representation of a string from platonic
> bytes as atomic elements of the string at the Perl semantic level. Maybe
> we even need more terms yet, which I didn’t think of here.
>
> Until then, threads like this will be exercises in confusion as people
> mean different things when they say the same words – in fact often mean
> the same different things, only at opposite times, making communication
> all but an accident: no one can either hear what the other truly is
> saying or likewise truly be heard in turn.

Agreed.
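
For what it's worth, here is a minimal sketch of the case being argued about: a byte string that happens to be stored with UTF8=1, then downgraded. It is not taken from the thread. The sample byte values are made up for illustration, and it uses only the utf8:: functions that are always available in core Perl.

```perl
use strict;
use warnings;

# A byte string: three elements, each a value in 0..255 (illustrative values).
my $bytes = "\xE9\x00\xFF";

# Force the UTF8=1 internal representation. The PV buffer now holds a
# UTF-8 encoded sequence, but the string's elements are unchanged.
utf8::upgrade($bytes);
my $flag_after_upgrade = utf8::is_utf8($bytes) ? 1 : 0;
my $len_after_upgrade  = length($bytes);

# Downgrade in place, so the buffer holds one octet per element again.
# This dies if any element is above 0xFF, unless a true second
# argument (FAIL_OK) is passed, in which case it returns false.
utf8::downgrade($bytes);
my $flag_after_downgrade = utf8::is_utf8($bytes) ? 1 : 0;
my $len_after_downgrade  = length($bytes);

print "after upgrade:   UTF8=$flag_after_upgrade length=$len_after_upgrade\n";
print "after downgrade: UTF8=$flag_after_downgrade length=$len_after_downgrade\n";
```

At the Perl level the string has the same elements and the same length before and after; only the buffer representation changes, which is the distinction the quoted paragraph is asking for words to describe.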