perl.perl5.porters | Postings from January 2014

Re: Marking a scalar as an unupgradable binary blob.

January 28, 2014 11:17
demerphq wrote:
>     Perl has to deal with binary data, and that it should NOT be
>upgraded, even if concatenated with unicode strings,

Perl doesn't have a type distinction between octet strings and character
strings.  It would have been great if it did, but that ship sailed a
long time ago.  Perl has one string type, the elements of which are
integers, at least notionally codepoints, in some unsigned range that
depends on the platform's word size.  An octet string is conventionally
represented as a Perl string in which the elements are in the range
[0,255].  These are precisely the Perl strings that can be represented
in downgraded form, but we have also established that it is legitimate
to represent those Perl strings in upgraded form.  That a Perl string
is currently intensionally typed as an octet string does not damage the
legitimacy of any general-purpose means of representing Perl strings.
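
To make the representation point concrete, here is a minimal sketch using the
core utf8:: functions.  The variable names and the sample string are mine,
chosen only for illustration.  It shows one octet string held in both
downgraded and upgraded form and comparing equal as a Perl string.

```perl
use strict;
use warnings;

# An octet string: every element is in the range [0,255].
my $octets = "caf\xE9";

# Copy it and force the copy into the upgraded internal representation.
my $upgraded = $octets;
utf8::upgrade($upgraded);

# The internal representation differs between the two scalars ...
my $flag_before = utf8::is_utf8($octets)   ? 1 : 0;
my $flag_after  = utf8::is_utf8($upgraded) ? 1 : 0;

# ... but they are the same Perl string, element for element.
my $same = ($octets eq $upgraded) ? 1 : 0;
my $len  = length($upgraded);

# Because every element fits in an octet, the copy can be downgraded again.
# The second argument (FAIL_OK) makes utf8::downgrade return false rather
# than die when a string cannot be downgraded.
my $ok = utf8::downgrade($upgraded, 1) ? 1 : 0;

# A string with an element above 255 has no downgraded form.
my $wide    = "\x{263A}";
my $wide_ok = utf8::downgrade($wide, 1) ? 1 : 0;

printf "flag before=%d after=%d same=%d len=%d downgrade=%d wide=%d\n",
    $flag_before, $flag_after, $same, $len, $ok, $wide_ok;
```

Nothing at the Perl level (eq, length, ord) distinguishes the two scalars;
only the introspection function utf8::is_utf8 does.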

>So I consider Perl dieing (or perhaps warning, I'm flexible in that
>regard) when someone tries to concatenate a *designated* binary blob
>with unicode data exactly the right thing to do.

Unworkable.  You can't introduce a type distinction into the current
string type because it would break existing correct code such as

	sub decode_latin1 { $_[0] }
	$text_chars = $greeting_chars . decode_latin1($name_octets);

The octet/character distinction is currently purely intensional, and can
change with no explicit operation in the code.  Any labels that attempt
to track that distinction will therefore get out of synch in existing
correct code, even if they erase themselves at the first hint of an
ambiguous process.
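
A self-contained sketch of the same situation follows.  The sample values are
hypothetical, and the comparison against Encode::decode is my addition to show
that the identity function really is a correct Latin-1 decoder.  No operation
in this code marks the point where the octets become characters.

```perl
use strict;
use warnings;
use Encode ();

# Latin-1 decoding maps octet value N to codepoint N, so at the level of
# Perl strings the identity function is a correct decoder.
sub decode_latin1 { $_[0] }

my $name_octets    = "Ren\xE9";              # e.g. read from a Latin-1 source
my $greeting_chars = "\x{263A} hello, ";     # has a codepoint above 255

my $text_chars = $greeting_chars . decode_latin1($name_octets);

# The result must be stored upgraded, since it holds an element above 255.
my $result_upgraded = utf8::is_utf8($text_chars) ? 1 : 0;

# The elements that came from $name_octets are unchanged in the result.
my @tail = map { ord } split //, substr($text_chars, -length($name_octets));

# The identity decoder agrees with Encode's Latin-1 decoder.  A copy is
# passed so that Encode cannot touch the original scalar.
my $copy   = $name_octets;
my $agrees = (Encode::decode('ISO-8859-1', $copy)
              eq decode_latin1($name_octets)) ? 1 : 0;

printf "upgraded=%d tail=%s agrees=%d\n",
    $result_upgraded, join(',', @tail), $agrees;
```

A label attached to $name_octets would have to be dropped somewhere inside
decode_latin1, but that sub contains nothing for Perl to hang the change on.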

