perl.perl5.porters | Postings from May 2015

Re: RFC: what to do about bitwise string operators

May 2, 2015 18:35
Karl Williamson wrote:
>So what to do if the string has non-ASCII characters and is in UTF-8?

I reckon the bitwise string ops should be defined to operate on logical
octets [\x00-\xff].  Whether an octet string is stored in upgraded form
shouldn't affect the logical result.  It should be *as if* the inputs
are downgraded internally, but with upgraded inputs it's OK for that
logical result to be output in upgraded form.  If there's a non-octet
in the string, croak.
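
A minimal sketch of those semantics for string AND, written in user space
so it can be tried out.  The helper name octet_and is hypothetical (it is
not a core function), and upgrading the result when an input was upgraded
is just one of the outputs the rule above permits:

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: string AND on logical octets.
# Both operands are expected to be strings, not numbers.
sub octet_and {
    my ($x, $y) = @_;    # copies, so the caller's strings are untouched
    my $was_upgraded = utf8::is_utf8($x) || utf8::is_utf8($y);
    for my $s ($x, $y) {
        # With a true second argument, utf8::downgrade returns false
        # instead of dying when a character above 0xFF is present.
        utf8::downgrade($s, 1)
            or croak "Non-octet character in bitwise string operand";
    }
    my $r = $x & $y;     # both operands are now stored as plain octets
    utf8::upgrade($r) if $was_upgraded;    # allowed, not required
    return $r;
}
```

With this, an upgraded copy of "\xaa\x0f" ANDed with "\xf0\xff" compares
equal to the result for the downgraded copy, and an operand containing
"\x{100}" croaks.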

>B) warn

If it warns, you still need to decide substantive behaviour other than
the warning.  So this option is really "warn, and return the same result
as currently".  Don't gloss over the retention of the dippy behaviour.

>C) Do the operation on the underlying code points (that is
>effectively convert to U32 or U64 before the operation, and convert
>back at the end)

That would imply that ~"\xaa" would be "\x{ffffff55}", rather than the
present "\x55".  If done consistently, it would be very surprising to
most existing users of the bitwise ops.  If only done for strings that
contain non-octets, then the behaviour is inconsistent, which is also
surprising.  See [perl #63574] which discussed this issue six years ago,
with a statement of "we decided on the inconsistent behaviour".
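
To make the difference concrete, here is a small sketch comparing the
present octet complement with what a 32-bit code point complement would
yield for the same one-character string.  The 32-bit value is computed as
a number only; the variable names are illustrative:

```perl
use strict;
use warnings;

my $s = "\xaa";                  # one character, stored as a plain octet

# Present behaviour: complement within a single octet.
my $octet_result = ~$s;

# What option C implies for U32: complement the code point, keep 32 bits.
my $u32_value = ~ord($s) & 0xFFFFFFFF;

printf "octet complement:  %s\n", unpack("H*", $octet_result);
printf "32-bit complement: %x\n", $u32_value;
```

The first line reports 55 and the second ffffff55, matching the two
strings quoted above.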

Consistent dotriacontet-string complementation might be worth putting
in a non-core module.  Anyone got a use case for it?

>D) Downgrade if possible and leave the result downgraded, or possibly
>upgrade the result.  I suppose warn if not possible to downgrade

Again that'd be "warn and apply the current crap semantics".

