When | & ^ ~ are executed on strings, or when the new |. &. ^. ~. operators are run, the result relies on (and hence exposes) the internal representation of those strings. This means the behavior will often differ between EBCDIC and ASCII platforms. More importantly, whether a string is in UTF-8 or not may affect the result.

There is no such problem if the string consists solely of ASCII characters on ASCII machines (or of the ASCII-equivalent characters plus the C1 controls on EBCDIC machines), which is why people may not have been bitten much by this in the past.

So what to do if the string has non-ASCII characters and is in UTF-8? I see the following possibilities:

A) No change from current behavior; document it better. (This is what will happen in v5.22.)
B) Warn.
C) Do the operation on the underlying code points (that is, effectively convert to U32 or U64 before the operation, and convert back at the end).
D) Downgrade if possible and leave the result downgraded, or possibly upgrade the result. I suppose warn if not possible to downgrade.
E) **Your ideas here**
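
To make the representation issue concrete, here is a small language-neutral sketch in Python. It is not Perl's implementation. The helper names `xor_bytes` and `xor_code_points` are hypothetical, and the zero-padding of the shorter operand is an assumption chosen to mimic how string | and ^ treat operands of unequal length. It XORs the same one-character string with a one-character mask three ways: on its single-byte (Latin-1) storage, on its UTF-8 storage, and on its code points as in option C.

```python
def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two byte strings, padding the shorter with zero bytes on the right."""
    width = max(len(left), len(right))
    left = left.ljust(width, b"\x00")
    right = right.ljust(width, b"\x00")
    return bytes(a ^ b for a, b in zip(left, right))


def xor_code_points(left: str, right: str) -> str:
    """Option C sketch: XOR code points, ignoring how the string is stored.

    Assumption: results stay within the valid code point range; chr() raises
    ValueError otherwise, which a real design would have to handle.
    """
    width = max(len(left), len(right))
    left = left.ljust(width, "\x00")
    right = right.ljust(width, "\x00")
    return "".join(chr(ord(a) ^ ord(b)) for a, b in zip(left, right))


text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, code point 0xE9
mask = "\x20"

# Same logical string, two different storage formats.
as_latin1 = xor_bytes(text.encode("latin-1"), mask.encode("latin-1"))
as_utf8 = xor_bytes(text.encode("utf-8"), mask.encode("utf-8"))
by_code_point = xor_code_points(text, mask)

print(as_latin1.hex())          # c9
print(as_utf8.hex())            # e3a9
print(hex(ord(by_code_point)))  # 0xc9

# An ASCII-only operand gives the same bytes under either encoding.
ascii_text = "a"
print(xor_bytes(ascii_text.encode("latin-1"), b"\x20")
      == xor_bytes(ascii_text.encode("utf-8"), b"\x20"))  # True
```

The single-byte and code-point results agree (0xC9), while the byte-wise result on the UTF-8 storage is the two bytes E3 A9, which is not even well-formed UTF-8. The ASCII-only operand is unaffected by the storage format, which matches the observation that pure-ASCII strings do not show the problem.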