This concerns my proposal to add the construct \c{...}. It would break any existing code that currently uses "\c{" to mean ";" (semicolon). I would hope that no one does that outside of an obfuscated code contest, but I want to be sure that everyone here agrees with me.

I posted on this earlier and said that "\c{" is not documented. Since then I have found that it arguably is, hence this post. The pods about regular expressions don't document it, but perlop does:

    The character following "\c" is mapped to some other character by converting letters to upper case and then (on ASCII systems) by inverting the 7th bit (0x40). The most interesting range is from '@' to '_' (0x40 through 0x5F), resulting in a control character from 0x00 through 0x1F. A '?' maps to the DEL character. On EBCDIC systems only '@', the letters, '[', '\', ']', '^', '_' and '?' will work, resulting in 0x00 through 0x1F and 0x7F.

(Note that there is an error in this statement: "\c\\" maps to two characters, 0x1c followed by a '\', so the construct cannot be used to cleanly generate a 0x1c, which is FILE SEPARATOR.)

Still, the statement indicates that it is permissible to use "\c{" on ASCII platforms, without specifying what that might mean; it turns out to mean a semicolon.

My hope is that people will say that, in spite of perlop, it is ok in 5.14 to break code that uses "\c{". But here is your chance to say no.

The proposal is currently to use this construct to specify control characters in a platform-independent and mnemonic way, so that, for example, \c{ACK} would mean the ACKNOWLEDGE control character in both ASCII and EBCDIC. (It could be extended to accept other things as well, but H. Merijn's comments have convinced me that I need to think about that some more.)

The cost is breaking code that uses "\c{" to mean the semicolon, plus extra code in the core that I would write.
The advantages are a clearer, platform-independent way to specify control characters, and a clean mnemonic way to get the FS character. Currently, the only way to get platform independence is to go to utf8 by using the full character names in charnames.

If it isn't ok to do this in 5.14, is it ok to add a deprecation message in 5.14 and make the change in 5.16? Should the other characters whose \c mappings aren't to controls get a deprecation message?
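
For anyone who wants to check the arithmetic, here is a minimal sketch of the mapping that the perlop passage quoted above describes, for ASCII platforms only. The helper name ctrl_map is made up for illustration and is not part of Perl. It computes the result with chr/ord rather than writing the "\c" escapes literally, so it models only the single-character mapping and not the "\c\\" parsing quirk noted above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): apply the mapping that perlop
# describes for the character following "\c", assuming an ASCII platform.
sub ctrl_map {
    my ($char) = @_;

    # Convert letters to upper case, then invert the 0x40 bit.
    return chr( ord( uc $char ) ^ 0x40 );
}

# '@' through '_' give controls, '?' gives DEL, and '{' gives 0x3B (';').
for my $char ( '@', 'a', 'G', '\\', '_', '?', '{' ) {
    printf "\\c%s -> 0x%02X\n", $char, ord( ctrl_map($char) );
}
```

The last line of output is the case at issue: '{' is 0x7B, and flipping the 0x40 bit gives 0x3B, the semicolon.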