On 2021-07-31 12:17 p.m., Darren Duncan wrote:
> Now conversely, I don't have a problem with actually waiting until v5.38 to
> fully implement the change IF 5.36 contained some kind of precursor to prepare
> the way, such as that 5.36 would issue warnings for code with a "use 5.36" that
> wasn't valid UTF-8, saying that this code might parse differently under "use
> 5.38".  That would let users know in a transitional version what might be a
> problem before it is.

So to clarify, I have a very specific proposal:

1. That a "use 5.36;" will behave the same with respect to the utf8 stuff as "use 5.34;", but that if the source file / input stream is not entirely valid UTF-8 under a strict interpretation, the Perl interpreter will issue a warning saying so and why it matters.

2. That a "use 5.38;", if the source file / input stream is not entirely valid UTF-8 under a strict interpretation, the Perl interpreter will issue a fatal error / die saying so and why it matters, and that as a result the parsing has failed.

So a key thing is that the UTF-8 mode triggered by 5.36/5.38 is strict: it doesn't use substitution characters or delete characters, it either passes the input unchanged as valid UTF-8 or it complains. If "use utf8;" already does this then it's the same, and otherwise it is stricter.

Since this isn't spelled the same as "use utf8;", the new feature doesn't need to be identical in every way. We don't have to limit ourselves to that, or inherit the issues of silent corruption from substitution/deletion being the implicit operation, if that is what it used to do.
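To make the two steps concrete, here is a rough sketch of the check in plain Perl using the core Encode module. It is only an illustration of the proposed semantics, not how the interpreter would implement it: the helper name check_source_utf8, its 'warn'/'die' mode argument, and the message wording are all made up for this example.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper (not part of Perl): strictly validate source bytes.
# $mode is 'warn' (the proposed "use 5.36" step) or 'die' ("use 5.38").
sub check_source_utf8 {
    my ($bytes, $mode) = @_;

    # 'UTF-8' (with the hyphen) is Encode's strict encoding name.
    # FB_CROAK makes decode() die on malformed input rather than
    # substituting U+FFFD; LEAVE_SRC leaves $bytes untouched.
    my $ok = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 };
    return 1 if $ok;

    my $msg = "Source is not valid UTF-8 under a strict interpretation; "
            . "it may parse differently under a later \"use VERSION\": $@";
    die $msg if $mode eq 'die';
    warn $msg;
    return 0;
}

# Valid UTF-8 passes silently; a stray Latin-1 byte warns or dies.
check_source_utf8("caf\xC3\xA9 au lait", 'warn');
check_source_utf8("caf\xE9 au lait",     'warn');
eval { check_source_utf8("caf\xE9 au lait", 'die') };
print "fatal under 'die' mode\n" if $@;
```

The point of the sketch is that the input is never altered: it is either accepted as-is or reported, with only the severity differing between the two versions.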

On a further point: unlike a lot of the other "use" statements, I assume there is no good reason for a single file to be a mixture of literal encodings. So having multiple "use encoding" statements in a file, either explicit or implied by a "use 5.38" etc, should be considered an error, and any occurrence of one would be expected to describe the entire file and not just the lexical scope it appears in. Unlike strict/warnings/etc, it's not flipped on or off mid-file.

-- Darren Duncan