Let me see if I can sum up my understanding of the conflict issue.

Today, if a valid regex modifier (one of 'cegimopsx') appears after the closing delimiter of a regex, Perl consumes all the valid modifiers that follow. For example, in "s/foo/bar/ge", the "ge" is never interpreted as the "ge" operator. If no modifiers appear, or once a non-modifier character is reached after all valid modifiers have been consumed, what follows is interpreted as something other than a regex modifier.

Because of the rules of syntax, the next thing after a regex must be either an operator or a statement modifier, or else a syntax error occurs. By adding letters to the list of regex modifiers, we create a conflict with any operator or statement modifier that starts with the same letter, because those are the only things that are "legal" today as a run-on after existing modifiers and that would "break" once the newly added letter is consumed during parsing. To restate that differently: we only care about the *first* letter of operators and statement modifiers.

For the letters under discussion (including "r"), here is a list of conflicting operators and modifiers:

  d: [no conflicts]
  l: lt, le
  r: [no conflicts]
  t: [no conflicts]
  u: unless, until

So "r" is not a problem (which is good, since we already added it). Neither "t" nor "d" is a problem. The problems are "u" and "l".

For reference, here is a list of letters that are *not* already used as regex modifiers and that do *not* appear at the start of *any* operator or modifier: b, d, h, j, k, q, t, v, y, and z. Any of these could be used as modifiers today without a problem.

At the risk of taking the design discussion in circles, it occurs to me that the whole conflict problem just goes away if we use "U" and "L" instead of "u" and "l". (We can use either "t" or "d" for the third, since neither conflicts.)
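The first-letter rule above is mechanical enough to check with a few lines of code. What follows is only a rough Python model of the greedy modifier scan, not Perl's actual tokenizer. The FOLLOWERS list of word-like operators and statement modifiers is an assumption made for this sketch, and the function names are hypothetical.

```python
import string

# Modifiers from the list quoted above, plus "r", which the post says
# has already been added.
EXISTING_MODIFIERS = set("cegimopsx") | {"r"}

# Word-like infix operators and statement modifiers that may legally
# follow a regex. ASSUMPTION: this list was written for the sketch and
# is not taken from the Perl source.
FOLLOWERS = ["lt", "le", "gt", "ge", "eq", "ne", "cmp", "x",
             "and", "or", "xor",
             "if", "unless", "while", "until", "for", "foreach"]


def consume_modifiers(tail, modifiers):
    """Greedily split the text after the closing delimiter into
    (modifier letters, whatever is left over)."""
    i = 0
    while i < len(tail) and tail[i] in modifiers:
        i += 1
    return tail[:i], tail[i:]


def conflicts(letter):
    """Followers that would be broken if `letter` became a modifier."""
    return [word for word in FOLLOWERS if word.startswith(letter)]


def free_letters():
    """Lowercase letters that are neither modifiers nor the first
    letter of any follower."""
    taken = EXISTING_MODIFIERS | {word[0] for word in FOLLOWERS}
    return [c for c in string.ascii_lowercase if c not in taken]
```

Under these assumptions, conflicts("l") gives lt and le, conflicts("u") gives unless and until, and "d", "r", and "t" give nothing, matching the table. Adding "l" to the modifier set makes consume_modifiers("lt 5", ...) swallow the "l" and leave "t 5" behind, which is exactly the run-on breakage. Since every entry in FOLLOWERS is lowercase, "U" and "L" cannot collide with any of them.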
I don't particularly care about making mutually exclusive regex modifiers visually distinctive, since improper use should just be a syntax error anyway, so I don't think that using upper case "means anything" or has to set any precedent in that regard. It's just a character space that has no conflicts.

So that's my proposal:

  * use "U", "L", and either "d" (for "dual"/"dumb") or "t" (for "text"/"traditional")
  * throw a syntax error if more than one of these appears in the list of modifiers
  * leave the run-on deprecation as is and make run-ons a syntax error in 5.16 to eliminate any future conflict issue

That's simple and fixes the problem now without messing with features, making assumptions about parsing ambiguous situations, or introducing wacky dual-letter regex modifiers.

All those in favor?

-- David