David Golden wrote:
> On Fri, Aug 6, 2010 at 8:28 PM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> +0.5
>>
>> I must say I don't like the tradeoff of making users pound their shift
>> keys in perpetuity to produce /UL instead of /ul to maintain backwards
>> compatibility with an unlikely-to-occur bit of syntax.
>>
>> Has anyone done tests to find out if cases like C</foo/lt "bar">
>> actually occur in the wild (e.g. on CPAN). Or is this just a runaway
>> backwards compat hypothetical?
>
> I think it's a hypothetical.
>
> As I said originally, I'm 100% happy to declare run-on's to be a
> syntax error, declare the parser to have a bug for not detecting it to
> date, break whatever corner cases we happen to break, and just go
> straight to /l and /u for modifiers.
>
> That didn't seem to get a lot of traction and Jesse seemed to indicate
> he'd prefer that kind of breakage to have lexical scope using a
> feature.
>
> Given the extra complexity of a "temporary" feature, I think uppercase
> /L and /U (with the option of lower case synonyms in 5.16) is a
> reasonable compromise.
>
> -- David
>

I'm back, and hope I have gotten some perspective. I really want to
make progress, decide on something good, and go with it.

I was willing to go with any of the options I had laid out; David's is
a modification of one of them. But I note that both he and Jesse had
just recently called the idea of using uppercase modifiers "crazy".
So, I too would rather not condemn people to the extra keystroke in
perpetuity; and I'm still not sure that we have to do it even in 5.14.

David's analysis is correct (and is very similar to what Eric wrote a
while back:
http://www.nntp.perl.org/group/perl.perl5.porters/2010/05/msg160173.html ).
To summarize, and be more precise, the cases where any of the potential
new modifiers 'r', 'l', 'd', or 't' could be something other than
modifiers, are:

1) a string of modifiers ending in 'lt'
2) a string of modifiers ending in 'le'
3) a string of modifiers ending in 'until'
4) a string of modifiers ending in 'unless'

That means that the 'd' and 'r' modifiers are without conflicts, as
David noted. But when the parser sees the 'l' and 'u' characters in
what so far is a string of regex modifiers, it can't be sure without
lookahead whether they are modifiers or the beginning of the keywords
above.

But I believe that it can resolve all ambiguities with a sufficient
amount of lookahead, and that in all but one case, such lookahead is
trivial. The 'l' and 't' modifiers cannot appear together because of
our rules about their use, so 'lt' has to be the less-than operator;
and note that 'e' is legal only in s///.

Here's some pseudo code:

    case 'u':
        if the next character is an 'n', not a modifier
        otherwise is a modifier
    case 'l':
        if the next char is a 't', not a modifier
        else if the next char is not an 'e', is a modifier
        else if not in a s///, not a modifier
        else if the next character beyond the e is alpha, is a modifier
        else if the next thing in the input is an operand, not a modifier
        else is a modifier

So the apparent ambiguity is trivially resolved for /u. The only case
where you need to look ahead more than one character is s///le.

Correct me if I'm wrong about Perl, but I believe that what can legally
follow a binary operator must be an operand. I don't know how hard it
is to look ahead and distinguish between an operand and a non-operand;
I haven't investigated. Perhaps someone can tell me. If it's easy,
then the ambiguity is easily resolvable, and we don't need the capital
letters in 5.14.

If it's not easy, here's a counterproposal. We use the lowercase
letters in 5.14.
In the single subcase where we give up figuring it out, we assume that
the 'le' are not modifiers, and print out a warning saying to use
capital L to get the modifier meaning. That is, in 5.14, we add both
'l' and 'L' modifiers, and they both normally mean the same thing; but
one can use the 'L' if necessary to cope with our laziness in not
figuring out what was meant to begin with.

The alternative is back to my original proposal, which is to tell them
in the warning to spell it '...el' instead. I actually like that
better, as it doesn't require a new modifier.

So, I guess I'm pushing my proposal still. What is different is that I
think we are now agreed that the problematic cases are very few, which
is why my proposal makes sense at all. I hope I've persuaded you that
there really is only one case that may not be easily resolvable. And I
still think that the appropriate warning is sufficient to handle it.
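
For anyone who wants to poke at the pseudo code earlier in this message,
here is a rough Python sketch of the same decision procedure. It is only
an illustration and is not how the Perl tokenizer works: the names
is_modifier, operand_follows and naive_operand_follows are made up for
this sketch, and the operand test is a crude stand-in for the part that
has not been investigated.

```python
def naive_operand_follows(text):
    """Hypothetical stand-in: guess whether an operand starts at `text`.

    This only checks the first non-blank character and is far too crude
    for real Perl; it exists so the sketch can be run.
    """
    first = text.lstrip()[:1]
    return first.isdigit() or first in ("$", "@", '"', "'")


def is_modifier(ch, rest, in_substitution, operand_follows=naive_operand_follows):
    """Return True if `ch` should be read as a regex modifier.

    ch              -- the candidate character, 'u' or 'l'
    rest            -- the source text immediately following `ch`
    in_substitution -- True when these are the modifiers of an s///
    operand_follows -- assumed callback deciding if an operand comes next
    """
    nxt = rest[:1]
    if ch == "u":
        # 'un...' can only begin 'until' or 'unless'.
        return nxt != "n"
    if ch == "l":
        if nxt == "t":
            return False          # 'lt' is the less-than operator
        if nxt != "e":
            return True           # no keyword starts this way
        if not in_substitution:
            return False          # 'e' is legal only in s///
        if rest[1:2].isalpha():
            return True           # e.g. 'lex' cannot be the 'le' operator
        if operand_follows(rest[1:]):
            return False          # looks like '... le OPERAND'
        return True
    raise ValueError("only 'u' and 'l' are ambiguous")
```

With the naive operand test, is_modifier("l", "e $y", True) treats 'le'
as the operator, while is_modifier("l", "e;", True) treats both letters
as modifiers. The warning in the counterproposal would apply exactly
where operand_follows cannot give a trustworthy answer.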