Zefram wrote:
> karl williamson wrote:
>> If there's no opposition, we need to settle on what the syntax is.
>> Ben originally proposed (?~, but I thought (?. was better because the
>> tilde can be too easily confused with a hyphen, and (?- is also legal
>> right after the question mark.
>
> I think this structure, with one extra char between the "?" and the flags,
> is ideal. Either tilde or dot is fine by me, and I'm not at all worried
> about getting ASCII chars confused with each other. Confusability is
> a Unicode disease.
>
> The following chars already have a meaning immediately after "(?" in
> a regexp (some of them more than one, depending on following chars):
>
>     ! # & ' ( ) + -
>     0 1 2 3 4 5 6 7 8 9
>     : < = > ?
>     R
>     i m p s x
>     { |
>
> Earmarking all remaining letters for use by future flags, the remaining
> available punctuation characters are:
>
>     " $ % * , . /
>     ;
>     @
>     [ \ ] ^ _
>     `
>     } ~
>
> A few of those would interact badly with quoting syntax. I'd be happy
> with us using any of the others.
>
> -zefram
>

Here's my reasoning for excluding most of these from consideration.

I've scanned the tokenizer code, and there appear to be heuristics to
decide if something is an interpolated variable, the end of the pattern,
and the boundaries of character classes. So it seems a lot less
dangerous to exclude
    $ % @ ; [ ]

Also, quotes as you said,
    " ' `

And, since / is the common regex delimiter, excluding it seems like a
good idea for human confusability issues, as well as anything that is a
paired delimiter, so } is out. Similar concerns get rid of \

I'd rather keep _ in reserve as it is a word character, and we could
decide to add it later for the same reason we accept it between digits
in a number, as a no-op for legibility, especially if we ever go to
two-character modifiers.

That leaves
    * , . ^ ~

I'd rather not use ^, again for human usability issues, as that is often
the first thing in a pattern.
A comma doesn't seem to me to convey the right meaning, so we're down to
    * . ~

I didn't want to use tilde because of the visual confusability with -,
nor * because of a number of things that start like (*PRUNE), so that
left the period. And I thought the standard meaning of period, "any",
was sort of appropriate. But I think * might be ok.

I still haven't heard reasons against the period.
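
For anyone who wants to double-check the elimination above, it can be replayed as plain set subtraction. This is only an illustrative Python sketch, not anything taken from the perl tokenizer: the names IN_USE, EXCLUSIONS, available_punctuation and remaining_candidates are made up for this example, and the reason strings are paraphrases of the message. It assumes "punctuation" means the 32 ASCII punctuation characters in Python's string.punctuation.

```python
import string

# Punctuation that zefram lists as already meaningful right after "(?".
# (Digits and the letters R i m p s x are also taken, but only punctuation
# is being considered here.)
IN_USE = set("!#&'()+-:<=>?{|")

# Exclusions argued for in the reply, in order. Labels are paraphrases.
EXCLUSIONS = [
    ("tokenizer heuristics (interpolation, pattern end, char classes)", "$%@;[]"),
    ("quote characters", "\"'`"),
    ("common regex delimiter", "/"),
    ("paired delimiter", "}"),
    ("similar confusability concerns", "\\"),
    ("kept in reserve as a word character", "_"),
    ("often the first thing in a pattern", "^"),
    ("does not convey the right meaning", ","),
]


def available_punctuation():
    """ASCII punctuation with no existing meaning after "(?"."""
    return sorted(set(string.punctuation) - IN_USE)


def remaining_candidates():
    """Apply each exclusion in turn and return what is left."""
    candidates = set(available_punctuation())
    for _reason, chars in EXCLUSIONS:
        candidates -= set(chars)
    return sorted(candidates)


if __name__ == "__main__":
    print("available:", " ".join(available_punctuation()))
    print("remaining:", " ".join(remaining_candidates()))
```

The first line printed should match zefram's list of unused punctuation, and the second should be the three characters still under discussion before tilde and * are set aside for the reasons given.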