Karl Williamson <public@khwilliamson.com> wrote:
:I have been looking at the code in regcomp.c in regpiece() that deals
:with quantifiers.
:
:After reordering things so that goto's don't cause it to jump back and
:forth, some anomalies became clear. I also found some potential easy
:optimizations.
:
:I would expect that the results of parsing {1,} would be the same as
:'+', and they both do generate the PLUS regnode, but the flags passed to
:the higher level aren't set the same. This is true of the other
:shortcuts '*' and '?' as well.
:
:I then tried to figure out what the consequences of those differences
:are. Two of the flags, WORST and SPSTART, do not appear to ever be looked
:at. Should we remove them, or dig to find out how they used to be used,
:or might they come back again, and we should set them consistently?

I definitely think there's value in some digging, and I'm happy to give
that a go, time permitting. But of those, I'm sure at least WORST would be
from Ilya, and quite likely SPSTART too, so digging is not guaranteed to
lead to light.

:regpiece assumes that any quantifier whose upper limit is non-zero
:causes the construct to not match the null string, and sets HASWIDTH.
:That simply isn't true when quantifying a zero-width assertion. I
:didn't look at what the optimizer does with that, but when I change that
:a higher level warning is emitted:
:
: "Quantifier unexpected on zero-length expression "
:
:Now to the optimizations: I believe the quantifier {1,1} can simply be
:optimized out. There are occurrences in our test suite of this; I
:believe from Abigail. And I can see machine generated or interpolated
:code ending up with this. So we don't need to create a loop that gets
:executed precisely once. But there is {1,1}+, that has to be
:considered; and that's easy to do.
:
:Generally, in {m,m}? the ? is a no-op and can be omitted.
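
The match-level equivalences claimed above ({1,} vs '+', {0,} vs '*',
{0,1} vs '?', {1,1} as a no-op, and the '?' in {m,m}? as a no-op) are easy
to check mechanically. The following is only a minimal sketch, and it uses
Python's re module rather than Perl's engine, so it says nothing about
which regnodes are generated or which flags regpiece() passes up. The
helper same_matches() and the SUBJECTS list are made up for illustration.

```python
import re

def same_matches(pat_a, pat_b, subjects):
    """Return True if both patterns yield identical match spans and
    capture groups on every subject string."""
    for s in subjects:
        a = [(m.span(), m.groups()) for m in re.finditer(pat_a, s)]
        b = [(m.span(), m.groups()) for m in re.finditer(pat_b, s)]
        if a != b:
            return False
    return True

# Hypothetical subject strings, chosen only to exercise empty, short
# and repeated inputs.
SUBJECTS = ["", "a", "aa", "aaa", "baab", "abab"]

results = {
    # brace forms versus their one-character shortcuts
    "{1,} vs +": same_matches(r"a{1,}", r"a+", SUBJECTS),
    "{0,} vs *": same_matches(r"a{0,}", r"a*", SUBJECTS),
    "{0,1} vs ?": same_matches(r"a{0,1}", r"a?", SUBJECTS),
    # {1,1} should behave as if the quantifier were absent,
    # including what the capture group ends up holding
    "{1,1} vs none": same_matches(r"(ab){1,1}", r"(ab)", SUBJECTS),
    # with a fixed count there is nothing for laziness to choose between
    "{2,2}? vs {2,2}": same_matches(r"a{2,2}?", r"a{2,2}", SUBJECTS),
}

for name, ok in results.items():
    print(name, "same" if ok else "DIFFERENT")
```

This compares observable matches only. It does not cover {1,1}+ or
quantified zero-width constructs such as \K{1,1}, which is where the
engine-specific behaviour is likely to matter.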

I think it is worth trying to understand why \K{1,1} fails before eliding
the general {1,1} case, since I suspect there's something fundamental
going wrong there that will shed light on more than itself.

Hugo