On 2/11/20 10:58 AM, Karl Williamson wrote:
> I have been looking at the code in regcomp.c in regpiece() that deals
> with quantifiers.
>
> After reordering things so that goto's don't cause it to jump back then
> forth, some anomalies became clear.  I also found some potential easy
> optimizations.
>
> I would expect that the results of parsing {1,} would be the same as
> '+', and they both do generate the PLUS regnode, but the flags passed to
> the higher level aren't set the same.  This is true of the other
> shortcuts '*' and '?' as well.
>
> I then tried to figure out what the consequences of those differences
> are.  Two of the flags WORST and SPSTART do not appear to ever be looked
> at.  Should we remove them, or dig to find out how they used to be used,
> or might they come back again, and we should set them consistently?
>
> regpiece assumes that any quantifier whose upper limit is non-zero
> causes the construct to not match the null string, and sets HASWIDTH.
> That simply isn't true when quantifying a zero-width assertion.  I
> didn't look at what the optimizer does with that, but when I change that
> a higher level warning is emitted:
>
> "Quantifier unexpected on zero-length expression "
>
> Now to the optimizations:  I believe the quantifier {1,1} can simply be
> optimized out.  There are occurrences in our test suite of this; I
> believe from Abigail.  And I can see machine generated or interpolated
> code ending up with this.  So we don't need to create a loop that gets
> executed precisely once.  But there is {1,1}+, that has to be
> considered; and that's easy to do.
>
> Generally, in {m,m}? the ? is a no-op and can be omitted.

I think that research like this is important -- but we need to be
mindful of where we are in our annual development cycle.  Your findings
and recommendations are likely to be at such a deep level inside the
codebase that their implications will take time to learn.
So we shouldn't be expecting to implement such recommendations in perl-5.32.0.

Thank you very much.
Jim Keenan