On 17 October 2016 at 18:37, demerphq <demerphq@gmail.com> wrote:
> On 17 October 2016 at 04:50, Zefram <zefram@fysh.org> wrote:
>> Jorma Laaksonen wrote:
>>>Any hint if I'm doing something wrong or not doing something I should
>>>do?
>>
>> No, that's all supported usage. You're quite right about the behaviour
>> being erroneous.
>
> Agreed.
>
> It seems to be a bug about unwinding .*? although it also interacts
> with TRIE code in ways I dont entirely understand. (Making the code
> not produce a TRIE fixes the bug, but on the other hand, so does
> removing the .*?)
>
> Nevertheless I can fix the bug (while possibly introducing new bugs)
> with the code in yves/fix_129897
> c09f087940c61f3b6e57e7cf5e5b7a4faa683420
>
> I would prefer that Dave have a look into this, as I dont entirely
> understand why my patch fixes things for this case, but that in most
> other cases it is not needed.
>
> The key point is that when we fail a .*? match we should unwind and
> reset any buffers we matched after our current point. But STAR and
> PLUS do not initialize the proper member fields so that we can do this
> unwinding properly.
>
> I have to admit that this bug is quite surprising. I would have
> thought that if we have a bug like this that we fail our regex tests
> completely, but apparently not.
>
> Of course, it may have to do with the fact that the form of this bug
> is incredibly horrible. Having an unanchored .* at the beginning of a
> pattern is a good way to make your regex quadratic on failure. (We may
> trigger an optimisation that automagically adds the anchor, and we may
> not....)
>
> So it may simply be that most times we dont trigger this bug, but I
> admit its not obvious to me why not.

Because my analysis was wrong...

Dave, forget it, nothing you need to poke into.

Put simply, the "short-circuit" logic in the TRIE code should not
trigger when there is a jump table.

I have a patch ready, but I am having issues talking to the master
repo right now.

Yves