Re: [perl #123918] regex end of line match very slow

From: Karl Williamson
Date: February 25, 2015 20:08
Subject: Re: [perl #123918] regex end of line match very slow
Message ID: 54EE2B97.3030009@khwilliamson.com

On 02/25/2015 12:55 PM, Matthew Horsfall (alh) wrote:
> On Wed, Feb 25, 2015 at 11:32 AM, Paul Salazar
> <paul.salazar@testspectrum.com> wrote:
>>> So pre-COW:
>>>
>>>       the regex engine *always* copied after a successful match in the
>>>       presence of $& etc, going quadratic always, regardless of whether
>>>       the string is subsequently modified or not; (it actually didn't copy
>>>       in the just presence of captures, and just returned garbage in $1
>>>       etc if the string was modified; this was a tradeoff of performance
>>>       over correctness).
>>>
>>>
>> All makes sense except I'm not seeing the pre-COW going quadratic. Same
>> example ran on 5.8
>
> So, this was fine in 5.17.7 up until:
>
> 1a904fc88069e249a4bd0ef196a3f1a7f549e0fe is the first bad commit
> commit 1a904fc88069e249a4bd0ef196a3f1a7f549e0fe
> Author: Father Chrysostomos <sprout@cpan.org>
> Date:   Sun Nov 25 12:57:04 2012 -0800
>
>      Disable PL_sawampersand
>
>      PL_sawampersand actually causes bugs (e.g., perl #4289), because the
>      behaviour changes.  eval '$&' after a match will produce different
>      results depending on whether $& was seen before the match.
>
>      Using copy-on-write for the pre-match copy (preceding patches do that)
>      alleviates the slowdown caused by mentioning $&.  The copy doesn’t
>      happen unless the string is modified after the match.  It’s now a
>      post-match copy.  So we no longer need to do things differently
>      depending on whether $& has been seen.
>
>      PL_sawampersand is now #defined to be equal to what it would be if
>      every program began with $',$&,$`.
>
>      I left the PL_sawampersand code in place, in case this commit proves
>      immature.  Running Configure with -Accflags=PERL_SAWAMPERSAND will
>      reënable the PL_sawampersand mechanism.
>
> Perhaps something else is going on here?
>
> -- Matthew Horsfall (alh)
>

As I vaguely recall, there is code in regexec that used to be able to
quit early when PL_sawampersand was not set, but now has to run to
completion.
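
For reference, a minimal sketch of the user-visible behaviour the quoted commit message is about: $& must keep the text of the last match even if the target string is modified afterwards, which is why some copy (eager or copy-on-write) of the string is needed. The variable names here are only illustrative and do not come from the ticket.

```perl
use strict;
use warnings;

# Illustrative only: $str and $matched are made-up names.
my $str     = "hello world";
my $matched = '';

if ($str =~ /wor/) {
    # Modify the target string after the successful match...
    $str = "something else entirely";

    # ...and $& still yields the text matched in the original string,
    # because perl kept a copy (or a COW reference) of it.
    $matched = $&;
}

print "matched: $matched\n";   # prints "matched: wor"
```

This shows only the semantics that have to be preserved, not the cost; the slowdown discussed in this thread depends on how often, and how much of, the string gets copied.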
