perl.perl5.porters | Postings from April 2001

Re: [PATCH bleadperl] [ID 20010426.002] Word boundry regex [...]

From: Hugo
Date: April 28, 2001 13:39
Subject: Re: [PATCH bleadperl] [ID 20010426.002] Word boundry regex [...]
Message ID: 200104281941.UAA06955@crypt.compulink.co.uk
In <20010428153412.A21353@math.ohio-state.edu>, Ilya Zakharevich writes:
:On Sat, Apr 28, 2001 at 02:23:28PM +0100, Hugo wrote:
:> :my $text = "Charles Bronson";
:> :
:> :$text =~ s/\B\w//g;
:> :
:> :print "here it is: $text\n\n";
:> 
:> This correctly returned "C B" on 5.005_03, but "Cals Bosn" on later
:> perls. This appears to have been caused by a 1999 patch from Ilya,
:> http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/1999-11/msg00286.html,
:> which changed the pickup of the previous character from:
:>            tmp = (s != startpos) ? UCHARAT(s - 1) : PL_regprev;
:> to:
:>            tmp = (s != startpos) ? UCHARAT(s - 1) : '\n';
:> 
:> PL_regprev is still supported, so I don't understand why this change
:> was made. Reversing it as in the attached patch fixes the problem, and
:> passes all tests here, but I'm slightly concerned that I (and the test
:> suite) may be missing the original reason for this change - Ilya, have
:> you any memory of this?
:
:The only suggestion I have is that there might have been a case when
:this chunk is entered *before* PL_regprev is set.  Can this happen?

Not that I can see: regexec_flags() sets it whenever startpos == strbeg,
before any calls to find_byclass either indirectly (through intuit_start)
or directly.
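
For illustration, the effect of the one-line change quoted above can be
modelled outside perl. What follows is only a toy sketch, not the code in
regexec.c: the names is_word(), strip_inner_word_chars() and use_regprev are
hypothetical, \w is approximated as ASCII alphanumerics plus underscore, and
the model assumes that the saved previous character is simply whatever
precedes the point where each pass of the s///g loop starts. It shows only how
the choice of "previous character" at startpos changes what \B\w matches.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: ASCII-only approximation of \w. */
static int is_word(int c) { return isalnum(c) || c == '_'; }

/*
 * Toy model of s/\B\w//g: copy `in` to `out`, dropping every character
 * that the model considers a match for \B\w.
 *
 * startpos is where the current pass of the /g loop began, i.e. just
 * after the previous match. At that position the scanner cannot use
 * s[-1] in this model and must get the previous character elsewhere:
 *   use_regprev != 0: from a saved value (stands in for PL_regprev)
 *   use_regprev == 0: always '\n' (stands in for the 1999 change)
 * `out` must have room for strlen(in) + 1 bytes.
 */
static void strip_inner_word_chars(const char *in, char *out, int use_regprev)
{
    const char *startpos = in;
    const char *s = in;
    int regprev = '\n';          /* nothing precedes the start of the string */
    size_t n = 0;

    while (*s) {
        int prev = (s != startpos) ? (unsigned char)s[-1]
                                   : (use_regprev ? regprev : '\n');
        int cur = (unsigned char)*s;

        if (is_word(prev) && is_word(cur)) {
            /* \B\w matches: delete this character, restart after it. */
            regprev = cur;
            s++;
            startpos = s;
        } else {
            out[n++] = *s++;     /* no match here: keep the character */
        }
    }
    out[n] = '\0';
}
```

Under this model, "Charles Bronson" comes out as "Cals Bosn" when use_regprev
is 0 and as "C B" when it is non-zero, which matches the two behaviours
reported in the bug: with '\n' as the assumed previous character, the position
right after each deletion always looks like a word boundary, so every second
character survives.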

Is it possible that lookbehind could get back to strbeg when PL_regprev
is not set, or not set correctly? These both seem to do the right thing:
  crypt% perl -wle '$_ = "abacad"; s/a(?<=\Ba)(.)/$1/g; print'
  abcd
  crypt% perl -wle '$_ = "abacad"; s/a(?<=\ba)(.)/$1/g; print'
  bacad
  crypt% 
.. with or without the patch, but I haven't traced through to see if they
DTRT internally.

:Aha, this is s///, not m//; there is also a case of
:substitution-in-place:
:
: old-perl -wle "$_= q(abcd); s/\b./#/g; print"
: ####
:
: new-perl -wle "$_= q(abcd); s/\b./#/g; print"
: #bcd

?? old-perl in this case seems to be 5.005_03, new-perl any later perl
with or without my patch.

:IMO, old-perl is buggy, new-perl is correct.

Agreed.

:  But maybe the fix is not related to the patch in question.

I guess not.

Hugo
