[PATCH bleadperl] [ID 20010426.002] Word boundry regex [...]
From: Hugo
Date: April 28, 2001 12:36
Subject: [PATCH bleadperl] [ID 20010426.002] Word boundry regex [...]
Message ID: 200104281323.OAA06602@crypt.compulink.co.uk
In <200104261733.f3QHX7K10625@mail.ocentrix.net>, Ben writes:
[...]
:#!/usr/bin/perl
:
:use strict;
:
:my $text = "Charles Bronson";
:
:$text =~ s/\B\w//g;
:
:print "here it is: $text\n\n";
This correctly returned "C B" on 5.005_03, but "Cals Bosn" on later
perls. This appears to have been caused by a 1999 patch from Ilya,
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/1999-11/msg00286.html,
which changed the pickup of the previous character from:
    tmp = (s != startpos) ? UCHARAT(s - 1) : PL_regprev;
to:
    tmp = (s != startpos) ? UCHARAT(s - 1) : '\n';
PL_regprev is still supported, so I don't understand why this change
was made. Reversing it as in the attached patch fixes the problem, and
passes all tests here, but I'm slightly concerned that I (and the test
suite) may be missing the original reason for this change - Ilya, have
you any memory of this?
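To make the mechanism concrete, here is a toy model in C of how s/\B\w//g behaves under the two choices. It is only a sketch, not the regexec.c code: is_word() and strip_inner_word_chars() are hypothetical names, and it assumes that each s///g iteration restarts the scan with startpos just past the previous match, which is consistent with the output reported above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Rough stand-in for Perl's \w: alphanumeric or underscore. */
static int is_word(int ch)
{
    return ch == '_' || isalnum((unsigned char)ch);
}

/*
 * Toy model of s/\B\w//g (hypothetical helper, not Perl's engine).
 * Each iteration restarts the search at startpos, just past the
 * previous match. use_regprev selects what the scan assumes sits
 * immediately before startpos:
 *   0: always '\n' (the behaviour after the 1999 change)
 *   1: the real preceding character (the role PL_regprev plays)
 * out must have room for strlen(in) + 1 bytes.
 */
static void strip_inner_word_chars(const char *in, char *out, int use_regprev)
{
    size_t len, startpos = 0, o = 0;

    assert(in != NULL && out != NULL);
    len = strlen(in);

    while (startpos < len) {
        int regprev = (use_regprev && startpos > 0) ? in[startpos - 1] : '\n';
        size_t s;
        int matched = 0;

        for (s = startpos; s < len; s++) {
            int prev = (s != startpos) ? in[s - 1] : regprev;
            /* \B\w matches when both neighbours are word characters. */
            if (is_word(prev) && is_word(in[s])) {
                matched = 1;
                break;
            }
        }
        /* Keep everything before the match; the matched char is dropped. */
        memcpy(out + o, in + startpos, s - startpos);
        o += s - startpos;
        if (!matched)
            break;
        startpos = s + 1;
    }
    out[o] = '\0';
}
```

In this model, "Charles Bronson" with use_regprev set to 0 comes out as "Cals Bosn": every character that immediately follows a deletion is treated as if it followed a newline, so it looks like the start of a word and survives. With use_regprev set to 1 the result is "C B".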
I've added only one extra test, because I couldn't find a way to trigger
the problem in the BOUND case.
I'll send an analogous patch for 5.6.1 in a few days if this one is ok.
Hugo
--- regexec.c.old Wed Apr 11 14:14:26 2001
+++ regexec.c Sat Apr 28 14:07:14 2001
@@ -947,7 +947,7 @@
case BOUND:
if (do_utf8) {
if (s == startpos)
- tmp = '\n';
+ tmp = PL_regprev;
else {
U8 *r = reghop3((U8*)s, -1, (U8*)startpos);
@@ -969,7 +969,7 @@
}
}
else {
- tmp = (s != startpos) ? UCHARAT(s - 1) : '\n';
+ tmp = (s != startpos) ? UCHARAT(s - 1) : PL_regprev;
tmp = ((OP(c) == BOUND ? isALNUM(tmp) : isALNUM_LC(tmp)) != 0);
while (s < strend) {
if (tmp ==
@@ -990,7 +990,7 @@
case NBOUND:
if (do_utf8) {
if (s == startpos)
- tmp = '\n';
+ tmp = PL_regprev;
else {
U8 *r = reghop3((U8*)s, -1, (U8*)startpos);
@@ -1010,7 +1010,7 @@
}
}
else {
- tmp = (s != startpos) ? UCHARAT(s - 1) : '\n';
+ tmp = (s != startpos) ? UCHARAT(s - 1) : PL_regprev;
tmp = ((OP(c) == NBOUND ?
isALNUM(tmp) : isALNUM_LC(tmp)) != 0);
while (s < strend) {
--- t/op/subst.t.old Tue Aug 29 13:54:13 2000
+++ t/op/subst.t Sat Apr 28 13:57:22 2001
@@ -6,7 +6,7 @@
require Config; import Config;
}
-print "1..84\n";
+print "1..85\n";
$x = 'foo';
$_ = "x";
@@ -378,4 +378,8 @@
$_ = "C:/";
s/^([a-z]:)/\u$1/ and print "not ";
print "ok 84\n";
+
+$_ = "Charles Bronson";
+s/\B\w//g;
+print $_ eq "C B" ? "ok 85\n" : "not ok 85\n# \$_ eq '$_'\n";