perl.perl5.porters | Postings from January 2001

[ID 20010108.002] regular expression bug: /\byz\s*d/ fails to match if "yz" appears more than once in string

From: pserris
Date: January 8, 2001 01:42
Subject: [ID 20010108.002] regular expression bug: /\byz\s*d/ fails to match if "yz" appears more than once in string
Message ID: 200101080941.BAA21914@polo.transmeta.com

This is a bug report for perl from pserris@polo.transmeta.com,
generated with the help of perlbug 1.28 running under perl v5.6.0.


-----------------------------------------------------------------
[Please enter your report here]

The following code prints "ok" in perl 5.005_03 but prints "bad" in
5.6.0:

    $_ = "xyz yz d"; print /\byz\s*d/ ? "ok" : "bad"

Here is the debugging output when running "perl -Dr":

Compiling REx `\byz\s*d'
size 8 first at 1
rarest char d at 0
rarest char z at 1
   1: BOUND(2)
   2: EXACT <yz>(4)
   4: STAR(6)
   5:   SPACE(0)
   6: EXACT <d>(8)
   8: END(0)
anchored `yz' at 0 floating `d' at 2..2147483647 (checking anchored) stclass `BOUND' minlen 3
Omitting $` $& $' support.

EXECUTING...

Guessing start of match, REx `\byz\s*d' against `xyz yz d'...
Found anchored substr `yz' at offset 1...
Found floating substr `d' at offset 7...
This position contradicts STCLASS...
Trying anchored substr starting at offset 2...
Found anchored substr `yz' at offset 4...
Contradicts floating substr `d', giving up...
Match rejected by optimizer
bad
Freeing REx: `\byz\s*d'


It seems to fail only when the substring "yz" appears more than once in
$_.  If $_ is something like "xyy yz d", then it matches correctly.
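
To make the difference between the two strings concrete, here is a small
sketch that lists every offset at which "yz" occurs and whether a word
boundary precedes it. The helper name yz_offsets is only illustrative and
is not part of perl; the boundary test is a hand-written approximation of
\b for this particular case (a word character following a non-word
character or the start of the string).

```perl
use strict;
use warnings;

# Hypothetical helper: return [offset, at_boundary] for each "yz" in $str.
sub yz_offsets {
    my ($str) = @_;
    my @found;
    my $pos = 0;
    while ((my $i = index($str, "yz", $pos)) >= 0) {
        # \b holds before "y" when the previous character is not a word
        # character, or when "yz" starts the string.
        my $at_boundary =
            ($i == 0 || substr($str, $i - 1, 1) !~ /\w/) ? 1 : 0;
        push @found, [$i, $at_boundary];
        $pos = $i + 1;
    }
    return @found;
}

for my $str ("xyz yz d", "xyy yz d") {
    for my $hit (yz_offsets($str)) {
        printf "%-10s offset %d boundary=%d\n", "'$str'", @$hit;
    }
}
```

In "xyz yz d" the first "yz" (offset 1) is not at a word boundary and the
second (offset 4) is, which lines up with the two offsets the optimizer
visits in the -Dr trace. In "xyy yz d" there is only the one candidate at
offset 4.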

The following regular expressions should also match the original
string but do not:

    /\syz\s*d/
    /\syz \s*d/

However, the following regex seems to work fine:

    /\by[z]\s*d/

Replacing the "\s*" with a plain space also makes the match succeed.

This bug also shows up in 5.7.0 built with a configuration similar to
the one below, and in ActiveState Perl 5.6.0.
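
For convenience, the cases mentioned in this report can be collected into
one script. This is only a sketch: the run_cases helper and the case
descriptions are mine, and the patterns are precompiled with qr//, which
might not exercise the optimizer in exactly the same way as the inline
match in the one-liner above. Every case is expected to match on a perl
that does not have the problem.

```perl
use strict;
use warnings;

# Each entry: [string, pattern, description], all taken from the report.
my @cases = (
    ["xyz yz d", qr/\byz\s*d/,   'original failing case'],
    ["xyz yz d", qr/\syz\s*d/,   '\s in place of \b'],
    ["xyz yz d", qr/\syz \s*d/,  '\s plus a literal space'],
    ["xyz yz d", qr/\by[z]\s*d/, 'character class variant'],
    ["xyz yz d", qr/\byz d/,     'plain space in place of \s*'],
    ["xyy yz d", qr/\byz\s*d/,   'only one "yz" in the string'],
);

# Illustrative helper: print ok/bad per case, return the failure count.
sub run_cases {
    my $failures = 0;
    for my $case (@_) {
        my ($str, $re, $desc) = @$case;
        my $ok = $str =~ $re;
        printf "%-4s %s\n", ($ok ? "ok" : "bad"), $desc;
        $failures++ unless $ok;
    }
    return $failures;
}

my $failures = run_cases(@cases);
print "$failures case(s) failed\n";
```

Going by the results described above, I would expect the first three
cases to print "bad" on an affected 5.6.0 build and the rest to print
"ok".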

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=critical
---
Site configuration information for perl v5.6.0:

Configured by pserris at Wed Dec 27 14:22:35 PST 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.2.15, archname=i686-linux
    uname='linux polo 2.2.15 #1 smp fri jun 2 10:14:56 pdt 2000 i686 unknown '
    config_args='-der -DDEBUGGING'
    hint=previous, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define 
    use64bitint=undef use64bitall=undef uselongdouble=undef usesocks=undef
  Compiler:
    cc='cc', optimize='-O2', gccversion=egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
    cppflags='-fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    ccflags ='undef'
    stdchar='char', d_stdstdio=define, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldl -lm -lc -lposix -lcrypt
    libc=/lib/libc-2.1.2.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    

---
@INC for perl v5.6.0:
    /home/pserris/perl-5.6/lib/5.6.0/i686-linux
    /home/pserris/perl-5.6/lib/5.6.0
    /home/pserris/perl-5.6/lib/site_perl/5.6.0/i686-linux
    /home/pserris/perl-5.6/lib/site_perl/5.6.0
    /home/pserris/perl-5.6/lib/site_perl
    .

---
Environment for perl v5.6.0:
    HOME=/home/pserris
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/usr/openwin/lib:/usr/lib:/lib:/usr/local/lib:/usr/local/X11/lib:/usr/local/contrib/kde/lib:/usr/local/contrib/qt/lib
    LOGDIR (unset)
    PATH=.:/home/pserris/bin/linux:/home/pserris/bin:/usr/local/contrib/bin:/usr/local/bin:/usr/local/sbin:/proj/sw/i386-linux/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/X11R6/bin:/usr/local/tex/bin:/usr/local/procmail/bin:/usr/local/contrib/nmh/bin:/project/hw/bin:/project/hw/bin/i386-linux-libc6:/home/pserris/bin
    PERL_BADLANG (unset)
    SHELL=/bin/csh

