perl.perl5.porters | Postings from August 2010

[perl #77414] bug report

From: Dave U . Random
Date: August 25, 2010 00:33
Subject: [perl #77414] bug report
Message ID: rt-3.6.HEAD-5116-1282668134-1747.77414-75-0@perl.org
# New Ticket Created by  Dave U . Random 
# Please include the string:  [perl #77414]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=77414 >


Subject: Inconsistent re backtracking behaviour for regular expressions like /\s*\p{Dash}/, /\s*\p{Dash}{1}/, and /\s*-/ matching input like '- ' and missing match for the first variant.
Message-Id: <5.12.1_3760_1282664416@my-PC>
Reply-To: loomisk@trash-mail.com
To: perlbug@perl.org


This is a bug report for perl from loomisk@trash-mail.com,
generated with the help of perlbug 1.39 running under perl 5.12.1.

Hello everyone!

With the following code, only versions 2 and 3 match:

print '- ' =~ /\s*\p{Dash}/;    # Version 1, no match
print '- ' =~ /\s*\p{Dash}{1}/; # Version 2, match
print '- ' =~ /\s*-/;           # Version 3, match

Debugging the regex makes clear why:

Compiling REx "\s*\p{Dash}"
synthetic stclass "ANYOF[\11\12\14\15 ][{unicode_all}+utf8::Dash]".
Final program:
   1: STAR (3)
   2:   SPACE (0)
   3: ANYOF[{unicode}+utf8::Dash] (15)
  15: END (0)
stclass ANYOF[\11\12\14\15 ][{unicode_all}+utf8::Dash] minlen 1 
Matching REx "\s*\p{Dash}" against "- "
Matching stclass ANYOF[\11\12\14\15 ][{unicode_all}+utf8::Dash] against "- " (2 chars)
   1 <-> < >                 |  1:STAR(3)
                                  SPACE can match 1 times out of 2147483647...
   2 <- > <>                 |  3:  ANYOF[{unicode}+utf8::Dash](15)
                                    failed...
   1 <-> < >                 |  3:  ANYOF[{unicode}\-...+utf8::Dash](15)
                                    failed...
                                  failed...
Contradicts stclass... [regexec_flags]
Match failed
Freeing REx: "\s*\p{Dash}"
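
The report does not say how the trace above was produced. A minimal sketch, assuming it came from the debug mode of the standard re pragma, which writes its trace to STDERR:

```perl
use strict;
use warnings;

my $input = '- ';
my $matched;
{
    # Assumption: the trace in the report was generated with the
    # lexically scoped 're' pragma; the report itself does not say so.
    use re 'debug';
    $matched = $input =~ /\s*\p{Dash}/ ? 1 : 0;
}

# The compile and match trace goes to STDERR; only this line goes to STDOUT.
print "Version 1 matched: $matched\n";
```

Keeping the pragma inside a bare block limits the trace to the one pattern of interest.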

The \s* matches the second character of "- " (the space), after which \p{Dash} fails, and the regex engine does not back up past that space to try the leading "-". This pattern should nevertheless match this input.

Version 3 obviously matches because of some internal optimization (searching for a plain "-"). Version 2 should be exactly equivalent to version 1, yet it backtracks and matches correctly.
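
To compare the three variants on a given perl build, a small script along these lines may help. It is only a sketch: the variable names are illustrative, and the result for version 1 depends on the perl in use, so no expected output is given.

```perl
use strict;
use warnings;

my $input = '- ';

# The three patterns from the report, written as literal matches so that
# each one is compiled the same way as in the original one-liners.
my @results = (
    [ 'Version 1', ($input =~ /\s*\p{Dash}/    ? 'match' : 'no match') ],
    [ 'Version 2', ($input =~ /\s*\p{Dash}{1}/ ? 'match' : 'no match') ],
    [ 'Version 3', ($input =~ /\s*-/           ? 'match' : 'no match') ],
);

# Print one line per variant.
printf "%s: %s\n", @$_ for @results;
```

If all three patterns are handled consistently, every line reports the same result.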

Cheers, Andrew
---
Flags:
    category=core
    severity=medium
---
Site configuration information for perl 5.12.1:

Configured by SYSTEM at Fri May 14 00:24:46 2010.

Summary of my perl5 (revision 5 version 12 subversion 1) configuration:
   
  Platform:
    osname=MSWin32, osvers=5.00, archname=MSWin32-x86-multi-thread
    uname=''
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='C:/Perl/site/bin/gcc.exe', ccflags ='-DNDEBUG -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DUSE_SITECUSTOMIZE -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX -DHASATTRIBUTE -fno-strict-aliasing -mms-bitfields',
    optimize='-O2',
    cppflags='-DWIN32'
    ccversion='', gccversion='3.4.5 (mingw-vista special r3)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=8
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='C:\Perl\site\bin\g++.exe', ldflags ='-L"C:\Perl\lib\CORE"'
    libpth=\lib
    libs=-lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32 -lmsvcrt
    perllibs=-lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32 -lmsvcrt
    libc=msvcrt.lib, so=dll, useshrplib=true, libperl=perl512.lib
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-mdll -L"C:\Perl\lib\CORE"'

Locally applied patches:
    ACTIVEPERL_LOCAL_PATCHES_ENTRY
    d956618 Make Term::ReadLine::findConsole fall back to STDIN if /dev/tty can't be opened
    321e50c Escape patch strings before embedding them in patchlevel.h

---
@INC for perl 5.12.1:
    C:/Perl/site/lib
    C:/Perl/lib
    .

---
Environment for perl 5.12.1:
    HOME (unset)
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=C:\Python31\Lib\site-packages\PyQt4\bin;c:\Program Files\NVIDIA Corporation\PhysX\Common;C:\Python31\;;C:\Perl\site\bin;C:\Perl\bin;C:\Program Files\ActiveState Komodo Edit 5\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;C:\Program Files\Common Files\Acronis\SnapAPI\;C:\Perl\bin;C:\Perl\site\bin;C:\Program Files\QuickTime\QTSystem\;C:\Tcl\bin;C:\wxruby\ruby\bin;C:\wxruby\bin;C:\Program Files\Common Files\Shoes\0.r1134\..;C:\Python31\Lib\site-packages\PyQt4\bin;c:\Program Files\NVIDIA Corporation\PhysX\Common;C:\Python31\;;C:\Perl\site\bin;C:\Perl\bin;C:\Program Files\ActiveState Komodo Edit 5\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;C:\Program Files\Common Files\Acronis\SnapAPI\;C:\Perl\bin;C:\Perl\site\bin;C:\Program Files\QuickTime\QTSystem\;C:\Program Files\IDM Computer Solutions\UltraEdit\
    PERL_BADLANG (unset)
    SHELL (unset)



