develooper Front page | perl.perl5.porters | Postings from August 2014

[perl #122564] Slow regexp match with balanced parens

Thread Next
Ed Avis
August 19, 2014 16:49
[perl #122564] Slow regexp match with balanced parens
Message ID:
# New Ticket Created by  "Ed Avis" 
# Please include the string:  [perl #122564]
# in the subject line of all future correspondence about this issue. 
# <URL: >

This is a bug report for perl from,
generated with the help of perlbug 1.39 running under perl 5.18.2.

[Please describe your issue here]

I understand that certain regexp constructs are no longer 'regular' and can take exponential time.  Nonetheless I was surprised by the slowness of this program:

$_ = 'zx<x<x<zx<x<x<x<x<x<x<x<x<x,yx,yx>,yx<x<x,yx>,yx>,yx>,x,x>,yx>x<x<x<x<x,yx>,yx>>,yx>x<x<x<x>>x<x<x,yx>,yx>>x<x<x,yx,yx>,yx<x<x,yx>,yx>,yx>,x,x>,yx>>x<';
alarm 3;

It is based on the example code in perlre for matching balanced
parentheses - here <> are used instead of () to make the regexp more
readable.  Here we are matching balanced parens which contain
comma-separated things, in other words an argument list.  The 'y'
character was originally whitespace.

Changing the inner 'y*' to just 'y' speeds up the match but I do not
understand why.  Could perl's regexp engine take advantage of this

I suggest that since the code fairly closely follows a recommended
recipe in the documentation, to perform a fairly common task, it is
worth making sure the regexp engine can handle it efficiently.
Alternatively, if there is a better way to write the regexp, the
example in the documentation might be changed to that style.  If it is
simply a typo in my regexp causing the search to explode then I

[Please do not change anything below this line]
Site configuration information for perl 5.18.2:

Configured by Red Hat, Inc. at Tue Jan  7 14:45:19 UTC 2014.

Summary of my perl5 (revision 5 version 18 subversion 2) configuration:
    osname=linux, osvers=3.11.9-200.fc19.x86_64, archname=x86_64-linux-thread-multi
    uname='linux 3.11.9-200.fc19.x86_64 #1 smp wed nov 20 21:22:24 utc 2013 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches  -m64 -mtune=generic -Dccdlflags=-Wl,--enable-new-dtags -Dlddlflags=-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches  -m64 -mtune=generic -Wl,-z,relro  -Dshrpdir=/usr/lib64 -DDEBUGGING=-g -Dversion=5.18.2 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 -Dprivlib=/usr/share/perl5 -Dvendorlib=/usr/share/perl5/vendor_perl -Darchlib=/usr/lib64/perl5 -Dvendorarch=/usr/lib64/perl5/vendor_perl -Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Dusedtrace=/usr/bin/dtrace -Duselargefiles -Dd_semctl_semun -Di_db -Ui
 _ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin -Dusesitecustomize'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.8.2 20131212 (Red Hat 4.8.2-7)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -fstack-protector'
    libpth=/usr/local/lib64 /lib64 /usr/lib64
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=, so=so, useshrplib=true,
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,--enable-new-dtags'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -Wl,-z,relro '

Locally applied patches:
    Fedora Patch1: Removes date check, Fedora/RHEL specific
    Fedora Patch3: support for libdir64
    Fedora Patch4: use libresolv instead of libbind
    Fedora Patch5: USE_MM_LD_RUN_PATH
    Fedora Patch6: Skip hostname tests, due to builders not being network capable
    Fedora Patch7: Dont run one io test due to random builder failures
    Fedora Patch9: Fix find2perl to translate ? glob properly (RT#113054)
    Fedora Patch10: Update h2ph(1) documentation (RT#117647)
    Fedora Patch11: Update pod2html(1) documentation (RT#117623)
    Fedora Patch12: Disable ornaments on perl5db AutoTrace tests (RT#118817)
    Fedora Patch14: Do not use system Term::ReadLine::Gnu in tests (RT#118821)
    Fedora Patch15: Define SONAME for
    Fedora Patch16: Install to -Dshrpdir value
    Fedora Patch18: Fix crash with \\&$glob_copy (RT#119051)
    Fedora Patch19: Fix coreamp.t rand test (RT#118237)
    Fedora Patch20: Reap child in case where exception has been thrown (RT#114722)
    Fedora Patch21: Fix using regular expressions containing multiple code blocks (RT#117917)
    Fedora Patch200: Link XS modules to with EU::CBuilder on Linux
    Fedora Patch201: Link XS modules to with EU::MM on Linux

@INC for perl 5.18.2:

Environment for perl 5.18.2:
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PERL_BADLANG (unset)

This email has been scanned by the Symantec Email service.
For more information please visit

Thread Next Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About