
[perl #122330] Pathological performance of a pattern match

From: perlbug-followup
Date: July 19, 2014 21:30
Subject: [perl #122330] Pathological performance of a pattern match
Message ID: rt-4.0.18-26985-1405804728-858.122330-75-0@perl.org
# New Ticket Created by  (Andreas J. Koenig) 
# Please include the string:  [perl #122330]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/Ticket/Display.html?id=122330 >


(Second try to get it into perlbug; first one was several hours ago)

I think this bug report deserves to be a quiz. Test your intuition.

Sample string to test the regexps against:

  sprintf "%scould%snot%sopen%s", ("x"x10000)x4;

Regexp 1:

  qr!.*(?:could ?not (?:open|connect|find))!

Regexp 2 (the only difference is that it is case-insensitive):

  qr!.*(?i:could ?not (?:open|connect|find))!

Note: neither matches, and this is correct: the sample string contains
no spaces, while the pattern requires one after "not".

Question: which is faster and by how much?

Answer: regexp 1 is well over 100 times slower than regexp 2.

On my machine this program takes 1-2 wallclock seconds ($p holds the
path to the perl being tested):

  time $p -le 'my $x = sprintf "%scould%snot%sopen%s", ("x"x10000)x4;print $x =~ m!.*(?:could ?not (?:open|connect|find))! ? "not ok" : "ok";' 

When I add the "i", it takes less than 0.01 wallclock seconds on all
the perls I tried.
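
For readers who want to time both regexps from one script rather than
from the shell, here is a minimal sketch. It assumes Time::HiRes is
available (it ships with modern perls); the helper names sample_string
and time_match are hypothetical and not part of the report. The timings
it prints will depend on the perl version and machine.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Build the sample string from the report; $n is the length of each "x" run.
sub sample_string {
    my ($n) = @_;
    return sprintf "%scould%snot%sopen%s", ("x" x $n) x 4;
}

# Run one match attempt; returns (1 or 0 for match, elapsed seconds).
sub time_match {
    my ($re, $str) = @_;
    my $t0      = Time::HiRes::time();
    my $matched = $str =~ $re ? 1 : 0;
    return ($matched, Time::HiRes::time() - $t0);
}

# The two patterns from the report, unchanged.
my %regexps = (
    'case-sensitive'   => qr!.*(?:could ?not (?:open|connect|find))!,
    'case-insensitive' => qr!.*(?i:could ?not (?:open|connect|find))!,
);

my $str = sample_string(10_000);
for my $name (sort keys %regexps) {
    my ($matched, $elapsed) = time_match($regexps{$name}, $str);
    printf "%-16s matched=%d elapsed=%.4fs\n", $name, $matched, $elapsed;
}
```

Both lines should report matched=0; only the elapsed times are of
interest.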

Historical evidence: this is not a regression, at least not since 5.6. I
have witnessed the same slowness on all my perls between 5.6.1 and
bleadperl.

Competition evidence: with 'use re::engine::PCRE;' both regexps are
fast.
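
A sketch of how the PCRE comparison might be reproduced, assuming the
CPAN module re::engine::PCRE is installed. Because the module may be
missing, the pragma is loaded inside a string eval here; that wrapper
is an assumption of this sketch, not something the report does. The
pragma is lexically scoped, so the match is written inside the same
eval.

```perl
use strict;
use warnings;
use Time::HiRes ();

my $str = sprintf "%scould%snot%sopen%s", ("x" x 10_000) x 4;

# Load re::engine::PCRE at eval time so that a missing module is
# reported instead of aborting the script. Returns [matched, seconds]
# on success, or undef (with $@ set) if the module cannot be loaded.
my $result = eval q{
    use re::engine::PCRE;
    my $t0      = Time::HiRes::time();
    my $matched = $str =~ m!.*(?:could ?not (?:open|connect|find))! ? 1 : 0;
    [ $matched, Time::HiRes::time() - $t0 ];
};

if ($result) {
    printf "PCRE: matched=%d elapsed=%.4fs\n", @$result;
}
else {
    print "re::engine::PCRE not usable here, skipped: $@";
}
```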

Enjoy,
-- 
andreas

Summary of my perl5 (revision 5 version 21 subversion 2) configuration:
  Commit id: 3db23aeceed93497d1456928f0d3561a32f74a02
  Platform:
    osname=linux, osvers=3.14-1-amd64, archname=x86_64-linux-ld
    uname='linux k83 3.14-1-amd64 #1 smp debian 3.14.5-1 (2014-06-05) x86_64 gnulinux '
    config_args='-Dprefix=/home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1-177-g3db23ae/127e -Dmyhostname=k83 -Dinstallusrbinperl=n -Uversiononly -Dusedevel -des -Ui_db -Uuseithreads -Duselongdouble -DDEBUGGING=-g'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2',
    optimize='-O2 -g',
    cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.8.3', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='long double', nvsize=16, Off_t='off_t', lseeksize=8
    alignbytes=16, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
    libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'


Characteristics of this binary (from libperl): 
  Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV
                        PERL_HASH_FUNC_ONE_AT_A_TIME_HARD PERL_MALLOC_WRAP
                        PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV
                        PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT
                        USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE
                        USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME
                        USE_LONG_DOUBLE USE_PERLIO USE_PERL_ATOF
  Built under linux
  Compiled at Jul 18 2014 17:06:15
  @INC:
    /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1-177-g3db23ae/127e/lib/site_perl/5.21.2/x86_64-linux-ld
    /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1-177-g3db23ae/127e/lib/site_perl/5.21.2
    /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1-177-g3db23ae/127e/lib/5.21.2/x86_64-linux-ld
    /home/sand/src/perl/repoperls/installed-perls/perl/v5.21.1-177-g3db23ae/127e/lib/5.21.2
    .

