
[perl #24936] severe regexp performance problem with perl 5.8.*

January 18, 2004 17:18
# New Ticket Created by 
# Please include the string:  [perl #24936]
# in the subject line of all future correspondence about this issue. 
# <URL: >

This is a bug report for perl from,
generated with the help of perlbug 1.34 running under perl v5.8.3.

[Please enter your report here]

A severe regexp performance problem seems to exist in perl 5.8.*
Platforms this was reproduced on:
- FBSD-i386 4.8R, perl 5.8.0 on Pentium-M/1.4GHz under VMware Workstation 4.0.5
  stock install of Perl under this distribution of FBSD.
- BSD/OS 4.1 (BSDI), perl 5.8.2 and 5.8.3 on Celeron/533MHz and PIII/550MHz
  default install of perl 5.8.2 and 5.8.3 per INSTALL file.

While upgrading from perl 5.005p3 to 5.8.2, some existing applications
seemed to take a severe performance hit in their central loops
containing a number of regexps.

Upon closer examination, it was determined that certain regexps using
".*" constructs seem to execute more than 100 times slower than in
perl 5.005p3, resulting in multiple cascading failures in these applications.


$line = 'Jan 16 15:56:37 sonet sendmail[9368]: i0CGl7a1015852: to=<>,<>, ctladdr=<> (100/101), delay=4+04:09:29, xdelay=00:00:00, mailer=esmtp, pri=17436067,, dsn=4.0.0,stat=Deferred: Network is unreachable';

Regexps that execute at a ridiculously slow pace:
A) $line =~ /(?i).*(dable|z).*$/ ; # needs 33 ms to execute! 29 loop runs/s
   (note how the alternate strings 'dable' and 'z' do not occur in $line)
B) $line =~ /(?i).*?(dable|z).*?$/ ; # needs 33 ms to execute! 29 loop runs/s
C) $line =~ /(?i).*?(?:dable|z).*?$/ ; # needs 33 ms to execute! 29 loop runs/s

compare to:
D) $line =~ /(?i).*(able|z).*$/ ; # needs 0.07 ms to execute, 2700 loop runs/sec.
E) $line =~ /(?i).*(dable|z)$/  ; # needs 0.2 ms to execute, 1500 loop runs/sec.
F) $line =~ /(?i)(dable|z).*$/  ; # needs 0.2 ms to execute, 1500 loop runs/sec.
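
The figures above can be re-measured with a small harness along the following
lines. This is only a sketch, not the script used for this report: the helper
name ms_per_match and the iteration count of 200 are assumptions, and $line is
the copy shown above, so absolute numbers will differ by machine and perl
version. The harness also reports whether each pattern matches. In this $line
the string 'able' occurs in 'unreachable', so D) is the only one of the six
patterns that matches, which is worth bearing in mind when comparing its timing
with the others.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);   # sub-second wall-clock timer

my $line = 'Jan 16 15:56:37 sonet sendmail[9368]: i0CGl7a1015852: to=<>,<>, ctladdr=<> (100/101), delay=4+04:09:29, xdelay=00:00:00, mailer=esmtp, pri=17436067,, dsn=4.0.0,stat=Deferred: Network is unreachable';

# The six patterns from the report, kept as source strings for printing.
my @cases = (
    [ A => '(?i).*(dable|z).*$'      ],
    [ B => '(?i).*?(dable|z).*?$'    ],
    [ C => '(?i).*?(?:dable|z).*?$'  ],
    [ D => '(?i).*(able|z).*$'       ],
    [ E => '(?i).*(dable|z)$'        ],
    [ F => '(?i)(dable|z).*$'        ],
);

# Hypothetical helper: average wall-clock milliseconds per match attempt.
sub ms_per_match {
    my ($re, $str, $iters) = @_;
    my $start = time;
    for (1 .. $iters) {
        my $m = $str =~ $re;
    }
    return (time - $start) * 1000 / $iters;
}

my %matched;
for my $case (@cases) {
    my ($label, $src) = @$case;
    my $re = qr/$src/;
    $matched{$label} = ($line =~ $re) ? 1 : 0;
    printf "%s) %-26s match=%d  %.4f ms/match\n",
        $label, $src, $matched{$label}, ms_per_match($re, $line, 200);
}
```

The patterns are compiled once with qr// so that the loop measures matching
rather than compilation.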

While a leading .* is redundant at worst, the behaviour gets outright bizarre:
the performance of A) through C) depends strongly on the length of $line
and on whether the ()-enclosed alternate strings occur in it or not.
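
The dependence on the length of the input can be checked by timing pattern A)
against non-matching strings of increasing size. The following is a minimal
sketch under stated assumptions: the filler text, the lengths and the
iteration count are made up for illustration and are not taken from the real
logs, and ms_per_match is a hypothetical helper. No expected timings are given
because they vary with perl version and hardware. If doubling the length
roughly quadruples the time per match, the cost is growing quadratically
rather than linearly.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);   # sub-second wall-clock timer

# Pattern A) from the report.
my $re = qr/(?i).*(dable|z).*$/;

# Hypothetical helper: average wall-clock milliseconds per match attempt.
sub ms_per_match {
    my ($pattern, $str, $iters) = @_;
    my $start = time;
    for (1 .. $iters) {
        my $m = $str =~ $pattern;
    }
    return (time - $start) * 1000 / $iters;
}

# Synthetic filler (an assumption): contains neither 'dable' nor 'z',
# so the pattern never matches, as with the $line in the report.
my $filler = 'sendmail: stat=Deferred ' x 100;

my %ms;
for my $len (100, 200, 400, 800) {
    my $str = substr($filler, 0, $len);
    $ms{$len} = ms_per_match($re, $str, 20);
    printf "length %4d: %.4f ms/match\n", $len, $ms{$len};
}
```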

Also noteworthy: why does D) perform so much better than E) and F)?

We can safely assume that LOTS of less-than-perfectly designed regexps
like the ones above exist in the field.

Thanks for your consideration,

Kai Schlichting

[Please do not change anything below this line]
Site configuration information for perl v5.8.3:

Configured by kai at Thu Jan 15 23:51:39 EST 2004.

Summary of my perl5 (revision 5.0 version 8 subversion 3) configuration:
    osname=bsdos, osvers=4.1, archname=i386-bsdos
    uname='bsdos 4.1 bsdi bsdos 4.1 kernel #8: fri sep 5 11:44:24 edt 2003 i386 '
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
    cc='cc', ccflags ='-fno-strict-aliasing -I/usr/local/include',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='egcs-2.91.66 19990314 (egcs-1.1.2 release)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='ld', ldflags =' -L/usr/X11/lib -L/usr/local/lib'
    libpth=/usr/local/lib /usr/shlib /shlib /lib /usr/lib /usr/X11/lib
    libs=-lutil -lbind -ldl -lm -lc
    perllibs=-lutil -lbind -ldl -lm -lc
    libc=/shlib/, so=so, useshrplib=true,
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='  -Wl,-rpath,/usr/local/lib/perl5/5.8.3/i386-bsdos/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -x  -L/usr/X11/lib -L/usr/local/lib'

Locally applied patches:

@INC for perl v5.8.3:

Environment for perl v5.8.3:
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PERL_BADLANG (unset)
