develooper Front page | perl.perl5.porters | Postings from March 2007

Re: The performance problem of 30678

Nicholas Clark
March 23, 2007 14:06
On Fri, Mar 23, 2007 at 02:23:35PM +0100, demerphq wrote:

> The solution was to make a temporary copy of the regexp struct and a
> few of its fields and then use it each time. However this leads to a
> performance problem in code like
> my $qr=qr/(\d)\1/;
> /$qr/ and print for 1..100;
> Where we essentially make a copy, use it to match, throw it away a
> hundred times.

Performance is pretty grim on some platforms.

x86 Linux is fairly sane, although valgrind's malloc reacts more badly to
all the free()ing implied by PERL_DESTRUCT_LEVEL=2 than the regular glibc
malloc does:

30677 run normally

22.10user 0.27system 0:22.90elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+26074minor)pagefaults 0swaps

30677 run with PERL_DESTRUCT_LEVEL=2

22.03user 0.25system 0:22.89elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+26076minor)pagefaults 0swaps

30677 run under valgrind

1188.06user 4.12system 20:13.82elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (15major+29142minor)pagefaults 0swaps

30677 under valgrind with PERL_DESTRUCT_LEVEL=2

1188.40user 3.89system 20:13.49elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+29365minor)pagefaults 0swaps


22.68user 0.26system 0:23.48elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+26079minor)pagefaults 0swaps

30678 under valgrind

1423.25user 4.97system 24:14.31elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+28904minor)pagefaults 0swaps

x86 FreeBSD hurts:

30677 run normally

real    0m14.330s
user    0m14.002s
sys     0m0.108s

30677 and with PERL_DESTRUCT_LEVEL=2

       14.25 real        13.86 user         0.11 sys

30678 run normally

real    0m15.393s
user    0m14.854s
sys     0m0.094s

30678 and with PERL_DESTRUCT_LEVEL=2

      560.45 real       520.39 user         0.09 sys

and Sparc Solaris turns to super-cooled treacle:

30677 run normally

real    2m28.951s
user    2m28.359s
sys     0m0.540s

30677 and with PERL_DESTRUCT_LEVEL=2

real     2:29.1
user     2:28.5
sys         0.5

30678 run normally

real    2m38.266s
user    2m37.629s
sys     0m0.584s

30678 and with PERL_DESTRUCT_LEVEL=2

real  1:59:50.1
user  1:59:48.9
sys         0.8

[formatting differences are due to whether bash chose to use its built-in time
or /usr/bin/time, depending on how it parsed and ran my command lines]

Yes, with PERL_DESTRUCT_LEVEL=2 Solaris now takes 2 hours to run t/op/pat.t.

We seem to have hit pathological malloc behaviour. I'm not quite sure how,
or why, given that Linux copes.

But, digression, I do remember at my first job that they ran HP-UX. They'd
tried various architectures and HP-UX was best for the sort of code that
they ran. Then we had to get the code working on an NEC SX-4. [Grrr
"Super-UX". You keep using that word [super]. I do not think that it means
what you think that it means]
Anyway, it ran over 2**32 bytes of memory rather quickly, which upset parts
of their code. I tried it on the work Solaris box - it was also unhappy.
I tried it with Doug Lea's malloc - it was not unhappy.
So it's something to do with malloc. It turned out that the code was calling
realloc() to n, then realloc() to n+1, etc, etc.
It happened to be using the buffer for a grid of triangles, and I worked out
that it would asymptotically approach 2m items, for an infinite grid.
When I re-coded it to call malloc() once for that size, it was much happier.
So I think that the reason they used HP-UX at all was because HP's malloc()
had the best performance with poorly written code. I wonder how much hardware
business that pessimisation from HP won them.

So, anyway, time for someone to identify why we're hurting malloc. I won't
have time until 5.8.9 is out.

Nicholas Clark
