On 2005-07-28, at 08:51, Dominic Dunlop wrote:
> On 2005-07-27, at 20:31, John Narron wrote:
>> perl -e 'my $x = q[if ($h->{ALPHA}->{BETA}->{q{stuff}}) {] .
>> "\n" . q[
>> stuff($h, @_);] . "\n}\n\n"; $x x= 7238; $x =~ s/stuff/"stuff" .
>> ++$count/eg; eval $x'
>>
> Problem confirmed with the above script on Mac OS X. The tipping
> point for a debugging bleadperl@25218 with a stack limit of 8192k
> is $x x= 26197. For a 64-bit perl it's somewhat more than half
> that; for a production perl 5.8.6, it's much higher -- somewhere
> between 70 and 80,000. I'd put that difference down to the lack of
> debugging overhead and to optimisation, but I don't have an
> optimised bleadperl handy to check. The stack trace of the
> crashed perl 5.8.6 looks like
> ...
> Thread 0 Crashed:
> 0 perl 0x0002ffd0 Perl_newSV + 16 (crt.c:300)
> 1 perl 0x00081164 Perl_av_fetch + 688 (crt.c:300)
> 2 perl 0x000b0268 Perl_pad_alloc + 236 (crt.c:300)
> 3 perl 0x000179d4 Perl_peep + 496 (crt.c:300)
> 4 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> 5 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> ... [you get the idea]
> 501 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> 502 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> 503 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> 504 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> 505 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> 506 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> 507 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
> 508 perl 0x00017fe4 Perl_peep + 2048 (crt.c:300)
Attached is a comprehensive but simple-minded patch that simply stops
Perl_peep from recursing to a depth greater than 8192. When this
happens, you get a warning, and the code gets run unoptimized. A
better approach would be to find out why the optimizer thinks it
needs such a large peephole in this case (more a picture window,
really) and persuade it to be less ambitious. But I don't understand
the optimizer at all -- hence the crude solution.
Run embed.pl after applying the patch. Passes all tests for me both
unthreaded and threaded.
The figure of 8192 is arbitrary: it's comfortably higher than the
deepest recursion I measured during make test -- 878 levels.
(Measurement code not in patch.) OTOH it's higher than the value that
John Narron reported as causing his crash. Why is this? Because with
an unoptimized build of bleadperl for Darwin, I can only make John's
example crash rather than hitting the limit enforced by the patch if
I reduce stack size to 2MB. With an optimized perl, I must reduce
stack size to 1MB to get a crash. I think it's fair to expect that
perl will have at least 4MB of stack to play with on most platforms.
This leaves some headroom after the depth limit of 8192 is reached,
even on platforms which use more stack than PowerPC. (If you really
want to change the limit, pass -DPERL_PEEP_LIMIT=your_number when
compiling op.c, although this will make the warnings test fail,
because it's expecting to see '8192'.)
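For readers without the attachment, here is a rough sketch of the general
technique (a depth counter that makes a recursive walker give up, warn once,
and leave the rest untouched). It is not the patch itself: toy_op, toy_peep,
peep_depth and peep_warned are made-up names for illustration, and perl's
real op structures and warning machinery look different. Only
PERL_PEEP_LIMIT and its default of 8192 come from the description above.

```c
#include <assert.h>
#include <stdio.h>

/* Can be overridden at compile time, e.g. cc -DPERL_PEEP_LIMIT=4096 ... */
#ifndef PERL_PEEP_LIMIT
#define PERL_PEEP_LIMIT 8192
#endif

/* Hypothetical stand-in for an op node; perl's real OP struct differs. */
struct toy_op {
    struct toy_op *next;   /* following op, visited in a loop */
    struct toy_op *other;  /* branch target, visited by recursion */
    int visited;           /* set once this op has been "optimized" */
};

int peep_depth;   /* current recursion depth (illustrative global) */
int peep_warned;  /* set once the limit has been hit */

/* Walk a chain of ops, recursing into branches, but never deeper than
 * PERL_PEEP_LIMIT levels. Ops beyond the limit are simply left alone. */
void toy_peep(struct toy_op *o)
{
    assert(PERL_PEEP_LIMIT > 0);

    if (++peep_depth > PERL_PEEP_LIMIT) {
        if (!peep_warned) {
            fprintf(stderr,
                    "toy_peep: recursion deeper than %d, "
                    "leaving remaining ops unoptimized\n",
                    PERL_PEEP_LIMIT);
            peep_warned = 1;
        }
        --peep_depth;   /* undo the entry increment before bailing out */
        return;
    }

    for (; o && !o->visited; o = o->next) {
        o->visited = 1;
        if (o->other)
            toy_peep(o->other);
    }

    --peep_depth;
}
```

Calling toy_peep() on a chain linked through `other` that is longer than the
limit marks the first PERL_PEEP_LIMIT ops as visited and leaves the rest
alone, which mirrors the "warn and run unoptimized" behaviour described
above. The entry and exit bookkeeping is the overhead the benchmarks are
meant to measure.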
The original report says
> FreeBSD 5.4 seems to have a default stack hardlimit of 64MB.
I think that either John must be mistaken, or FreeBSD's stack frames
are so incredibly large that somebody should look into shrinking them.
As the patch adds entry and exit code to a function that's called a
lot, here are some benchmarks from make minitest with perl@25260:
perl-current-optimized:
u=1.27 s=0.94 cu=44.60 cs=14.23 scripts=207 tests=49777
u=1.23 s=0.80 cu=44.73 cs=14.07 scripts=207 tests=49777
perl-current-patched-optimized:
u=1.27 s=0.86 cu=43.32 cs=14.41 scripts=207 tests=49777
u=1.24 s=0.82 cu=43.17 cs=14.04 scripts=207 tests=49777
perl-current-optimized-threads:
u=2.54 s=0.99 cu=104.48 cs=17.09 scripts=207 tests=49777
u=2.51 s=0.90 cu=104.40 cs=16.70 scripts=207 tests=49777
perl-current-patched-optimized-threads:
u=2.50 s=0.94 cu=106.50 cs=17.11 scripts=207 tests=49777
u=2.50 s=0.89 cu=106.29 cs=16.79 scripts=207 tests=49777
3% _better_ on user time (cu) for unthreaded; 2% penalty for threaded. Ah
well. There's another optimizer I don't understand. I'd say it was
down in the noise if it wasn't so repeatable...
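The percentages can be rechecked from the cu columns above. The sketch
below assumes the comparison is between the means of the two runs in each
configuration; the helper names (mean2, pct_change and the two wrappers) are
invented here and are not part of the patch or of perl's test harness.

```c
#include <assert.h>

/* Mean of two benchmark runs. */
double mean2(double a, double b)
{
    return (a + b) / 2.0;
}

/* Signed percentage change from baseline to patched.
 * Negative means the patched build used less time. */
double pct_change(double baseline, double patched)
{
    assert(baseline > 0.0);
    return (patched - baseline) / baseline * 100.0;
}

/* cu figures copied from the unthreaded runs quoted above. */
double unthreaded_cu_change(void)
{
    return pct_change(mean2(44.60, 44.73), mean2(43.32, 43.17));
}

/* cu figures copied from the threaded runs quoted above. */
double threaded_cu_change(void)
{
    return pct_change(mean2(104.48, 104.40), mean2(106.50, 106.29));
}
```

On these figures unthreaded_cu_change() comes out a little below -3 and
threaded_cu_change() a little below +2, consistent with the rounded 3% and
2% quoted.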
--
Dominic Dunlop