
DAVEM TPF Grant#3 June, July, August 2023 report

Dave Mitchell
September 8, 2023 14:27
(this report covers three months)

This is my monthly report on work done during June-August 2023 covered
by my TPF perl core maintenance grant.

I fixed a performance regression bug related to my 'multiconcat'
optimisation work from about 5 years ago.

Other than that, I restarted my work on my "make stack reference counted"
branch, got it into a working state, and merged it into blead.
See below for a detailed explanation.

     11:02 GH #21360: Severe regex/concatenation performance regression
     49:45 make stack reference counted
     18:13 process p5p mailbox
     79:00 TOTAL (HH:MM)


Understanding the "stack not reference counted" issue.

I've been asked to include an explanation as to why "making the perl stack
reference counted" is a Good Thing, and why it's burning up so many hours
in TPF funding.

Internally, perl is mainly based around Scalar Value (SV) structures,
which hold a single value. Entities such as $x or $a[0] are all SVs. These
SVs include a reference count (RC) field which, when it reaches zero,
triggers the freeing of the SV. For the basics, consider the following:

    sub f {
        my $x = ...;
        my $y = ...;
        return \$y;
    }
    my $ref = f();

On return from the function, the SV associated with $x is freed in a
timely manner. This is because its RC starts as 1, and is reduced to 0 on
scope exit. Conversely, the SV bound to $y has its RC increased to 2 by a
reference being taken to it, then back down to 1 when $y goes out of
scope. So it lives on, accessible as $$ref. This is all good.
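
As a minimal sketch of the lifetimes described above, the following uses
a hypothetical Tracker class (not part of perl) whose DESTROY method
records when each value is freed. The class name and the @log array are
assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper class: records in a log when an instance is freed.
package Tracker {
    sub new {
        my ($class, $name, $log) = @_;
        return bless { name => $name, log => $log }, $class;
    }
    sub DESTROY {
        my ($self) = @_;
        push @{ $self->{log} }, "freed $self->{name}";
    }
}

my @log;

sub f {
    my $x = Tracker->new('x', \@log);
    my $y = Tracker->new('y', \@log);
    return \$y;    # the reference keeps $y's SV alive after scope exit
}

my $ref = f();
my $after_call = "@log";            # only x has been freed so far
print "after f(): $after_call\n";

undef $ref;                         # drop the last reference to $y's SV
print "after undef: @log\n";        # now y has been freed as well
```

Only the value held in $x is released when f() returns; the value held
in $y survives until the last reference to it goes away.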

Now, perl is a stack-based engine. Internally this means that it has a
stack of pointers to SVs, and such pointers are pushed on and popped off
as perl executes ops. For example, the action of perl's '+' operator is

* pop two SV pointers off the argument stack;
* add together the numeric values of those two SVs;
* store the result in another SV;
* push a pointer to that new SV onto the stack.
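
The four steps above can be sketched as a toy model in Perl itself. This
is an illustration only: the @stack array and the toy_add function are
hypothetical names, and references to scalars stand in for pointers to
SVs. It is not how the real C implementation is written.

```perl
use strict;
use warnings;

# Toy argument stack: each entry is a reference to a scalar, standing in
# for a pointer to an SV.
my @stack;

# Hypothetical model of what the '+' op does; not perl's actual C code.
sub toy_add {
    my $right = pop @stack;          # pop two "SV pointers"
    my $left  = pop @stack;
    my $sum   = $$left + $$right;    # add their numeric values
    push @stack, \$sum;              # push a pointer to the new "SV"
    return;
}

my ($first, $second) = (3, 4);
push @stack, \$first, \$second;
toy_add();
print ${ $stack[-1] }, "\n";         # prints 7
```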

For brevity's sake, I shall in future refer to "pushing an SV" where I
mean "pushing a pointer to an SV onto the argument stack".

Now we come to the issue. As a "premature optimisation", perl doesn't
increase an SV's RC when pushing onto the stack, nor decrease it when
popping. This has the obvious danger that an SV could be freed while
still on the stack, and thus something like an add operator could access a
freed SV (and thus see an undef value), or even worse, the SV could have
been reallocated in the meantime and have a completely unrelated value.
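
To make the hazard concrete, here is a small simulation. It is a toy
model under stated assumptions, not perl's real allocator: the pool, the
free list, and the sv_new/sv_dec helpers are all hypothetical. The stack
stores slot numbers without bumping the reference count, so the slot can
be freed and reused while the stack still points at it.

```perl
use strict;
use warnings;

# Toy model: a pool of "SVs", each with a reference count (rc) and a
# value, plus a free list of slots available for reuse.
my (@pool, @free_slots);

sub sv_new {
    my ($value) = @_;
    my $slot = @free_slots ? pop @free_slots : scalar @pool;
    $pool[$slot] = { rc => 1, value => $value };
    return $slot;
}

sub sv_dec {
    my ($slot) = @_;
    if (--$pool[$slot]{rc} == 0) {
        $pool[$slot]{value} = undef;    # the SV is now freed
        push @free_slots, $slot;
    }
    return;
}

my @stack;

my $owner = sv_new(42);     # one owner, so rc == 1
push @stack, $owner;        # pushed WITHOUT bumping rc

sv_dec($owner);             # the owner lets go: rc hits 0, slot is freed
my $seen_freed = $pool[ $stack[-1] ]{value};     # undef: a freed SV

my $other = sv_new('unrelated');                 # reuses the freed slot
my $seen_reused = $pool[ $stack[-1] ]{value};    # someone else's value

print defined $seen_freed ? $seen_freed : 'undef', "\n";
print "$seen_reused\n";
```

Had the push incremented rc and the pop decremented it, the slot would
have stayed allocated for as long as the stack referred to it.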

Related to this, the @_ argument array doesn't normally reference-count
its contents. For normal arrays, the expression '$a[0] = "abc"' will
create an SV, which has the value "abc", and which has a RC of 1 to
account for the pointer to the SV which is stored in the array. When the
array is freed, the RC of each SV in the array is decremented, and so the
elements of the array are typically freed too.  @_ doesn't do this.
Instead, when a function is called, the arguments are pushed onto the
stack (without the RC being bumped), then the list of SV pointers is
moved from the stack into @_, again without the RCs being adjusted.

Thus items in @_ are in danger of being prematurely freed.

Here is a classic example of the bug. Examine this code carefully:

    @a = qw(aaa bbb);

    sub f {
        # on entry, $_[0] is an alias of $a[0],
        #           $_[1] is an alias of $a[1]
        print "@_\n"; # correctly prints "aaa bbb"

        @a = (); # this prematurely frees $_[0] and $_[1]

        # this causes the two just-freed SVs to be reallocated
        my $x = 'xxx:yyy'; my @x = split /:/, $x;

        # but the two reallocated SVs are still referenced by @_
        print "@_\n"; # incorrectly prints "xxx yyy"
    }

    f(@a);

This may sound horrendous, and in a way it is. But in practice, this
doesn't happen as often as you might expect. There is usually something
else keeping the elements of @_ alive, so the bug is rarely encountered
in day-to-day code. But when it is, it can be difficult to track down.
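
For code that does run into this on a perl without a reference-counted
stack, one common defensive pattern is to copy the arguments on entry
rather than keep working through the @_ aliases. The sketch below is an
illustration only; the function name g and the @seen array are
hypothetical. The copies are new SVs owned by @args, so emptying the
caller's array cannot free them.

```perl
use strict;
use warnings;

my @a = qw(aaa bbb);
my @seen;

sub g {
    my @args = @_;          # copy the values while the originals are alive
    push @seen, "@args";

    @a = ();                # frees the original elements, not the copies

    # allocate some new SVs, as in the failing example
    my $x = 'xxx:yyy'; my @x = split /:/, $x;

    push @seen, "@args";    # the copies still hold "aaa bbb"
    return;
}

g(@a);
print "$_\n" for @seen;
```

The cost is an extra copy of every argument, and the function can no
longer modify the caller's variables through @_.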

Also, it sabotages code-fuzzers. From time to time, volunteers run jobs
which create perl "programs" out of small random fragments and attempt to
run them. If one crashes (in the sense of a SEGV or ASAN error), then that
indicates a bug in the perl interpreter itself. However, we stopped
accepting such bug reports, because the vast majority of them turned out
to be variants on "stack not refcounted", and it was taking some time to
analyse each report and reach that conclusion.

So fixing this bug would be a Good Thing. However, it turns out that
fixing it is rather hard. This design flaw has been at the heart of the
perl core for 25+ years, and there are about 30,000 lines of C code
directly related to implementing all the stack operators (just about
every op in perl works from the stack).

In particular, it's an all-or-nothing situation: you can't have half the
ops adjusting reference counts when pushing/popping while the other half
leave them unadjusted.

The approach I eventually worked out was to initially wrap nearly every
op function (around 300 of them) with code that adjusts the reference
counts of all its arguments on the stack, calls the "real" function, then
adjusts any return values on the stack.

This allowed a "big switch" to be turned on in one go that activated
reference counting across the entirety of the perl core.

Then, at relative leisure, each individual function can be rewritten to
work directly on an RC stack and not require slow wrapping.
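
The wrapping idea can be sketched as a toy model. Everything here is an
assumption made for illustration: the real wrapper is C code inside the
perl core, and names such as wrap_op, legacy_add and rc_push are
hypothetical. Each "SV" is a hash with a reference count (rc) and a
value, and every entry on the stack is meant to own one count.

```perl
use strict;
use warnings;

my @stack;    # reference-counted: every entry owns one count on its SV

sub rc_push { my ($sv) = @_; $sv->{rc}++; push @stack, $sv; return }

# A legacy-style op: pops and pushes without touching any rc.
# Its result lives in a scratch SV that the op itself owns (rc 1).
my $targ = { rc => 1, value => undef };
sub legacy_add {
    my $right = pop @stack;
    my $left  = pop @stack;
    $targ->{value} = $left->{value} + $right->{value};
    push @stack, $targ;
    return;
}

# Wrap a legacy op that consumes $nargs stack entries.
sub wrap_op {
    my ($op, $nargs) = @_;
    return sub {
        my @args = @stack[ -$nargs .. -1 ];   # remember the arguments
        my $base = @stack - $nargs;           # stack height below them
        $op->();                              # run the unmodified op
        # the stack now owns a count on each return value
        $_->{rc}++ for @stack[ $base .. $#stack ];
        # and gives up its count on each original argument
        $_->{rc}-- for @args;
        return;
    };
}

my $add = wrap_op(\&legacy_add, 2);

my $x = { rc => 1, value => 3 };   # rc 1: owned by a variable
my $y = { rc => 1, value => 4 };
rc_push($x);
rc_push($y);                       # both now have rc 2
$add->();

print "result=$stack[-1]{value} rc=$stack[-1]{rc}\n";
print "x rc=$x->{rc}, y rc=$y->{rc}\n";
```

In this sketch the return values are bumped before the arguments are
released, so that an op which returns one of its own arguments does not
see it freed in between. Unwrapping an op then amounts to rewriting it to
do this bookkeeping itself.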

The initial wrapping work, involving around 70 commits, was merged into
blead on 16th August. From that point it has been possible to build a perl
with the PERL_RC_STACK configuration option which runs safely, fixing at
least 70 issue tickets and allowing fuzzers to be used again, while
running approximately 30% slower (based on the approximate time to run
the test suite).

A second set of commits, merged on 4th Sept, unwrapped many of the common
and/or easy ops, leaving around 140 out of 314 still to unwrap. At a very
rough glance this is now around 10% slower than "vanilla" blead. I haven't
done any proper benchmarking yet, though - I plan to do that after I've
unwrapped all the hot ops.

Note that none of this work (or at least, very little) currently affects a
default build of the perl interpreter: it is only when perl is built with
the PERL_RC_STACK option that reference-counting is enabled. Without that,
perl behaves and performs just as before (although many of the ops have
now been rewritten to use a new API to manipulate the stack, which in
theory should behave unchanged on non-PERL_RC_STACK builds).

It is intended that eventually, PERL_RC_STACK will become the default,
and then the only, build option.

There is still much work to be done, including:

* unwrapping more ops;
* fixing some common CPAN distributions (and/or perl core) where they
  don't work under PERL_RC_STACK builds (note that most already do);
* optimising some of the hot ops;
* making common XS code not need the slowdown of wrapping.
