develooper Front page | perl.perl5.porters | Postings from August 2012

Re: PERL_OLD_COPY_ON_WRITE

Thread Previous
From:
Nicholas Clark
Date:
August 3, 2012 08:14
Subject:
Re: PERL_OLD_COPY_ON_WRITE
Message ID:
20120803151426.GL9583@plum.flirble.org
On Sun, Dec 11, 2011 at 12:05:48PM -0800, Father Chrysostomos wrote:

> But that does not explain why it's 'somewhat dead code, never expected to
> go live'.  It wouldn't have to apply to XS module if it does go live.  And
> the commit you cite arranged flags in such a way that it *could* go live
> without affecting XS modules.  What I don't understand is how it is 'a
> placeholder on how to do it better', because I don't see the (supposedly
> obvious?) flaw in the current code. (Apart from the use of the UV to point
> to the next SV; but we could use SvLEN instead.)


Sorry for the very long delay in replying. There really need to be at least
3 of me to keep up with all the things that seem to fall to me to be done.


It's almost 10 years ago, so I can't be sure, but I think that I was going
for speed, not absolute memory savings. (This was before I realised that
memory use was probably a far more fruitful thing to attack.) I was
specifically interested in how $& ends up having to copy the scalar, but I
guess as much about speed as memory use.


The reason I tried it was because I figured out a design which seemed to
avoid a lot of complexity in the "obvious" approach.

The "obvious" approach is that if A is copied to B, then B is marked as a
copy of A, and A "owns" the buffer. There's already two classes of things -
owners of buffers, and cheap copies. Clearly this is fairly easy to unwind
if B is changed first - B points to A. However, if A is changed, then B needs
to be informed (in some fashion), which means that A also needs to point to
B.

But then I started wrestling with scenarios involving more copies:

So what happens if A is copied to B, then B is copied to C? Something like
this?

    A
    ^
    |
    B
    ^
    |
    C

Now what happens if B is modified? We probably want to get to this:

    A
    ^
    |
    C

but that seems to start being quite complex to achieve.

Also, if A is copied to B and to A is copied to C, should we end up with
something like this?

    A
   / \
  B   C

At which point it looks like the "buffer owner" needs to be able to handle
multiple backreferences, which starts to imply more complex book keeping
(such as how weak references are implemented). Which feels slow and complex.
And the next question for the above structure - what happens if A needs to
be altered - how does the code decide which of B or C to promote to the top.

So it all just seemed messy and complex.


The breakthrough was realising that one could treat the copies as all equal
to each other if one made them into a circularly linked list.

  A -> B -> C
  ^         |
  +---------+

A doubly linked list would have been nice, but I could only see space for
one pointer in PVIV, so singly linked list was all I could implement.
Additionally, the (then) implementation of SvOOK() using SvIVX() suggested
that overloading the use of SvIVX() in publicly visible string scalars was
safe, and something that XS code could already cope with.

The other part of the implementation that seemed like a winner was that the
COW scalars were structurally identical to regular scalars. Not only did
they look pretty much like other scalars to XS code, but it was also
possible to set up a COW copy without needed to realloc() anything, or
allocate additional storage. It also means that they "decomposed" to a
regular scalar very efficiently - for the last scalar that inherits the
buffer, no need to realloc() anything to make it "normal".


So it all looked promising - simple code, *and* no realloc()s. And yet
*still* I couldn't measure it as being any faster than the existing approach.
In particular, this comment in Perl_sv_gets():

    /* XXX. If you make this PVIV, then copy on write can copy scalars read
       from <>.
       However, perlbench says it's slower, because the existing swipe code
       is faster than copy on write.
       Swings and roundabouts.  */
    SvUPGRADE(sv, SVt_PV);

The swipe code being scalars that set SVs_TEMP, and sv_setsv stealing the
buffer.


Plus also one test still failed in Compress::Zlib, which I couldn't work out
why, and so I feared that there was some subtle assumption that I'd missed
that might cause other CPAN code to break. (Possibly at runtime, possibly in
interestingly subtle ways). At some point Artur Bergman made the point that
CPAN code would reasonably expect that if it copied something read only, the
copy would not be. But (obviously) a COW copy has to act read only. Which
troubled me. (The idea only came later of using #ifdef PERL_CORE in the
headers to avoid doing COW copies in XS code, after I'd concluded that I
couldn't get (general) COW to be fast enough.)


So I gave up on the whole thing, because even though it seemed about the most
lightweight approach possible, I still couldn't get it to the point where it
was measurably faster than not doing it, and it added complexity to the
(runtime) code, and it looked like it had a risk of subtly breaking CPAN code.
This was all quite frustrating.

We intentionally left the code in, still protected by #ifdefs, because
a) someone might be able to come back and fix it
b) it showed the locations where one (probably) needed to inject code to make
   a new, viable, copy on write scheme


I think that that's the whole story.

After writing the above out, I find that I'd actually already written a
similar but slightly longer description of how it works, here:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-06/msg00743.html

(but not what the problems were)



> I've been experimenting with a COW implementation that allocates strings
> with shared_he-shaped buffers to begin with, so they can be upgraded to
> real shared string scalars.  But the overhead of storing such a shared_he
> in the string table exceeds the cost of copying the string most of the
> time.  (The work wasn't wasted.  My experiment uncovered some existing COW
> bugs that I fixed recently.)

Sadly that doesn't surprise me.


One thing I did realise while trying to remember all of this, and make sense
of it, is that there might be a speedup possible by breaking (some)
encapsulation between sv.c and hv.c. Specifically, if a shared hash key
scalar is about to be modified (ie in Perl_sv_force_normal_flags(), and
SV_COW_DROP_PV is false), and it's the only owner of that shared hash key,
then instead of

a) calling SvGROW() to malloc() a buffer
b) copying the contents of the shared hash key to it
c) releasing the shared hash key

it ought to be possible to release the malloc()d block from the shared string
table, but not free it, effectively transferring ownership of it to the
scalar. That then uses the SvOOK() stuff to point SvPVX() past the (now not
needed) shared string table structures that precede the string in memory.
So no malloc(), no free() and no copying.

But that's quite a bit of complexity for I'm not sure how much gain.


Nicholas Clark

Thread Previous


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About