
Re: A possible new approach to COW - COW_META.

From:
Dave Mitchell
Date:
November 24, 2016 15:21
Subject:
Re: A possible new approach to COW - COW_META.
Message ID:
20161124152143.GG4785@iabyn.com
On Mon, Nov 07, 2016 at 10:38:05AM +0100, demerphq wrote:
> On 7 November 2016 at 10:12, Dave Mitchell <davem@iabyn.com> wrote:
> > Using the following benchmark file:
...
> > I get the following. bleadNC is blead built with -DPERL_NO_COW. These are
> > raw numbers - lower is better.

> I /think/ these numbers reflect the fact that the current iteration
> will COW a 1 byte string, and the old one wont cow it at all.

Yeah, what they appear to show is that both FC COW and Yves COW have a
high(ish) intrinsic overhead, which dominates for small strings (hence
FC COW skips for small strings).

However, I've looked into where this extra overhead comes from, and
against my expectations, it turns out that most of the cost is in
undeffing or freeing a COWed SV at the end of its use. Basically, once
the SVf_IsCOW flag gets set, SvTHINKFIRST(sv) becomes true, and
this triggers all the slowest code paths when
freeing/undeffing/reassigning an SV - i.e. it always calls
sv_force_normal_flags(), which as well as calling S_sv_uncow(), also
checks whether it's SvREADONLY, SvROK, isGV_with_GP, isREGEXP and SvVOK.

Also, things like the actual refcount decrement in S_sv_uncow()
calculate CowREFCNT() multiple times, which is expensive.
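
A minimal sketch of the kind of change meant here, assuming a count byte
stored in the last byte of the buffer. The toy_cowbuf struct, the
TOY_COWREFCNT macro and toy_uncow() are hypothetical and are not the
actual sv.c code; the point is only that the address of the count is
worked out once and then reused.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned char *buf;
    size_t         len;   /* allocated size; last byte holds the count */
} toy_cowbuf;

/* Recomputes the address of the count byte on every use. */
#define TOY_COWREFCNT(b) ((b)->buf[(b)->len - 1])

/* Drop one sharer.  Returns 1 if the buffer was shared and the count
 * was decremented, 0 if the caller was already the only user. */
static int toy_uncow(toy_cowbuf *b)
{
    assert(b != NULL && b->len > 0);
    unsigned char *rc = &TOY_COWREFCNT(b);  /* address computed once */
    if (*rc == 0)
        return 0;
    (*rc)--;
    return 1;
}
```

A compiler may well do this hoisting by itself in simple cases; caching
the pointer in a local just avoids relying on that.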

I suspect this code could be heavily optimised and brought back in line to
match or nearly match the non-COW version.

Here are some further random and ill-thought-out thoughts.

FC's COW code avoids COW when the dest SV already has an allocated PVX
buffer and the string to copy is short - presumably on the grounds that
a Move() of a short string is cheaper than free()ing the old buffer.
However, the buffer will have to be freed at the end of the SV's life
one way or another, so it's perhaps unfair to include the cost of free()
in the comparison.

As for the 'SvCUR() much less than SvLEN()' test that we added to avoid
things like 'push @a, $_ while <>' pushing a bunch of 4K SVs, perhaps
instead we could add an SV flag indicating that this SV is not
COWable, then get readline() et al to set it?
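
A sketch of what such a flag might look like, with made-up names
throughout: TOY_NO_COW, toy_sv2, toy_readline_into() and toy_can_cow()
are assumptions for illustration, and no such flag exists in sv.h as far
as this sketch is concerned. The reader marks the SV when it fills an
oversized buffer, and the COW decision becomes a single bit test rather
than a size comparison.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_COW 0x1u   /* hypothetical "do not COW this SV" flag */

typedef struct {
    unsigned flags;
    size_t   cur;   /* bytes in use */
    size_t   len;   /* bytes allocated */
} toy_sv2;

/* Stand-in for a reader such as readline(): it fills a large buffer
 * with a short line and marks the SV as not worth sharing. */
static void toy_readline_into(toy_sv2 *sv, size_t nbytes, size_t bufsize)
{
    assert(sv != NULL && nbytes <= bufsize);
    sv->cur = nbytes;
    sv->len = bufsize;
    sv->flags |= TOY_NO_COW;
}

/* The COW decision no longer compares cur against len. */
static int toy_can_cow(const toy_sv2 *sv)
{
    return (sv->flags & TOY_NO_COW) == 0;
}
```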

Finally, I still don't like the idea of so much extra memory being used
for each COWed string. I vaguely wonder whether we could have a hybrid
system: use the spare byte at the end of the string if the string is
writeable, has space for a refcnt byte, and has refcnt < 255;
similarly, use the cpanel 'infinite refcnt' hack for static readonly
strings; then use the COW_META struct for otherwise unCOWable strings
(e.g. RC > 255, no RC byte). This would then require two COW flags to
indicate what type of COW it is. But that might end up complicated and
expensive.
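
To make the hybrid idea a bit more concrete, here is a sketch of the
selection step only. The enum, the toy_str struct and
toy_pick_cow_kind() are invented for illustration; nothing here is
existing perl code, and the two COW flags mentioned above are modelled
simply as the returned enum value.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    TOY_COW_INLINE_BYTE,  /* count kept in the spare byte at buffer end */
    TOY_COW_STATIC,       /* static readonly string, "infinite" refcnt */
    TOY_COW_META          /* separate COW_META-style struct */
} toy_cow_kind;

typedef struct {
    size_t   cur;           /* bytes in use, including the trailing NUL */
    size_t   len;           /* bytes allocated */
    int      is_static_ro;  /* static readonly string */
    int      is_writable;   /* buffer may be written to */
    unsigned refcnt;        /* current number of sharers */
} toy_str;

/* Pick the cheapest scheme that can represent this string. */
static toy_cow_kind toy_pick_cow_kind(const toy_str *s)
{
    assert(s != NULL);
    if (s->is_static_ro)
        return TOY_COW_STATIC;
    if (s->is_writable && s->len > s->cur && s->refcnt < 255)
        return TOY_COW_INLINE_BYTE;
    return TOY_COW_META;    /* no spare byte, or count too large */
}
```

Every consumer of a COWed string would then have to branch on the kind,
which is where the complexity and cost would come from.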

I'm also still unclear what you're proposing in the way of integrating
COW_META with shared HEs. That may make the paragraph above completely
redundant.




-- 
The Enterprise is captured by a vastly superior alien intelligence which
does not put them on trial.
    -- Things That Never Happen in "Star Trek" #10
