develooper Front page | perl.perl5.porters | Postings from May 2013

Re: SV_CHECK_THINKFIRST() needs to be rethought and further COW woes....

Thread Previous
From:
demerphq
Date:
May 10, 2013 05:54
Subject:
Re: SV_CHECK_THINKFIRST() needs to be rethought and further COW woes....
Message ID:
CANgJU+Uk8xZ3R8oV_BTxuDAirSzz2mrJOdHOTy4QPiamobiqyQ@mail.gmail.com
On Thursday, 9 May 2013, Nicholas Clark wrote:

> On Thu, May 09, 2013 at 10:02:44AM -0700, Father Chrysostomos wrote:
> >
> > On May 6, 2013, at 4:56 AM, Dave Mitchell <davem@iabyn.com<javascript:;>>
> wrote:
> >
> > > On Mon, Mar 25, 2013 at 01:20:32PM +0100, demerphq wrote:
> > >> On 25 March 2013 13:13, Dave Mitchell <davem@iabyn.com <javascript:;>>
> wrote:
> > >>> On Mon, Mar 25, 2013 at 10:14:18AM +0100, demerphq wrote:
> > >>>> To make things worse btw, IMO we have a real issue with COW, in that
> > >>>> a) COW strings no longer have the READONLY flag set, code that wants
> > >>>> to do an inplace modification based on this flag not being set will
> > >>>> silently start modifying COW strings.
> > >>>>
> > >>>> Lastly, I thought COW was disabled, but I see lots of code related
> to
> > >>>> it. Devel::Peek still reports keys() as being COW, etc.
> > >>>>
> > >>>> I think we have a real mess on our hands here. I feel really
> pessimistic.
> > >>>
> > >>> My (vague) understanding is that although general COW strings have
> been
> > >>> disabled, the pre-existing  shared hash key mechanism is still
> there, but
> > >>> rather than using the FAKE/READONLY flag combination, it uses the
> new IsCOW
> > >>> flag to indicate the sharedness
> > >>
> > >> Thats my point. With FAKE+READONLY code like HTML::Entities breaks in
> > >> an annoying but from the internals point of view harmless way. IOW, it
> > >> refuses to corrupt the PV that holds the key.
> > >>
> > >> Now on Perl5.16 the "annoying" part is gone, but as the SV is no
> > >> longer READONLY something like HTML::Entities will think it is normal,
> > >> and then modify the PV. Assuming modifying the PV doesnt cause a SEGV
> > >> something like HTML::Entities will silently corrupt the PL_strtab (the
> > >> master hash table) and any hash that mentions that key prior to it
> > >> being corrupted. From that point on Perl is in undefined state. The
> > >> master string table will contain a HEK that contains a string which
> > >> does not match the hash value for the string, and at that point almost
> > >> anything could happen.
> > >>
> > >> IOW, FAKE+READONLY still told code "be careful, here be dragons"
> > >> whereas "COW" doesn't say this at all.
> > >
> > > So, the question is what we do for 5.18? I'm not really comfortable
> enough
> > > with understanding how CPAN modules expect and handle FAKE+READONLY to
> > > know whether we should revert this change for 5.18 (or even how easy
> it is
> > > to revert). Or whether we should do both: have the isCOW flag, but also
> > > set FAKE+READONLY? Or perhaps do both for hash keys, but leave alone
> > > in the (disabled) proper string COW code?
> > >
> > > FC, do you have any thoughts?
> >
> > Originally, SvREADONLY meant that a scalar was read-only, that $x=3
> would croak.  When shared hash key scalars came along, this bit gained a
> new meaning, that the string buffer is read-only, even if the scalar itself
> is modifiable.  The result was that many XS modules ended up being subtly
> wrong.
> >
> > I did an audit of XS modules using SvREADONLY (which, admittedly, may
> not have been as exhaustive as I had hoped).  In most cases, XS modules
> were erroneously rejecting scalars which (from Perl space at least) were
> not read-only.  By stopping COW from using that flag, I automatically fixed
> those.  Of the remaining modules, most were buggy either way and would need
> to be fixed regardless of whether perl used SvREADONLY for COW.
>
> Yes, but arguably it also changes the failure mode from "right-side" to
> "wrong-side". In that, before, XS code that was unaware of the 5.8.0
> change would fail without corrupting internal state. Whereas now, XS code
> that is unaware will write to memory it should not, and failure will be
> subtle, and potentially much later and in unrelated code


this is precisely my concern.

yves
on an ipad


> XS code that uses C API calls such as sv_setpv(), set_pvn() and similar
> will always work. The problem is that there's this massive grey area about
> how much of the internal representation XS code is permitted to rely on,
> which makes it really difficult to do any sort of significant rearranging
> of the internals, because a lot of code on CPAN makes a lot of assumptions
> about what it's legal to do, and those assumptions have remained unchanged/
> unchallenged in over a decade.
>
> In turn this comes back to the basic problem that there never was an "API"
> designed. XS has "happened" as a side effect of authors looking at the
> internals and copying bits that worked. And those internals make a lot of
> direct data structure accesses, rather than vectoring through function
> calls,
> which means that
>
> 1) the internal data structures are exposed
> 2) it's hard to impossible to intercept accesses and create compatibility
>    veneers
>
>
> I'm not anything is wrong or right here. I'm saying that anything is going
> to have to be a compromise, based on guesswork, judgement calls and
> possibly
> even luck.
>
>
> > So, in short, I think HTML::Entities needs to be corrected.
> >
> > We need to encourage people to use SvPV_force or sv_force_normal before
> assuming that a PV is modifiable.
>
> Completely agree.
>
>
> Nicholas Clark
>


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

Thread Previous


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About