develooper Front page | perl.perl5.porters | Postings from May 2013

Re: SV_CHECK_THINKFIRST() needs to be rethought and further COW woes....

Thread Previous | Thread Next
From:
Father Chrysostomos
Date:
May 9, 2013 17:03
Subject:
Re: SV_CHECK_THINKFIRST() needs to be rethought and further COW woes....
Message ID:
C8420088-F86B-4A12-BC9E-A2A9DE050E56@cpan.org

On May 6, 2013, at 4:56 AM, Dave Mitchell <davem@iabyn.com> wrote:

> On Mon, Mar 25, 2013 at 01:20:32PM +0100, demerphq wrote:
>> On 25 March 2013 13:13, Dave Mitchell <davem@iabyn.com> wrote:
>>> On Mon, Mar 25, 2013 at 10:14:18AM +0100, demerphq wrote:
>>>> To make things worse btw, IMO we have a real issue with COW, in that
>>>> a) COW strings no longer have the READONLY flag set, code that wants
>>>> to do an inplace modification based on this flag not being set will
>>>> silently start modifying COW strings.
>>>> 
>>>> Lastly, I thought COW was disabled, but I see lots of code related to
>>>> it. Devel::Peek still reports keys() as being COW, etc.
>>>> 
>>>> I think we have a real mess on our hands here. I feel really pessimistic.
>>> 
>>> My (vague) understanding is that although general COW strings have been
>>> disabled, the pre-existing  shared hash key mechanism is still there, but
>>> rather than using the FAKE/READONLY flag combination, it uses the new IsCOW
>>> flag to indicate the sharedness
>> 
>> Thats my point. With FAKE+READONLY code like HTML::Entities breaks in
>> an annoying but from the internals point of view harmless way. IOW, it
>> refuses to corrupt the PV that holds the key.
>> 
>> Now on Perl5.16 the "annoying" part is gone, but as the SV is no
>> longer READONLY something like HTML::Entities will think it is normal,
>> and then modify the PV. Assuming modifying the PV doesnt cause a SEGV
>> something like HTML::Entities will silently corrupt the PL_strtab (the
>> master hash table) and any hash that mentions that key prior to it
>> being corrupted. From that point on Perl is in undefined state. The
>> master string table will contain a HEK that contains a string which
>> does not match the hash value for the string, and at that point almost
>> anything could happen.
>> 
>> IOW, FAKE+READONLY still told code "be careful, here be dragons"
>> whereas "COW" doesn't say this at all.
> 
> So, the question is what we do for 5.18? I'm not really comfortable enough
> with understanding how CPAN modules expect and handle FAKE+READONLY to
> know whether we should revert this change for 5.18 (or even how easy it is
> to revert). Or whether we should do both: have the isCOW flag, but also
> set FAKE+READONLY? Or perhaps do both for hash keys, but leave alone
> in the (disabled) proper string COW code?
> 
> FC, do you have any thoughts?

Originally, SvREADONLY meant that a scalar was read-only, that $x=3 would croak.  When shared hash key scalars came along, this bit gained a new meaning, that the string buffer is read-only, even if the scalar itself is modifiable.  The result was that many XS modules ended up being subtly wrong.

I did an audit of XS modules using SvREADONLY (which, admittedly, may not have been as exhaustive as I had hoped).  In most cases, XS modules were erroneously rejecting scalars which (from Perl space at least) were not read-only.  By stopping COW from using that flag, I automatically fixed those.  Of the remaining modules, most were buggy either way and would need to be fixed regardless of whether perl used SvREADONLY for COW.

So, in short, I think HTML::Entities needs to be corrected.

We need to encourage people to use SvPV_force or sv_force_normal before assuming that a PV is modifiable.


Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About