
Re: [perl #109726] PL_sv_undef loses identity

Nicholas Clark
February 9, 2012 06:13
I've been thinking about this for a bit. I've just remembered an
important part. Right now, PL_sv_placeholder is in perlvars.h, *not*

/* Restricted hashes placeholder value.
 * The contents are never used, only the address. */
PERLVAR(G, sv_placeholder, SV)

It's a global, not per-interpreter. If I understand things correctly, the
address &PL_sv_placeholder is never actually *returned* anywhere - it's
used to compare against. That means that nothing can fiddle with its
reference count, or accidentally upgrade it, bless it, or anything else.
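A minimal sketch of that "compare against, never return" pattern, as a standalone toy model rather than core code. Everything prefixed toy_ is hypothetical, and the table is a plain array of pointers, not a real HV:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV; not Perl's real structure. */
typedef struct toy_sv {
    int refcnt;
    int value;
} toy_sv;

/* The contents are never used, only the address. */
static toy_sv toy_placeholder;

#define TOY_SLOTS 4

typedef struct toy_table {
    toy_sv *slot[TOY_SLOTS];   /* NULL means "never used" */
} toy_table;

/* Mark a slot as reserved by storing the sentinel's address. */
static void toy_reserve(toy_table *t, size_t i)
{
    assert(i < TOY_SLOTS);
    t->slot[i] = &toy_placeholder;
}

/* Public fetch: the sentinel is only compared against, never returned,
 * so callers cannot touch its reference count or contents. */
static toy_sv *toy_fetch(const toy_table *t, size_t i)
{
    assert(i < TOY_SLOTS);
    toy_sv *sv = t->slot[i];
    return (sv == &toy_placeholder) ? NULL : sv;
}
```

Because toy_fetch() is the only way out of the table, nothing outside it ever holds a pointer to the sentinel.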

(I think that Juerd did a lightning talk on undef abuse that included
demonstrating how to bless PL_sv_undef)

Whilst that can be solved by making it per-interpreter, I'm not sure what
interesting knock-on side effects would follow from that. eg

On Mon, Feb 06, 2012 at 12:59:33PM +0000, Zefram wrote:
> Dave Mitchell wrote:
> >fetched in lvalue context (e.g. \$_[0]), av_fetch() converts PL_sv_undef
> >in the slot into a real SV:
> It's another special-case use of PL_sv_undef to mean something
> other than the Perl-visible standard undef value.  Another case for
> PL_sv_placeholder?

This would mean that &PL_sv_placeholder has the potential to be returned
by APIs that didn't use to. Whilst this is sort of the same problem as
PL_sv_undef propagating further than we might like:

a) &PL_sv_undef is well known to XS code that wants to care about it
b) Accidentally propagating PL_sv_placeholder onwards to places where it
   shouldn't be (eg in the front door of the hash APIs) will cause
   "interesting" bugs.

   Historically, PL_sv_placeholder was introduced for 5.8.1. 5.8.0 added
   restricted hashes (gah) and the placeholders, but used PL_sv_undef for
   the placeholder value. Which broke existing working XS code which stored
   PL_sv_undef in hashes, mostly as a token value where the key was
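A toy illustration of why sharing one address between "undef" and "reserved key" breaks callers, and why a dedicated marker avoids it. This is a sketch under assumptions: the toy_ names are hypothetical, keys are plain array indices, and nothing here is the real hash implementation:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV; only addresses matter here. */
typedef struct toy_sv { int unused; } toy_sv;

static toy_sv toy_undef;        /* the shared undef value */
static toy_sv toy_placeholder;  /* a dedicated "reserved key" marker */

#define TOY_KEYS 4

typedef struct toy_hash {
    toy_sv       *val[TOY_KEYS];  /* NULL means "no such key" */
    const toy_sv *marker;         /* address that means "reserved" */
} toy_hash;

/* Store a value under a key; callers may legitimately pass &toy_undef. */
static void toy_store(toy_hash *h, size_t key, toy_sv *v)
{
    assert(key < TOY_KEYS);
    h->val[key] = v;
}

/* A key exists if its slot is filled with something other than the marker. */
static int toy_exists(const toy_hash *h, size_t key)
{
    assert(key < TOY_KEYS);
    return h->val[key] != NULL && h->val[key] != h->marker;
}
```

With marker set to &toy_undef (the 5.8.0-style choice described above), a key stored with the undef value reports as absent. With marker set to &toy_placeholder, the same store behaves as the caller expects.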

On Fri, Dec 30, 2011 at 01:40:12PM -0800, Father Chrysostomos via RT wrote:

> There is also this interesting bit:
> 	    else if (o->op_type != OP_METHOD_NAMED
> 		&& cSVOPo->op_sv == &PL_sv_undef) {
> 		/* PL_sv_undef is hack - it's unsafe to store it in the
> 		   AV that is the pad, because av_fetch treats values of
> 		   PL_sv_undef as a "free" AV entry and will merrily
> 		   replace them with a new SV, causing pad_alloc to think
> 		   that this pad slot is free. (When, clearly, it is not)
> 		*/
> 		SvOK_off(PAD_SVl(ix));
> 		SvPADTMP_on(PAD_SVl(ix));
> 		SvREADONLY_on(PAD_SVl(ix));
> I don't believe &PL_sv_placeholder is ever accessible to Perl, so using
> that instead of &PL_sv_undef in pad.c would allow XS modules to provide
> an OP_CONST that actually returns &PL_sv_undef.
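The hazard that the quoted comment describes can be sketched with a standalone toy model. This is not pad.c or av.c. The toy_ names and the single "claimed" flag are assumptions made for illustration, standing in for the flags that the real code sets on the pad entry:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an SV with one flag. */
typedef struct toy_sv { int claimed; } toy_sv;

static toy_sv toy_undef;   /* the shared undef value */

#define TOY_PAD_SIZE 4

typedef struct toy_pad { toy_sv *slot[TOY_PAD_SIZE]; } toy_pad;

/* lvalue-style fetch: an empty slot, or one holding the shared undef,
 * is treated as unfilled and replaced with a fresh value. */
static toy_sv *toy_fetch_lval(toy_pad *p, size_t ix)
{
    assert(ix < TOY_PAD_SIZE);
    if (p->slot[ix] == NULL || p->slot[ix] == &toy_undef) {
        toy_sv *fresh = calloc(1, sizeof *fresh);
        if (fresh == NULL)
            abort();
        p->slot[ix] = fresh;
    }
    return p->slot[ix];
}

/* Allocator's view: a slot is free if its value is not marked claimed. */
static int toy_slot_looks_free(toy_pad *p, size_t ix)
{
    return !toy_fetch_lval(p, ix)->claimed;
}

/* Unsafe: put the shared undef itself into the pad. */
static void toy_store_shared_undef(toy_pad *p, size_t ix)
{
    assert(ix < TOY_PAD_SIZE);
    p->slot[ix] = &toy_undef;
}

/* Workaround in the spirit of the quoted code: keep the pad's own
 * value in the slot and mark it, instead of storing the shared undef. */
static void toy_store_private_undef(toy_pad *p, size_t ix)
{
    toy_fetch_lval(p, ix)->claimed = 1;
}

/* Release every value the pad allocated. */
static void toy_pad_free(toy_pad *p)
{
    for (size_t i = 0; i < TOY_PAD_SIZE; i++) {
        if (p->slot[i] != &toy_undef)
            free(p->slot[i]);
        p->slot[i] = NULL;
    }
}
```

In this model a slot filled via toy_store_shared_undef() is silently replaced on the next lvalue fetch and then looks free, whereas a slot filled via toy_store_private_undef() stays claimed.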

This is still somewhat woolly in my head, but there are several levels in the
hierarchy of what AVs are used for:

0:   !AvREAL() && !AvREIFY() just the stacks? [and @DB::args currently]
1:   !AvREAL() && AvREIFY()  @_ [and @DB::args should be here]
2:   Pads
2.5: other AVs seen only by C and XS code that can have AVs, HVs etc directly
     in them
3:   AVs visible to Perl code - ie via a reference, and containing only
     scalars and *references* to nested containers

As far as I can work out, the stacks store garbage rather than any sort of
pointer for empty parts. From this:

	    else {
		newmax = key < 3 ? 3 : key;
		MEM_WRAP_CHECK_1(newmax+1, SV*, oom_array_extend);
		Newx(AvALLOC(av), newmax+1, SV*);
		ary = AvALLOC(av) + 1;
		tmp = newmax;
		AvALLOC(av)[0] = &PL_sv_undef;	/* For the stacks */
	    }
	    if (AvREAL(av)) {
		while (tmp)
		    ary[--tmp] = &PL_sv_undef;

I still don't understand that part commented "For the stacks". Why does
the first element, but the first element *only*, need to be a valid pointer?
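To make concrete which slots that excerpt initialises, here is a standalone toy model. It assumes the reading that slot 0 is set unconditionally in the first-allocation branch and that the remaining slots are filled only for AvREAL arrays. The toy_ names are hypothetical and the overflow check is a simplified stand-in for MEM_WRAP_CHECK_1:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an SV; only addresses matter here. */
typedef struct toy_sv { int unused; } toy_sv;

static toy_sv toy_undef;

typedef struct toy_av {
    toy_sv **alloc;   /* start of the allocation, like AvALLOC(av) */
    size_t   newmax;  /* highest index allocated */
    int      real;    /* nonzero plays the role of AvREAL(av) */
} toy_av;

/* First allocation for an empty array.
 * Returns 0 on success, -1 if the request is too large or malloc fails. */
static int toy_first_alloc(toy_av *av, size_t key)
{
    size_t newmax = key < 3 ? 3 : key;
    toy_sv **ary;
    size_t tmp;

    assert(av != NULL);
    if (newmax > SIZE_MAX / sizeof *av->alloc - 1)
        return -1;
    av->alloc = malloc((newmax + 1) * sizeof *av->alloc);
    if (av->alloc == NULL)
        return -1;

    ary = av->alloc + 1;
    tmp = newmax;
    av->alloc[0] = &toy_undef;        /* always set, real or not */

    if (av->real) {
        while (tmp)
            ary[--tmp] = &toy_undef;  /* fills slots 1..newmax */
    }
    av->newmax = newmax;
    return 0;
}
```

In this model a non-real array ends up with only slot 0 holding a valid pointer and slots 1..newmax left uninitialised, which is the asymmetry the question is about.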

Anyway, I suspect that there's low risk of problems elsewhere if the Pads
switched to using PL_sv_placeholder for a free slot, as they are only
supposed to be accessed via our APIs, which would never actually *return*
it. And Pads are already somewhat insane with their manual reference
counting fakery.

I'm concerned that there would be odd breakages if arrays (more
generally) started using PL_sv_placeholder because it would start escaping.

But I suspect the only way to get a handle on *that* would be to tweak the
core and then smoke CPAN against it.

Nicholas Clark
