Re: [PATCH] lvalue hash and array elements
From: Stephen McCamant
Date: January 3, 2001 23:31
Subject: Re: [PATCH] lvalue hash and array elements
Message ID: 14932.3850.554107.156229@soda.csua.berkeley.edu
>>>>> "SC" == Simon Cozens <simon@cozens.net> writes:
SC> On Wed, Jan 03, 2001 at 03:15:14PM -0800, Stephen McCamant wrote:
SMcC> In a case like this, the important thing to think about is what
SMcC> decisions can be made at compile time (i.e. in op.c) rather than
SMcC> at run time (pp*.c). Your LVRET mixes up static compile time
SMcC> decisions (based on op_next and op_next->op_type) and dynamic
SMcC> runtime ones (the stuff with cxstack).
SC> I've thought long and hard about this, and spent quite a few days
SC> on it. You CANNOT tell at compile time whether an op (say, aelem)
SC> would be used as a return value. You really can't do it. Believe
SC> me. I have pictures of op trees papered all over my walls. There's
SC> nothing that distinguishes a return value op from something that
SC> can't be used as a return value. Try it.
[...]
SC> You *have* to do this at runtime. And at runtime, something is
SC> being used as a return value if i) its next op is "return" or ii)
SC> its next op is "leavesub" or (in the case we're interested in)
SC> "leavesublv". And that's exactly what I'm catching.
My basic point is that any test you might do on the OP tree, no matter
how complicated, is faster if you do it once at compile time than if
you do it at runtime. op_next doesn't change at runtime, and it doesn't look
into the future to see which OP will *really* be executed next; it's
just a pointer that's computed at compile time and often used to
decide what OP to go to next. You don't get any additional flexibility
by looking at it at runtime, you just repeatedly test something that
always gives the same answer. Right now, you're testing whether `at
runtime, op_next points to an OP_RETURN or an OP_LEAVESUBLV'. Since
op_next doesn't change at runtime, this is the same as testing whether
`after compiling, op_next points to an OP_RETURN or an OP_LEAVESUBLV'.
Since peep() is the last phase of the compile, you can do the check
there; to be most paranoid, I think you'd want to put a check in the
cases for OP_RETURN and OP_LEAVESUBLV that looked like:
    if (oldop->op_type == OP_AELEM || oldop->op_type == OP_HELEM) {
        oldop->op_private |= OPpMAYBE_LVALUE;
    }
(this assumes you dealt with aelemfast separately). Then LVRET would
be `((PL_op->op_private & OPpMAYBE_LVALUE) && (cxstack ...))'. Or, it
might be a little faster to write the test in pp_aelem as:
    U32 lval = 0;   /* stays 0 (rvalue) unless OPf_MOD is set */
    if (PL_op->op_flags & OPf_MOD) {
        if (PL_op->op_private & OPpMAYBE_LVALUE)
            lval = cxstack...;
        else
            lval = 1;
    }
and turn on OPf_MOD too in peep(); then there would be no extra test
for the unambiguously-rvalue case, which is probably at least 50% of
the time.
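To make the split between compile-time and run-time work concrete, here is a rough, self-contained sketch. It does not use Perl's real structures: `struct toy_op`, `toy_peep()`, `toy_wants_lvalue()`, `TOY_MAYBE_LVALUE` and the `context_lookups` counter are all made-up stand-ins, and the "context lookup" is just a counter bump standing in for the cxstack inspection. It only illustrates the idea that a flag set once after compilation lets unflagged ops skip the expensive check.

```c
#include <assert.h>
#include <stddef.h>

/* Toy op kinds; stand-ins for Perl's OP_AELEM, OP_HELEM, OP_RETURN, ... */
enum toy_type { T_AELEM, T_HELEM, T_RETURN, T_LEAVESUBLV, T_OTHER };

#define TOY_MAYBE_LVALUE 0x01   /* plays the role of OPpMAYBE_LVALUE */

struct toy_op {
    enum toy_type  type;
    unsigned       flags;
    struct toy_op *next;        /* fixed once compilation is done */
};

static int context_lookups;     /* counts simulated cxstack inspections */

/* Compile-time pass, run once: flag an element op that feeds a return. */
void toy_peep(struct toy_op *first)
{
    struct toy_op *oldop = NULL;
    struct toy_op *o;

    for (o = first; o != NULL; oldop = o, o = o->next) {
        if ((o->type == T_RETURN || o->type == T_LEAVESUBLV)
            && oldop != NULL
            && (oldop->type == T_AELEM || oldop->type == T_HELEM))
            oldop->flags |= TOY_MAYBE_LVALUE;
    }
}

/* Run-time check: only flagged ops pay for the context lookup. */
int toy_wants_lvalue(const struct toy_op *o, int caller_wants_lvalue)
{
    assert(o != NULL);
    if (!(o->flags & TOY_MAYBE_LVALUE))
        return 0;               /* cheap path: a single flag test */
    context_lookups++;          /* stands in for looking at cxstack */
    return caller_wants_lvalue;
}

/* Chain: aelem -> other -> helem -> leavesublv.  Returns the number of
 * context lookups made while "executing" the two element ops. */
int toy_demo(void)
{
    struct toy_op leave = { T_LEAVESUBLV, 0, NULL };
    struct toy_op helem = { T_HELEM, 0, &leave };
    struct toy_op other = { T_OTHER, 0, &helem };
    struct toy_op aelem = { T_AELEM, 0, &other };

    toy_peep(&aelem);
    context_lookups = 0;
    (void)toy_wants_lvalue(&aelem, 1);   /* not flagged: no lookup */
    (void)toy_wants_lvalue(&helem, 1);   /* flagged: one lookup */
    return context_lookups;
}
```

In this toy, only the helem that sits directly before the leavesublv is flagged, so `toy_demo()` performs a single simulated lookup; the aelem in the middle of the sub costs one flag test and nothing more.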
SMcC> If I'm reading the grouping right, you might also have
SMcC> correctness problems -- it looks like you only do the runtime
SMcC> check before an OP_RETURN, and not before an OP_LEAVESUBLV, but
SMcC> this means that every call to a sub with an lvalue attribute
SMcC> treats its last expression in an lvalue context, even if the sub
SMcC> isn't called in one, which could lead for instance to unintended
SMcC> hash vivifications.
SC> So you're saying the code should be:
SC> #define LVRET (PL_op->op_next \
SC> && (PL_op->op_next->op_type == OP_LEAVESUBLV || \
SC> (PL_op->op_next->op_type == OP_RETURN) && ( \
SC> (&cxstack[cxstack_ix])->blk_sub.lval && \
SC> CvLVALUE((&cxstack[cxstack_ix])->blk_sub.cv))))
SC> ?
Mind if I abbreviate some?
NEXT = PL_op->op_next
LVSB = PL_op->op_next->op_type == OP_LEAVESUBLV
RET = PL_op->op_next->op_type == OP_RETURN
RUNT = ((&cxstack[cxstack_ix])->blk_sub.lval &&
        CvLVALUE((&cxstack[cxstack_ix])->blk_sub.cv))
Your original version was
(NEXT && (LVSB || (RET && RUNT)))
Above, you have
(NEXT && (LVSB || (RET) && RUNT))
What I think you need to be testing for correctness is
((NEXT && (LVSB || RET)) && RUNT)
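The difference between the groupings is easy to check mechanically. The following is only a sketch: the function names are invented for illustration, and each argument is a plain 0/1 standing in for one of the abbreviations NEXT, LVSB, RET and RUNT. Note that in C, && binds more tightly than ||, so `(LVSB || (RET) && RUNT)` groups the same way as `(LVSB || (RET && RUNT))`; only the two distinct groupings are modelled here.

```c
#include <assert.h>

/* Each argument is 0 or 1 and stands for one abbreviated condition. */

/* Original grouping: NEXT && (LVSB || (RET && RUNT)) */
int lvret_original(int next, int lvsb, int ret, int runt)
{
    return next && (lvsb || (ret && runt));
}

/* Suggested grouping: (NEXT && (LVSB || RET)) && RUNT */
int lvret_corrected(int next, int lvsb, int ret, int runt)
{
    return (next && (lvsb || ret)) && runt;
}

/* Spot checks of the cases discussed in the thread. */
void check_groupings(void)
{
    /* Next op is leavesublv, but the caller does not want an lvalue:
     * the original grouping still reports lvalue context. */
    assert(lvret_original(1, 1, 0, 0) == 1);
    assert(lvret_corrected(1, 1, 0, 0) == 0);

    /* Before a return the two groupings agree either way. */
    assert(lvret_original(1, 0, 1, 0) == lvret_corrected(1, 0, 1, 0));
    assert(lvret_original(1, 0, 1, 1) == lvret_corrected(1, 0, 1, 1));
}
```

The only inputs on which the two disagree are those where the next op is a leavesublv and the run-time check is false, which is the case where the last expression of an :lvalue sub gets lvalue treatment even though the caller did not ask for it.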
SC> I think I could believe that.
SC> Could you provide an example of a test failure which that would
SC> fix?
I haven't actually compiled your code, but I was thinking of something
like
    sub f :lvalue { $h{foo} };
    print f(); # undef
    print exists $h{foo} ? "not ok\n" : "ok\n";
SMcC> All that you can do at compile time is figure out which ops
SMcC> might be the return values of lvalue subs
SC> That is *all* you can do. And you can only tell which *might* be
SC> return values. And usually not even that. *Any* op might be the
SC> return value. Trying to figure it out at compile time *will not
SC> work*.
SMcC> , but if you only look at the context stack for them, the
SMcC> overhead for non-lvalue programs would be at most a single flag
SMcC> check (or zero, if you made separate pp functions).
SC> Are you sure you mean this? I *am* looking at the context stack
SC> for them.
I think I ran afoul of some ambiguities in English there. I'm saying
that if the only OPs for which you looked at the context stack were
OPs that might be the return values of lvalue subs, as best as that
can be determined at compile time (in particular, OPs in functions
that aren't declared with :lvalue would certainly be excludable at
compile time), then you could achieve an implementation with little or
no overhead for programs that didn't use lvalue subs. My understanding
of what you're doing at the moment is that you're not only looking at
the context stack for OPs that could be the return value of an
lvalue subroutine, but you're also looking at the context stack for OPs
that could not be the return value of an lvalue sub, such as an
OP_AELEM return value of a non-lvalue sub.
SC> Try finding a way to
SC> programmatically find out what will be the return value from this,
SC> at compile time:
SC> sub foo :lvalue { if (time) { return $x } else { ; time ? $z =
SC> $y{$x} : $z } }
SC> Even finding the *possible* return values just by looking at the
SC> tree is highly non-trivial, and I haven't even used goto yet. And
SC> *that*'s a simple subroutine.
I think the best way to think about this is analogous to the way
scalar and list context are determined, by a recursive procedure that
walks down the OP tree. Consider the following rules:
1. If a function is :lvalue, its body could be an lvalue return.
2. If return() occurs in an :lvalue function, its argument could be an
lvalue return.
3. If a block {S1; S2; .. Sn;} could be an lvalue return, Sn could be
an lvalue return.
4. If A ? B : C could be an lvalue return, B and C could be lvalue
returns.
5. If if(A) B else C could be an lvalue return, B and C could be
lvalue returns (this is really the same as the last rule).
6. If A = B could be an lvalue return, then A could be an lvalue
return.
7. If a variable, an OP_AELEM, an OP_HELEM, .... could be an lvalue
return, mark it for special treatment.
When I apply these rules, what I end up marking in foo are the first
$x and both of the $z's. Are these what you had been thinking of as
the return values of foo? (If not, what we should be discussing is
what needs to be marked). (To tie this theoretical discussion in with
the implementation strategy I was mentioning before, 1 represents a
call to mod(block, OP_LEAVESUBLV) in newATTRSUB(), 2 would be a call
to mod() in ck_return(), 3-6 are covered by pre-existing code in
mod(), and 7 would be where you set OPpMAYBE_LVALUE).
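The seven rules can be sketched as a recursive walk over a toy syntax tree. This is an illustration only: the node kinds, the `struct node` layout and every function name below are invented for the sketch and are not Perl's real OP structures, mod(), or ck_return(). The `maybe_lvalue` field plays the role of OPpMAYBE_LVALUE, and `demo_foo()` hand-builds a tree for the foo example quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node kinds (hypothetical, not Perl's op types). */
enum kind { K_BLOCK, K_COND, K_ASSIGN, K_RETURN, K_VAR, K_AELEM, K_HELEM, K_OTHER };

struct node {
    enum kind    kind;
    struct node *first;         /* first child */
    struct node *sibling;       /* next child of the same parent */
    int          maybe_lvalue;  /* stands in for OPpMAYBE_LVALUE */
};

static struct node *nth_kid(struct node *n, int i)
{
    struct node *k = n->first;
    while (k != NULL && i-- > 0)
        k = k->sibling;
    return k;
}

static struct node *last_kid(struct node *n)
{
    struct node *k = n->first;
    while (k != NULL && k->sibling != NULL)
        k = k->sibling;
    return k;
}

/* Rules 3-7: n is in a position that could be an lvalue return. */
static void mark_lvalue_return(struct node *n)
{
    if (n == NULL)
        return;
    switch (n->kind) {
    case K_BLOCK:                           /* rule 3: last statement */
        mark_lvalue_return(last_kid(n));
        break;
    case K_COND:                            /* rules 4 and 5: both arms */
        mark_lvalue_return(nth_kid(n, 1));
        mark_lvalue_return(nth_kid(n, 2));
        break;
    case K_ASSIGN:                          /* rule 6: left-hand side */
        mark_lvalue_return(nth_kid(n, 0));
        break;
    case K_VAR: case K_AELEM: case K_HELEM: /* rule 7: mark it */
        n->maybe_lvalue = 1;
        break;
    default:                                /* returns are found by the scan */
        break;
    }
}

/* Rule 2: any return() inside the sub marks its argument. */
static void scan_returns(struct node *n)
{
    struct node *k;
    if (n == NULL)
        return;
    if (n->kind == K_RETURN)
        mark_lvalue_return(n->first);
    for (k = n->first; k != NULL; k = k->sibling)
        scan_returns(k);
}

/* Rule 1: the body of an :lvalue sub could be an lvalue return. */
void mark_lvalue_sub(struct node *body)
{
    assert(body != NULL);
    mark_lvalue_return(body);
    scan_returns(body);
}

/* sub foo :lvalue { if (time) { return $x } else { time ? $z = $y{$x} : $z } }
 * Result bits: 0 = returned $x, 1 = assigned $z, 2 = bare $z,
 *              3 = $y{$x}, 4 = the $x used as hash key. */
unsigned demo_foo(void)
{
    struct node x_ret  = { K_VAR,    NULL,    NULL,    0 };
    struct node ret    = { K_RETURN, &x_ret,  NULL,    0 };
    struct node z2     = { K_VAR,    NULL,    NULL,    0 };
    struct node x_key  = { K_VAR,    NULL,    NULL,    0 };
    struct node helem  = { K_HELEM,  &x_key,  NULL,    0 };
    struct node z1     = { K_VAR,    NULL,    &helem,  0 };
    struct node assign = { K_ASSIGN, &z1,     &z2,     0 };
    struct node time2  = { K_OTHER,  NULL,    &assign, 0 };
    struct node tern   = { K_COND,   &time2,  NULL,    0 };
    struct node else_b = { K_BLOCK,  &tern,   NULL,    0 };
    struct node then_b = { K_BLOCK,  &ret,    &else_b, 0 };
    struct node time1  = { K_OTHER,  NULL,    &then_b, 0 };
    struct node ifnode = { K_COND,   &time1,  NULL,    0 };
    struct node body   = { K_BLOCK,  &ifnode, NULL,    0 };

    mark_lvalue_sub(&body);
    return (unsigned)x_ret.maybe_lvalue
         | ((unsigned)z1.maybe_lvalue    << 1)
         | ((unsigned)z2.maybe_lvalue    << 2)
         | ((unsigned)helem.maybe_lvalue << 3)
         | ((unsigned)x_key.maybe_lvalue << 4);
}
```

On this toy tree the walk marks the returned $x and both $z's (bits 0 to 2) and leaves $y{$x} and its key alone, which matches the hand-derived result described above.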
I don't think gotos actually make it any more complicated, since no
return value can persist over a goto -- the only destination a goto
can take you to is a nextstate (since nextstates are where the labels
are), and nextstates always clear off the stack. If you goto the place
after the last statement of your sub, the effect is to just return
nothing.
-- Stephen