Re: [perl #78288] ref and other ops are inefficient in boolean context
From: Dave Mitchell
Date: July 27, 2017 10:50
Subject: Re: [perl #78288] ref and other ops are inefficient in boolean context
Message ID: 20170727105029.GH3091@iabyn.com
On Fri, Jan 06, 2017 at 04:44:27PM +0000, Dave Mitchell wrote:
> I've had a quick look through our ops list to see what other ops might
> benefit from special-case handling in boolean context. There aren't
> actually all that many. OP_KEYS and OP_VALUES could probably be tidied up a
> bit and length($utf8_string) would probably benefit from not needing to
> do a bytes -> chars length conversion. Other than one or two other minor
> cases, not a lot leaped out at me. It's possible that ops which return
> integer values might be able to do it more efficiently if they just return
> PL_sv_yes/no rather than having to set a PADTMP to an integer value.
>
> Anyway, I'll look further into doing other ops after 5.26.
Which I've now done and just pushed with this merge commit:
commit c1a6686e7b19b19f65ba89a90c0f0bf57606197f
Merge: 0283ad9 cd5acdd
Author: David Mitchell <davem@iabyn.com>
AuthorDate: Thu Jul 27 11:30:50 2017 +0100
Commit: David Mitchell <davem@iabyn.com>
CommitDate: Thu Jul 27 11:30:50 2017 +0100
[MERGE] various boolean-related optimisations
This branch contains about 50 commits, which collectively optimise
various aspects of perl's behaviour when dealing with boolean values
or ops that are called in boolean context.
The main changes are:
* A &PL_sv_zero variable has been added. This is a new per-interpreter
immortal SV, very similar to &PL_sv_no, except that it has a string value
of "0" rather than "". As well as being directly usable in cases where
code might otherwise need to do newSViv(0), it has a more subtle use in
ops that handle boolean context directly. For example in
    sub f {
        ....;
        if (%h) { .... }
    }
the 'if' statement is compiled using OP_AND, so is equivalent to
    %h && do { .... }
If %h is empty, then the result of the boolean expression should be 0
rather than &PL_sv_no, and this value gets returned to the caller, which
may expect a scalar result: and what it expects won't be known until run
time. So by returning &PL_sv_yes and &PL_sv_zero rather than yes and no,
we increase the number of places where it is safe to return a boolean
value.
A downside of &PL_sv_zero is that if assigned to a variable, that variable
gets int, num and string values rather than just an int value.
* SvTRUE() is now more efficient.
This macro is called in places like pp_and, pp_not etc. It has a long list
of conditions which it goes through to determine the truthiness of an SV,
such as whether it has a string value, and if so whether the length is
zero, or the length is 1 and the string's value is "0". It turns out that
the immortals like &PL_sv_yes fare really badly here: they have to go
through nearly every check to finally determine their value. To get round
this, I have made it very quick to check whether an SV is one of the
immortals, and if so whether it is true. This has been done by ensuring
that PL_sv_undef, PL_sv_no, PL_sv_zero and PL_sv_yes are all contiguous in
memory, so that a quick single address comparison is enough to determine
immortality, and then comparing the address against &PL_sv_yes is enough
to determine whether it's true.
In particular in non-multiplicity builds, PL_sv_undef etc have been
replaced with the array PL_sv_immortals[4], with PL_sv_undef #defed to
PL_sv_immortals[0] etc.
Also, the SvOK() macro has been made more efficient by restoring the POK
flag on REGEXP svs and PVLVs which hold a regex. This removes the two
extra checks that SvOK() had to do each time. This has been done by
changing the way that PVLVs-holding-a-regex are implemented. The downside
of this change is that ReANY() now includes a single conditional. To
ameliorate that, places like pp_match() have been tweaked to only fetch
ReANY() once where possible.
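
The contiguous-immortals check described above can be sketched in standalone C. This is only an illustration, not perl's actual code: the type toy_sv, the array toy_immortals, the slot order and all function names are hypothetical, and the "slow path" models only the string checks mentioned in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an SV: only an optional string value (assumption). */
typedef struct {
    const char *pv;   /* NULL models "undefined" */
} toy_sv;

/* All immortals live in one array, so they are contiguous in memory.
 * The slot order here is arbitrary and chosen for the illustration. */
enum { IMM_UNDEF = 0, IMM_NO, IMM_ZERO, IMM_YES, IMM_COUNT };
static toy_sv toy_immortals[IMM_COUNT] = {
    { NULL }, { "" }, { "0" }, { "1" }
};

/* One unsigned range check on the address decides immortality. */
bool toy_is_immortal(const toy_sv *sv) {
    uintptr_t base = (uintptr_t)&toy_immortals[0];
    return (uintptr_t)sv - base < sizeof toy_immortals;
}

/* Slow path: undefined, "" and "0" are false, anything else is true. */
bool toy_slow_true(const toy_sv *sv) {
    if (sv->pv == NULL) return false;
    size_t len = strlen(sv->pv);
    if (len == 0) return false;
    if (len == 1 && sv->pv[0] == '0') return false;
    return true;
}

/* Fast path first: among the immortals only the "yes" slot is true. */
bool toy_sv_true(const toy_sv *sv) {
    assert(sv != NULL);
    if (toy_is_immortal(sv))
        return sv == &toy_immortals[IMM_YES];
    return toy_slow_true(sv);
}
```

The point of the layout is that an immortal never reaches the slow path: two pointer comparisons settle its truth value.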
* the OP_KEYS op is now optimised away in void and scalar context.
Since a hash in scalar context was changed so that it no longer returns a
bucket count but instead just a key count, '%h' and 'keys %h' in
void/boolean/scalar context are now very similar. So for 'keys %h', rather
than calling pp_padhv+pp_keys, just call pp_padhv with an OPpPADHV_ISKEYS
flag set. Similarly for pp_rv2hv. As well as skipping an extra op call,
this brings the existing boolean-context optimisations of '%h' to 'keys
%h' too. In particular, 'keys %tied' in boolean context now calls SCALAR()
if available, or FIRSTKEY() otherwise, rather than iterating through the
whole hash.
I have also given OP_RV2HV a targ so that it can return integer values
more efficiently.
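
As a rough model of the 'keys %tied' behaviour in boolean context described above, the following C sketch dispatches to a SCALAR-like hook if one exists and otherwise to a FIRSTKEY-like hook. The struct toy_tied, its fields and the function names are all hypothetical; real tied hashes call Perl methods and test the truth of the returned scalar, which is reduced to an int here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a tied hash: method slots plus a counter of method calls. */
typedef struct toy_tied {
    /* Optional SCALAR-like hook: non-zero if any keys exist; may be NULL. */
    int (*scalar)(struct toy_tied *self);
    /* Required FIRSTKEY-like hook: first key, or NULL if the hash is empty. */
    const char *(*firstkey)(struct toy_tied *self);
    const char *const *keys;
    size_t nkeys;
    unsigned method_calls;
} toy_tied;

int toy_scalar(toy_tied *self) {
    self->method_calls++;
    return self->nkeys != 0;
}

const char *toy_firstkey(toy_tied *self) {
    self->method_calls++;
    return self->nkeys ? self->keys[0] : NULL;
}

/* 'keys %tied' in boolean context: one method call, no full iteration. */
bool toy_tied_keys_bool(toy_tied *h) {
    assert(h != NULL && h->firstkey != NULL);
    if (h->scalar != NULL)
        return h->scalar(h) != 0;
    return h->firstkey(h) != NULL;
}
```

Either way the answer costs a single method call, whereas counting the keys would need one call per key.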
* Various integer-returning ops are now flagged when in boolean context,
which means at runtime they can just return &PL_sv_yes/&PL_sv_zero rather
than setting a targ to an integer value, or for ops without targs, having
to create a new integer-valued mortal. As well as being quicker to return
a value, this works well with SvTRUE() which now recognises immortals
quickly. Also for ops like length() and pos(), it doesn't need to convert
between byte and char offsets; the fact that the offset is non-zero is
sufficient.
These ops are:
    OP_AASSIGN
    OP_GREPWHILE
    OP_LENGTH
    OP_PADAV
    OP_POS
    OP_RV2AV
    OP_SUBST
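
To illustrate why length() benefits, here is a minimal C sketch, assuming valid UTF-8 input; the function names are hypothetical and nothing here is taken from pp_length(). Counting characters needs a pass over the bytes, while the boolean answer only needs the byte length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Character count of a valid UTF-8 buffer: count every byte that is not
 * a continuation byte (10xxxxxx). This is O(nbytes). */
size_t toy_utf8_chars(const unsigned char *s, size_t nbytes) {
    assert(s != NULL || nbytes == 0);
    size_t n = 0;
    for (size_t i = 0; i < nbytes; i++)
        if ((s[i] & 0xC0) != 0x80)
            n++;
    return n;
}

/* Boolean context: a valid UTF-8 string has zero characters exactly when
 * it has zero bytes, so the byte length alone answers the question. */
bool toy_length_is_true(size_t nbytes) {
    return nbytes != 0;
}
```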
Also, index() doesn't return a boolean value, but for no match it returns
-1. So for code like
    if (index(...) != -1) { ... }
optimise away the OP_CONST and the comparison op (OP_NE here), and flag
the index op to return a boolean value.
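
The index() rewrite can be pictured with a small C analogy built on strstr(). The names toy_index and toy_index_found are hypothetical, and this is not how pp_index() is written; it only shows that "compare against -1" collapses to "was there a match".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Ordinary index(): offset of needle in hay, or -1 if absent.
 * Offset 0 is a valid match, so the result is not itself a boolean. */
long toy_index(const char *hay, const char *needle) {
    assert(hay != NULL && needle != NULL);
    const char *p = strstr(hay, needle);
    return p ? (long)(p - hay) : -1L;
}

/* Boolean-flagged variant standing in for 'toy_index(...) != -1':
 * no constant, no comparison, no integer result to build. */
bool toy_index_found(const char *hay, const char *needle) {
    assert(hay != NULL && needle != NULL);
    return strstr(hay, needle) != NULL;
}
```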
* Speed up OP_ITER
OP_ITER is called for every iteration of a for loop or similar. Its job is
to iterate the loop variable once, then return &PL_sv_yes or &PL_sv_no
depending on whether it's the last iteration. OP_ITER is always followed
by OP_AND, which examines the truth value on the stack, and returns
op_next or op_other accordingly. Now, pp_iter() just asserts that
PL_op->op_next is an OP_AND, and returns PL_op->op_next->op_next or
PL_op->op_next->op_other directly, skipping the PL_sv_yes/no push/pop and
eliminating the call to pp_and().
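
The OP_ITER shortcut can be sketched with a toy op tree in C. The types toy_op and toy_loop and the function toy_pp_iter are hypothetical stand-ins; the sketch assumes, as in the description above, that the op after the iterator is always the AND op, whose op_other leads into the loop body and whose op_next leaves the loop.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { T_ITER, T_AND, T_BODY, T_END } toy_optype;

typedef struct toy_op {
    toy_optype type;
    struct toy_op *op_next;   /* normal successor */
    struct toy_op *op_other;  /* branch target; only used by T_AND */
} toy_op;

/* Loop state: iterate i over 0 .. limit-1. */
typedef struct {
    int i;
    int limit;
} toy_loop;

/* Advance the loop and return the op to run next, jumping over the
 * following AND op instead of pushing a boolean for it to test. */
toy_op *toy_pp_iter(toy_op *op, toy_loop *loop) {
    assert(op->op_next != NULL && op->op_next->type == T_AND);
    if (loop->i < loop->limit) {
        loop->i++;
        return op->op_next->op_other;  /* more items: go to the body */
    }
    return op->op_next->op_next;       /* exhausted: leave the loop */
}
```

With this shape the AND op is still present in the tree but is never executed, which saves one op dispatch and one stack push/pop per iteration.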
As part of these changes, I have moved pp_padav() and pp_padhv() from pp.c
to pp_hot.c, moved some common code into a new function
S_padhv_rv2hv_common(), created a new (non-API) function Perl_hv_pushkv()
which pushes a hash's keys or values or both onto the stack, and reduced
the number of callers of Perl_do_kv() (which was acting as both a pp
function for several ops and as a general-purpose function too).
Of the 360 or so tests in t/perf/benchmarks, the following number of
tests had their COND field changed from 100% to the following ranges:
     36 @  96.55% ..  99.99%
    245 @ 100.00% .. 100.99%
     28 @ 101.00% .. 109.99%
      7 @ 110.00% .. 119.99%
     10 @ 120.00% .. 129.99%
     29 @ 130.00% .. 199.99%
      4 @ 200.00% .. 299.99%
      1 @ 314.29%
so about 10% of tests became marginally slower - usually due to one extra
conditional in an op to test for a private BOOL flag or ReANY(); about 70%
of tests were almost unaffected, while 20% of tests showed improvement,
most with considerable improvement, and a few with spectacular improvement.
(The 314% is for an empty @lexical tested in boolean context).
--
"I do not resent criticism, even when, for the sake of emphasis,
it parts for the time with reality".
-- Winston Churchill, House of Commons, 22nd Jan 1941.