On 01/30/2013 07:33 PM, Dave Mitchell wrote:
> Yeah. Just to be clear, I was pointing out the difficulties of automatic
> profiling: I expect it would usually be obvious when to apply UNLIKELY etc
> by hand.

That's an interesting point of view. Many a potential big win would be
based on knowledge of the relative frequency of occurrence of SV types.
That isn't at all obvious. Some things are: SvMAGICAL could almost always
be wrapped in UNLIKELY -- unless we're already in a branch that treats a
special case. This shows that those decisions are highly context
sensitive and thus aren't necessarily all that beginner-friendly.

Furthermore, I think that the potentially biggest wins could be gotten
from just a couple of places, some of which are strategic. For example,
Rafael did a fair amount of profiling at work on our main code base
recently and found that the single largest share of time (not the
majority overall) was spent in UTF8 and other string-concat related
functions. Thus, modifying SvGROW, for example

  # define SvGROW(sv,len) (SvLEN(sv) < (len) ? sv_grow(sv,len) : SvPVX(sv))

to read

  # define SvGROW(sv,len) (UNLIKELY(SvLEN(sv) < (len)) ? sv_grow(sv,len) : SvPVX(sv))

and relying on the fact that perl does aggressive geometric growth of
strings MAY be quite beneficial[1].

Furthermore, for UTF8 handling, there are lots of loops over characters,
checking

  if (UNI_IS_INVARIANT(uv))

which is:

  /* Is the representation of the Unicode code point 'c' the same regardless of
   * being encoded in UTF-8 or not? */
  #define UNI_IS_INVARIANT(c) (((UV)c) < 0x80)

Is it a fair assumption to think that most characters we deal with are
< 0x80? For the code I write, I'm pretty sure that yes, most characters
are UNI_IS_INVARIANT (yay, ASCII). But is it reasonable to discriminate
this way? If so, that could be a big win.

Another fun one: What about UNISKIP?

  #define UNISKIP(uv) ( (uv) < 0x80           ? 1 : \
                        (uv) < 0x800          ? 2 : \
                        (uv) < 0x10000        ? 3 : \
                        (uv) < 0x200000       ? 4 : \
                        (uv) < 0x4000000      ? 5 : \
                        (uv) < 0x80000000     ? 6 : \
                        (uv) < UTF8_QUAD_MAX  ? 7 : 13 )

There are easy cases, too: Anything that does if(...)croak() could be
considered unlikely because of the relative cost of exceptions. Anything
taint related could be considered unlikely.

Another judgement call: Do we want to slightly pessimize the already-slow
taint logic and potentially speed up normal code execution when taint is
off (but not compiled out)? I think yes, but it would be perfectly valid
to disagree.

Finally, looking at what I think is a very hot function:

  SV *Perl_newSV(pTHX_ const STRLEN len)

One could argue that if(len) should be unlikely. Would it be beneficial
to add a separate function that only allocates a new SV without checking
whether we should reserve string space? It seems to me like the majority
of SVs aren't born as strings, so that could be a change similar to
SvREFCNT_dec_NN in that it saves one branch in very hot code.

--Steffen

[1] Which makes me wonder whether gcc would make the same assumptions
about ternaries as it does with if(){}. Presumably yes.
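For anyone who wants to experiment with the SvGROW and UNI_IS_INVARIANT ideas outside of the perl core, here is a minimal standalone sketch. Everything in it is an assumption made for illustration: MY_LIKELY/MY_UNLIKELY, toy_buf, toy_grow, TOY_GROW, toy_append_byte and toy_utf8_len are hypothetical names, not perl API, and the toy buffer is not an SV. The only real facility used is the GCC/Clang __builtin_expect builtin. The hint sits on the condition of a ternary, as in the proposed SvGROW; whether that actually changes the generated code would have to be checked in the compiler output and measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for perl's LIKELY/UNLIKELY. __builtin_expect is a
 * real GCC/Clang builtin; other compilers get a plain pass-through. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_LIKELY(cond)   __builtin_expect(!!(cond), 1)
#  define MY_UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#  define MY_LIKELY(cond)   (cond)
#  define MY_UNLIKELY(cond) (cond)
#endif

/* Toy string buffer for illustration only; this is NOT perl's SV. */
typedef struct {
    char  *pv;   /* start of the allocation (NULL when empty) */
    size_t len;  /* allocated size in bytes */
    size_t cur;  /* bytes currently in use */
} toy_buf;

/* Slow path: grow geometrically (doubling) until at least `want` bytes fit. */
static char *toy_grow(toy_buf *b, size_t want)
{
    size_t newlen;
    char *p;

    assert(b != NULL);
    newlen = b->len ? b->len : 16;
    while (newlen < want) {
        if (MY_UNLIKELY(newlen > SIZE_MAX / 2))
            abort();                 /* doubling would overflow size_t */
        newlen *= 2;
    }
    p = realloc(b->pv, newlen);
    if (MY_UNLIKELY(p == NULL))
        abort();                     /* error paths are the "easy" cold case */
    b->pv  = p;
    b->len = newlen;
    return p;
}

/* Same shape as the proposed SvGROW: the hint is on the ternary condition.
 * With doubling growth, the reallocating branch is taken rarely. */
#define TOY_GROW(b, want) \
    (MY_UNLIKELY((b)->len < (want)) ? toy_grow((b), (want)) : (b)->pv)

/* Append one byte, growing the buffer only when it is full. */
static void toy_append_byte(toy_buf *b, char c)
{
    char *p = TOY_GROW(b, b->cur + 1);
    p[b->cur++] = c;
}

/* Bytes needed to encode n code points as UTF-8, treating the < 0x80 case
 * as the expected one. Assumes every code point is <= 0x10FFFF, so unlike
 * UNISKIP it stops at 4 bytes. */
static size_t toy_utf8_len(const unsigned long *cps, size_t n)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long uv = cps[i];
        if (MY_LIKELY(uv < 0x80))
            total += 1;
        else if (uv < 0x800)
            total += 2;
        else if (uv < 0x10000)
            total += 3;
        else
            total += 4;
    }
    return total;
}
```

Appending 100 bytes one at a time to an empty toy_buf takes the toy_grow branch only four times (16, 32, 64, 128), which is the property the UNLIKELY hint in SvGROW would be relying on. Comparing the assembly with and without the hints (for example via gcc -O2 -S) would be one way to look into the ternary question from the footnote.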