develooper Front page | perl.perl5.porters | Postings from February 2013

NWCLARK TPF grant report #71

Thread Next
From:
Nicholas Clark
Date:
February 8, 2013 13:16
Subject:
NWCLARK TPF grant report #71
Message ID:
20130208131614.GG5656@plum.flirble.org
[Hours]		[Activity]
2013/01/07	Monday
 1.00		POSIX::strptime
 0.75		STRANGE_MALLOC
 1.00		hashing
 1.00		process, scalability, mentoring
 4.50		reading/responding to list mail
=====
 8.25

2013/01/08	Tuesday
 6.75		reading/responding to list mail
=====
 6.75

2013/01/09	Wednesday
 6.50		reading/responding to list mail
=====
 6.50

2013/01/10	Thursday
 6.75		clang
 0.50		module evictions
=====
 7.25

2013/01/11	Friday
 0.75		clang
 0.50		ext/B/t/optree_misc.t
 1.25		process, scalability, mentoring
 6.50		regcomp/setjmp
 0.50		smoke-me/threads-shared-stress
=====
 9.50

2013/01/12	Saturday
 1.00		investigating security tickets
 0.75		process, scalability, mentoring
 2.25		regcomp/setjmp
 0.25		timely destruction
=====
 4.25

Which I calculate is 42.50 hours

After catching up with some of the e-mail backlog, I tried having another
prod at the awkward clang/ASAN problem described last month.

It turned out that I missed something. Key to replicating the problem was that
one must turn on clang's optimiser. Of course, trying to debug a problem in the
C code, I'd gone off down the path of trying debugging builds, ie -g, which
by default don't enable the optimiser. (Debugging code with the optimiser
enabled tends to screw with your head, because the debugger reports execution
jumping backwards and forwards.) I hadn't tried compiling with -O on a local
machine, and that *does* replicate the problem. Better still, on that machine,
-O -g together still replicates the problem. So, now I can stick gdb on it:

    1..31
    ok 1 - $^H{foo} doesn't exist initially
    ok 2 - $^H doesn't contain HINT_LOCALIZE_HH initially
    ok 3 - $^H{foo} is now 'a'
    ok 4 - $^H contains HINT_LOCALIZE_HH while compiling
    ok 5 - $^H{foo} is now 'b'
    ok 6 - $^H{foo} restored to 'a'
    ok 7 - $^H{foo} doesn't exist while finishing compilation
    ok 8 - $^H doesn't contain HINT_LOCALIZE_HH while finishing compilation
    
    Breakpoint 1, 0x000000000042e440 in __asan_report_error ()
    (gdb) up
    #1  0x000000000042f7a7 in __asan_report_load8 ()
    (gdb)
    #2  0x00000000004fa030 in Perl_re_op_compile (patternp=<optimized out>,
        pat_count=-1292, expr=0xffffffffb6e, eng=<optimized out>, old_re=0x0,
        is_bare_re=0x7fffffffdba4, pm_flags=<optimized out>,
        orig_rx_flags=<optimized out>) at regcomp.c:5634
    5634        if (   old_re
    (gdb) p old_re
    $1 = (REGEXP * volatile) 0x0
    (gdb) p &old_re
    Address requested for identifier "old_re" which is in register $r8


Odd. How come a volatile variable thinks that it's in a register?
And it's an LVALUE, so why can't I take its address? What does the compiler
think - let's try some "printf" debugging:

    diff --git a/regcomp.c b/regcomp.c
    index a8b27dc..342e772 100644
    --- a/regcomp.c
    +++ b/regcomp.c
    @@ -5631,6 +5631,7 @@ Perl_re_op_compile(pTHX_ SV ** const patternp, int pat_cou
     
         /* return old regex if pattern hasn't changed */
 
    +    PerlIO_printf(PerlIO_stderr(), "%p %p\n", old_re, &old_re);
         if (   old_re
             && !recompile
            && !!RX_UTF8(old_re) == !!RExC_utf8


and we stop at the same place:

    1..31
    ok 1 - $^H{foo} doesn't exist initially
    ok 2 - $^H doesn't contain HINT_LOCALIZE_HH initially
    ok 3 - $^H{foo} is now 'a'
    ok 4 - $^H contains HINT_LOCALIZE_HH while compiling
    ok 5 - $^H{foo} is now 'b'
    ok 6 - $^H{foo} restored to 'a'
    ok 7 - $^H{foo} doesn't exist while finishing compilation
    ok 8 - $^H doesn't contain HINT_LOCALIZE_HH while finishing compilation
    0 7fffffffd760
    0 7fffffffd760
    
    Breakpoint 1, 0x000000000042e440 in __asan_report_error ()
    (gdb) up
    #1  0x000000000042f7a7 in __asan_report_load8 ()
    (gdb) up
    #2  0x00000000004fa060 in Perl_re_op_compile (patternp=<optimized out>, 
        pat_count=-1292, expr=0xffffffffb6e, eng=<optimized out>, 
        old_re=<optimized out>, is_bare_re=0x7fffffffdba4, 
        pm_flags=<optimized out>, orig_rx_flags=<optimized out>) at regcomp.c:5634
    5634        PerlIO_printf(PerlIO_stderr(), "%p %p\n", old_re, &old_re);
    (gdb) p old_re
    $2 = <optimized out>
    (gdb) p &old_re
    Can't take address of "old_re" which isn't an lvalue.

Can't take the address of it? That's despite the printf clearly showing that
its address is 0x7fffffffd760.

So something is clearly going badly wrong with the code generation here. It's
probably a bug in clang. But (a) it's not clear how to reduce it to a terser
test case (b) that's not actually helpful, as I really need to make this code
work on the clang we have here and now, as without this, we can't usefully
use ASAN any more to find bugs.

But some things started to come together. Whilst this specific failure is
probably a bug in clang, the function isn't exactly innocent. A couple of
people had already reported that they could get everything to pass if they
marked more variables as volatile. But playing whack-a-mole with volatile
is papering over the symptoms, not finding the cause. However, digging
further into the blame log for the function revealed a couple of previous
commits in the past few years adding volatile qualifiers, to fix compiler
warnings due to the addition of a all to setjmp(). So what's that all about?


The setjmp() relates to an optimisation added by commit bbd61b5ffb7621c2 in
September 2010. The backstory is that the compiled format for regular
expressions differs if it needs to store Unicode. (The documentation refers
to the human-readable form as the *pattern*, and the compiled form as the
*program*.) The Unicode representation for the program is larger, and cannot
be matched as efficiently, so it's preferable to use the tighter byte-based
format where possible. Unfortunately, it's not always possible to know for
sure which is the right decision until midway through parsing. If the
pattern contains literal Unicode, it's obvious that the program needs to
store Unicode. Otherwise, the parser optimistically assumes that the more
efficient representation can be used, and starts sizing on this
basis. However, if it then encounters something in the pattern which must be
stored as Unicode, such as an \x{...} escape sequence representing a
character literal, then this means that all previously calculated sizes need
to be redone, using values appropriate for the Unicode representation.

The problem is that the point where the parser discovers that it needs to
redo everything from the start is at least 4 functions down in the best
case, and possibly a lot further if the "problem" construction is within
parentheses groups. (The parser calls back to the top level to parse
parentheses.) Before the commit mentioned above, the recalculation was
implemented by setting a flag of "really, this needs Unicode", and simply
carrying on. The top level call spotted the flag, and restarted the parse.
Obviously, this is rather wasteful, particularly if the problem is detected
very early in the pattern.

So that commit refactored things so that the top level caller used setjmp()
to store a checkpoint, and upon detecting the need for a restart use
longjmp() to warp straight back to it. This skips doing needless work, which
is good. The problem is the *warp* bit - it's far more "warp" than "jump",
as the C compiler is not expecting the perfectly innocent looking setjmp()
function to return twice. C doesn't have continuations, so this doesn't
happen. But of course, it does, and it is breaking the rules, introducing
control flow that the compiler has no idea about, so you have to start
(effectively) lying to the compiler left right and centre, by declaring
variables volatile, forbidding it to do any (sane, reasonable, normal)
optimisation, so that the non-local control flow doesn't come unstuck. And
it's easy to miss one that matters, as the compiler doesn't warn.

So the question then became, can I implement the early return in some way
other than longjmp()? It sort of looked plausible from an initial look at
the functions - they only return NULL for special cases, and everywhere
checks the return value and propagates NULL onward. So hook into that.

All, that is, except this call to S_regbranch():

        if (is_define) 
            vFAIL("(?(DEFINE)....) does not allow branches");
        lastbr = reganode(pRExC_state, IFTHEN, 0); /* Fake one for optimizer. */
        regbranch(pRExC_state, &flags, 1,depth+1);

So is that a bug I've just found? Or something more subtle that doesn't
matter. This turned out to be something that took the better part of the next
week to figure out.

Nicholas Clark

Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About