
Re: Using __builtin_add_overflow and friends for overflow checking

From: Dave Mitchell
Date: October 24, 2017 08:29
Subject: Re: Using __builtin_add_overflow and friends for overflow checking
Message ID: 20171024082921.GO3083@iabyn.com

On Mon, Oct 23, 2017 at 11:34:16PM +0200, Lukas Mai wrote:
> I think it would be a good idea to use compiler intrinsics for overflow
> checks where available.
> 
> I've pushed a branch that implements this at mauke/overflow:
> https://perl5.git.perl.org/perl.git/shortlog/refs/heads/mauke/overflow
> 
> Dave: I've CC'd you directly because you last worked on this code (commit
> 230ee21f3e366901ce5769d324124c522df7ce8a, "faster add, subtract, multiply").
> 
> My changes affect pp_add, pp_subtract, and pp_multiply. I think the new code
> is nicer because it's easier to understand than all the low-level bit
> fiddling, and it passes all tests on my machine. However, I haven't done any
> benchmarks to see how it affects performance (if at all).
> 
> Things I need help with:
> 
> - code review

Note that Reini did something similar in cperl, although he disabled
my short-cut code in the presence of __builtin_mul_overflow etc, which was
a mistake (and is why the nbody benchmark runs about 30% faster on perl
compared with cperl - at least last time I looked).

At a cursory inspection the code looks good (although I haven't looked
closely at the main body (non-shortcut) part of the code).

Note that your '#ifdefs' probably need indenting with '#  ifdef'
in some places since there are already ifdefs surrounding the code.
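
As a rough illustration of both points (an intrinsic with a portable
fallback, and indenting nested preprocessor directives), here is a minimal
C sketch. It is not the code from the mauke/overflow branch: add_checked is
a hypothetical helper, and the detection logic at the top is only an assumed
stand-in for whatever Configure writes to config.h. Only
HAS_BUILTIN_ADD_OVERFLOW and __builtin_add_overflow come from this thread.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed stand-in for the config.h probe; nested directives are
 * indented after the '#', one level per enclosing conditional. */
#ifndef HAS_BUILTIN_ADD_OVERFLOW
#  if defined(__has_builtin)
#    if __has_builtin(__builtin_add_overflow)
#      define HAS_BUILTIN_ADD_OVERFLOW
#    endif
#  elif defined(__GNUC__) && __GNUC__ >= 5
#    define HAS_BUILTIN_ADD_OVERFLOW
#  endif
#endif

/* Hypothetical helper: returns true and stores a + b in *result if the
 * sum fits in a long; returns false on overflow, in which case *result
 * should not be relied on. */
static bool add_checked(long a, long b, long *result)
{
#ifdef HAS_BUILTIN_ADD_OVERFLOW
    /* The intrinsic returns true when the addition overflowed. */
    return !__builtin_add_overflow(a, b, result);
#else
    /* Portable fallback: test against the limits before adding, so the
     * signed addition itself can never overflow. */
    if ((b > 0 && a > LONG_MAX - b) || (b < 0 && a < LONG_MIN - b))
        return false;
    *result = a + b;
    return true;
#endif
}
```

A caller would fall back to floating-point (NV) arithmetic when add_checked
returns false, which is roughly what the existing overflow checks in
pp_add are for.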

> - benchmarks (compared to a17768d7c7b82c136fbeacd85db3451973a8007a)

Are you familiar with Porting/bench.pl and t/perf/benchmarks?

I tried running it as follows:

I had 3 executables:

    /tmp/perl-a1776 - just before your 3 commits
    /tmp/perl-of    - the tip of your branch
    /tmp/perl-no-of - ditto with this diff applied:

    $ diff -u config.h- config.h
    +#if 0
     #define HAS_BUILTIN_ADD_OVERFLOW	/**/
     #define HAS_BUILTIN_SUB_OVERFLOW	/**/
     #define HAS_BUILTIN_MUL_OVERFLOW	/**/
    +#endif

Run the arith subset of the benchmarks against the 3 perls, using 8 CPUs
in parallel, and write the results to a file:

    $ perl Porting/bench.pl -w /tmp/bm_num -j 8 --tests=/expr::arith::/ -v \
            /tmp/perl-a1776 /tmp/perl-no-of /tmp/perl-of

Read the results back and display them, sorted by the number of conditional
branches in the right-most column:

    $ perl Porting/bench.pl -r /tmp/bm_num --sort=COND:-1 > /tmp/bm_num.out

You can similarly do --sort=Ir:-1 etc

On my hardware, this shows that the no-builtins build shows no slowdown
(good!) and the builtins build shows a modest improvement in the number
of instruction reads and/or conditional branches (again, good).

Here are a couple of the best results:

    expr::arith::add_lex_ii
    add two integers and assign to a lexical var

           /tmp/perl-a1776 /tmp/perl-no-of /tmp/perl-of
           --------------- --------------- ------------
        Ir          100.00          100.00       106.19
        Dr          100.00          100.00       100.00
        Dw          100.00          100.00       100.00
      COND          100.00          100.00       100.00
       IND          100.00          100.00       100.00

    expr::arith::add_lex_ss
    add two short strings and assign to a lexical var

           /tmp/perl-a1776 /tmp/perl-no-of /tmp/perl-of
           --------------- --------------- ------------
        Ir          100.00          100.00       101.96
        Dr          100.00          100.00       100.00
        Dw          100.00          100.00       100.00
      COND          100.00          100.00       104.30
       IND          100.00          100.00       100.00

And here's the average. It includes many non-add/sub/mult benchmarks, which
dilute the numbers.

    AVERAGE

           /tmp/perl-a1776 /tmp/perl-no-of /tmp/perl-of
           --------------- --------------- ------------
        Ir          100.00          100.00       101.01
        Dr          100.00          100.00       100.00
        Dw          100.00          100.00       100.00
      COND          100.00          100.00       100.22
       IND          100.00          100.00       100.00
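
For reading the tables: as far as I understand bench.pl's output, each
figure is a count relative to the first perl listed (shown as 100.00) and
inverted so that higher is better; half as many instruction reads would
show as 200.00. A minimal C sketch of that arithmetic follows. The name
relative_pct is hypothetical, and the convention itself is an assumption
about bench.pl rather than something stated in this message.

```c
#include <assert.h>

/* Assumed bench.pl convention: score = 100 * baseline / measured, so a
 * build that needs fewer events than the baseline scores above 100. */
static double relative_pct(double baseline_count, double count)
{
    assert(count > 0.0);   /* a zero count has no meaningful percentage */
    return 100.0 * baseline_count / count;
}
```

Under that convention, the 106.19 for Ir in add_lex_ii would correspond to
roughly 94% of the baseline's instruction reads (100 / 106.19).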

-- 
I before E. Except when it isn't.
