
Re: Using __builtin_add_overflow and friends for overflow checking

From:
Lukas Mai
Date:
October 24, 2017 19:52
Subject:
Re: Using __builtin_add_overflow and friends for overflow checking
Message ID:
8971e1db-4949-f8e3-4e81-be637ba78629@gmail.com
On 24.10.2017 at 10:29, Dave Mitchell wrote:
> On Mon, Oct 23, 2017 at 11:34:16PM +0200, Lukas Mai wrote:
>> I think it would be a good idea to use compiler intrinsics for overflow
>> checks where available.
>>
>> I've pushed a branch that implements this at mauke/overflow:
>> https://perl5.git.perl.org/perl.git/shortlog/refs/heads/mauke/overflow
>>
>> Dave: I've CC'd you directly because you last worked on this code (commit
>> 230ee21f3e366901ce5769d324124c522df7ce8a, "faster add, subtract, multiply").
>>
>> My changes affect pp_add, pp_subtract, and pp_multiply. I think the new code
>> is nicer because it's easier to understand than all the low-level bit
>> fiddling, and it passes all tests on my machine. However, I haven't done any
>> benchmarks to see how it affects performance (if at all).
>>
>> Things I need help with:
>>
>> - code review
> 
> Note that Reini did something similar in cperl, although he disabled
> my short-cut code in the presence of __builtin_mul_overflow etc, which was
> a mistake (and is why the nbody benchmark runs about 30% faster on perl
> compared with cperl - at least last time I looked).
> 
> At a cursory inspection the code looks good (although I haven't looked
> closely at the main body (non-shortcut) part of the code).
> 
> Note that your '#ifdefs' probably need indenting with '#  ifdef'
> in some places since there's already ifdefs surrounding the code.

I didn't do that because some of the existing code didn't indent its 
nested #ifdefs either.
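
For readers unfamiliar with the convention under discussion, here is a minimal sketch of a nested guard written in the '#  ifdef' indentation style, wrapped around an overflow-checked addition. The macro HAS_BUILTIN_ADD_OVERFLOW and the helper checked_add are hypothetical names chosen for illustration; they are not taken from the mauke/overflow branch, and the fallback is a plain range check rather than the bit-fiddling code in pp_add.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical feature macro; not the name used in the actual branch.
 * Nested directives are indented after the '#', as suggested above. */
#if defined(__GNUC__) || defined(__clang__)
#  if defined(__has_builtin)
#    if __has_builtin(__builtin_add_overflow)
#      define HAS_BUILTIN_ADD_OVERFLOW 1
#    endif
#  elif __GNUC__ >= 5
#    define HAS_BUILTIN_ADD_OVERFLOW 1
#  endif
#endif

/* Returns true if a + b overflows a long. When it returns false,
 * *out holds the sum; callers should ignore *out on overflow. */
static bool checked_add(long a, long b, long *out)
{
#ifdef HAS_BUILTIN_ADD_OVERFLOW
    /* GCC/Clang intrinsic: performs the add and reports overflow. */
    return __builtin_add_overflow(a, b, out);
#else
    /* Portable fallback: test the range before adding, so the
     * signed addition itself can never overflow. */
    if ((b > 0 && a > LONG_MAX - b) || (b < 0 && a < LONG_MIN - b))
        return true;
    *out = a + b;
    return false;
#endif
}
```

A caller would take the fast integer path when checked_add returns false and fall back to a wider or floating-point representation otherwise.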

>> - benchmarks (compared to a17768d7c7b82c136fbeacd85db3451973a8007a)
> 
> Are you familiar with Porting/bench.pl and t/perf/benchmarks?

No. Every time I run cachegrind, it fails immediately with "unhandled 
instruction bytes 0x67 0xE8 0xD3 0x8B" or similar and produces a 
vgcore.NNN dump file.


> 
> On my hardware, this shows that the no-builtins build shows no slowdown
> (good!) and the builtins build shows a modest improvement in the number
> of instruction reads and/or conditional branches (again, good).

Cool. That's what I was hoping for.

-- 
Lukas Mai <plokinom@gmail.com>
