develooper Front page | perl.perl5.porters | Postings from September 2018

[perl #133511] [PATCH] tiny optimization of pp_divide

Thread Next
From:
Tomasz Konojacki
Date:
September 12, 2018 05:53
Subject:
[perl #133511] [PATCH] tiny optimization of pp_divide
Message ID:
rt-4.0.24-23967-1536731566-1640.133511-75-0@perl.org
# New Ticket Created by  Tomasz Konojacki 
# Please include the string:  [perl #133511]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/Ticket/Display.html?id=133511 >


This patch marginally speed-ups division of large integers in perl. The
change is extremely simple and explained in the commit message:

---
    pp_divide: use modulo instead of multiplication

    On most architectures with hardware integer division (like
    x86 or aarch64), division instruction returns both the remainder
    and the quotient. It means that performing modulo operation
    immediately after division using the same operands is 100% free.

    Essentially this commit changes "div" and "mul" into a single "div"
    instruction, which results in minor speed-up.
---

For more information, you can take a look at compiler explorer output:

multiplication version: https://gcc.godbolt.org/z/0yXBij
modulo version: https://gcc.godbolt.org/z/OMy2Ok

Here are the results of Porting/bench.pl with my patch, compared to
blead:

Key:
    Ir   Instruction read
    Dr   Data read
    Dw   Data write
    COND conditional branches
    IND  indirect branches
    _m   branch predict miss
    _m1  level 1 cache miss
    _mm  last cache (e.g. L3) miss
    -    indeterminate percentage (e.g. 1/0)

The numbers represent relative counts per loop iteration, compared to
blead at 100.0%.
Higher is better: for example, using half as many instructions gives 200%,
while using twice as many gives 50%.

large_int_with_int_result
288230376151711744 / 2

        blead my_patch
       ------ --------
    Ir 100.00   101.39
    Dr 100.00   100.00
    Dw 100.00   100.00
  COND 100.00   100.00
   IND 100.00   100.00

COND_m 100.00   100.00
 IND_m 100.00   100.00

 Ir_m1 100.00   100.00
 Dr_m1 100.00   100.00
 Dw_m1 100.00   100.00

 Ir_mm 100.00   100.00
 Dr_mm 100.00   100.00
 Dw_mm 100.00   100.00

large_int_with_float_result
288230376151711744 / 3

        blead my_patch
       ------ --------
    Ir 100.00   101.34
    Dr 100.00   100.00
    Dw 100.00   100.00
  COND 100.00   100.00
   IND 100.00   100.00

COND_m 100.00   100.00
 IND_m 100.00   100.00

 Ir_m1 100.00   100.00
 Dr_m1 100.00   100.00
 Dw_m1 100.00   100.00

 Ir_mm 100.00   100.00
 Dr_mm 100.00   100.00
 Dw_mm 100.00   100.00

small_int
4 / 2

        blead my_patch
       ------ --------
    Ir 100.00   100.00
    Dr 100.00   100.00
    Dw 100.00   100.00
  COND 100.00   100.00
   IND 100.00   100.00

COND_m 100.00   100.00
 IND_m 100.00   100.00

 Ir_m1 100.00   100.00
 Dr_m1 100.00   100.00
 Dw_m1 100.00   100.00

 Ir_mm 100.00   100.00
 Dr_mm 100.00   100.00
 Dw_mm 100.00   100.00

float
3.0 / 2

        blead my_patch
       ------ --------
    Ir 100.00   100.00
    Dr 100.00   100.00
    Dw 100.00   100.00
  COND 100.00   100.00
   IND 100.00   100.00

COND_m 100.00   100.00
 IND_m 100.00   100.00

 Ir_m1 100.00   100.00
 Dr_m1 100.00   100.00
 Dw_m1 100.00   100.00

 Ir_mm 100.00   100.00
 Dr_mm 100.00   100.00
 Dw_mm 100.00   100.00

AVERAGE

        blead my_patch
       ------ --------
    Ir 100.00   100.68
    Dr 100.00   100.00
    Dw 100.00   100.00
  COND 100.00   100.00
   IND 100.00   100.00

COND_m 100.00   100.00
 IND_m 100.00   100.00

 Ir_m1 100.00   100.00
 Dr_m1 100.00   100.00
 Dw_m1 100.00   100.00

 Ir_mm 100.00   100.00
 Dr_mm 100.00   100.00
 Dw_mm 100.00   100.00


Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About