Request for comments on commit 129ccace6b45e3574c0b430b1fbcc7f8d0aa8e50, speed up grok_number
From: Karl Williamson
Date: January 18, 2020 20:01
Subject: Request for comments on commit 129ccace6b45e3574c0b430b1fbcc7f8d0aa8e50, speed up grok_number
Message ID: 45097e0d-5ef9-4762-da90-622ff38657a4@khwilliamson.com
I meant that to be a PR.
I sped it up in every way I could think of. In particular, do you disagree
with any of my choices for LIKELY/UNLIKELY branch prediction?
Here's a link to the commit diffs:
https://github.com/Perl/perl5/commit/129ccace6b45e3574c0b430b1fbcc7f8d0aa8e50
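For readers unfamiliar with branch hints, here is a minimal sketch of how LIKELY/UNLIKELY style macros are typically used. It is not the code from the commit: the macro definitions below are illustrative (Perl's own live in perl.h and may differ in detail), and is_simple_integer is a hypothetical stand-in, not grok_number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative definitions only (assumption: GCC or Clang).
 * __builtin_expect tells the compiler which outcome to lay out as the
 * fall-through path; it does not change the result of the test. */
#if defined(__GNUC__) || defined(__clang__)
#  define LIKELY(cond)   __builtin_expect(!!(cond), 1)
#  define UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#  define LIKELY(cond)   (cond)
#  define UNLIKELY(cond) (cond)
#endif

/* Hypothetical parser, NOT grok_number: accepts an optional '-' followed
 * by one or more decimal digits and nothing else. */
static bool is_simple_integer(const char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL);

    if (UNLIKELY(len == 0))
        return false;               /* empty input is assumed to be rare */
    if (s[0] == '-')
        i = 1;                      /* no hint: the sign is a coin toss */
    if (UNLIKELY(i == len))
        return false;               /* a lone "-" */

    for (; i < len; i++) {
        /* Assumption: most input is all digits, so the reject path is cold. */
        if (UNLIKELY(s[i] < '0' || s[i] > '9'))
            return false;
    }
    return true;
}
```

A wrong hint costs little on hardware with a good dynamic predictor, but it can move the hot path out of line, so each hint is a bet about typical input. That is why the choices are worth reviewing case by case.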
The cachegrind comparison of this commit against blead is:
Key:
    Ir     Instruction read
    Dr     Data read
    Dw     Data write
    COND   conditional branches
    IND    indirect branches
    _m     branch predict miss
    _m1    level 1 cache miss
    _mm    last cache (e.g. L3) miss
    -      indeterminate percentage (e.g. 1/0)
The numbers represent raw counts per loop iteration.
ten_digits
1256908743
         blead  switch  Ratio %
         -----  ------  -------
Ir       817.0   800.0    102.1
Dr       242.0   241.0    100.4
Dw       141.0   142.0     99.3
COND     122.0   112.0    108.9
IND       11.0    11.0    100.0
COND_m     1.0     1.0    100.0
IND_m      7.0     7.0    100.0
Ir_m1      0.0     0.0    100.0
Dr_m1      0.0     0.0    100.0
Dw_m1      0.0     0.0    100.0
Ir_mm      0.0     0.0    100.0
Dr_mm      0.0     0.0    100.0
Dw_mm      0.0     0.0    100.0
nine_digits
124578902
         blead  switch  Ratio %
         -----  ------  -------
Ir       801.0   737.0    108.7
Dr       240.0   225.0    106.7
Dw       140.0   134.0    104.5
COND     119.0    98.0    121.4
IND       11.0    11.0    100.0
COND_m     1.0     1.0    100.0
IND_m      7.0     7.0    100.0
Ir_m1      0.0     0.0    100.0
Dr_m1      0.0     0.0    100.0
Dw_m1      0.0     0.0    100.0
Ir_mm      0.0     0.0    100.0
Dr_mm      0.0     0.0    100.0
Dw_mm      0.0     0.0    100.0
negative_9_digits
-124578902
         blead  switch  Ratio %
         -----  ------  -------
Ir       801.0   744.0    107.7
Dr       239.0   225.0    106.2
Dw       141.0   135.0    104.4
COND     118.0   100.0    118.0
IND       11.0    11.0    100.0
COND_m     1.0     1.0    100.0
IND_m      7.0     7.0    100.0
Ir_m1      0.0     0.0    100.0
Dr_m1      0.0     0.0    100.0
Dw_m1      0.0     0.0    100.0
Ir_mm      0.0     0.0    100.0
Dr_mm      0.0     0.0    100.0
Dw_mm      0.0     0.0    100.0
three_digits
123
         blead  switch  Ratio %
         -----  ------  -------
Ir       735.0   687.0    107.0
Dr       234.0   220.0    106.4
Dw       134.0   128.0    104.7
COND     107.0    92.0    116.3
IND       11.0    12.0     91.7
COND_m     1.0     1.0    100.0
IND_m      7.0     7.0    100.0
Ir_m1      0.0     0.0    100.0
Dr_m1      0.0     0.0    100.0
Dw_m1      0.0     0.0    100.0
Ir_mm      0.0     0.0    100.0
Dr_mm      0.0     0.0    100.0
Dw_mm      0.0     0.0    100.0
three_digits_then_garbage
123foo
         blead  switch  Ratio %
         -----  ------  -------
Ir       667.0   669.0     99.7
Dr       206.0   206.0    100.0
Dw       107.0   108.0     99.1
COND     100.0    95.0    105.3
IND       10.0    11.0     90.9
COND_m     0.0     0.0    100.0
IND_m      7.0     7.0    100.0
Ir_m1      0.0     0.0    100.0
Dr_m1      0.0     0.0    100.0
Dw_m1      0.0     0.0    100.0
Ir_mm      0.0     0.0    100.0
Dr_mm      0.0     0.0    100.0
Dw_mm      0.0     0.0    100.0
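The "Ratio %" column in the tables above appears to be blead divided by switch, times 100, so values over 100 mean the new code does less work (for example 817.0 / 800.0 gives 102.1). The sketch below reproduces that arithmetic. The helper names are hypothetical and not part of any benchmarking tool; treating 0/0 as 100.0 is inferred from the tables, and returning "indeterminate" for a non-zero count over zero follows the "-" entry in the key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: computes blead / switch * 100 into *out.
 * Returns false when the percentage is indeterminate (non-zero / 0). */
static bool ratio_percent(double blead, double sw, double *out)
{
    assert(out != NULL);

    if (sw == 0.0) {
        if (blead != 0.0)
            return false;   /* e.g. 1/0: shown as "-" per the key */
        *out = 100.0;       /* 0/0 is shown as 100.0 in the tables */
        return true;
    }
    *out = blead / sw * 100.0;
    return true;
}

/* Prints one table row in roughly the layout used above. */
static void print_row(const char *name, double blead, double sw)
{
    double pct;

    if (ratio_percent(blead, sw, &pct))
        printf("%-8s %7.1f %7.1f %7.1f\n", name, blead, sw, pct);
    else
        printf("%-8s %7.1f %7.1f %7s\n", name, blead, sw, "-");
}
```

Calling print_row("COND", 122.0, 112.0) would print the COND row of the ten_digits table, with the ratio rounded to one decimal place by printf.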
The real improvement is somewhat better than these ratios suggest, because
of the overhead of going through API-test. The output of perf on most of the
same data (thanks to Sergey Aleynikov) is:
char* foo = "1256908743";
blead
     5,404,412,751  cycles:u
    22,000,388,916  instructions:u   # 4.07 insn per cycle
     4,400,081,289  branches:u
             5,097  branch-misses:u  # 0.00% of all branches
origin/smoke-me/khw-grok
     5,404,110,883  cycles:u
    20,100,389,643  instructions:u   # 3.72 insn per cycle
     3,200,081,354  branches:u
             4,849  branch-misses:u  # 0.00% of all branches
char* foo = "124578902";
blead
     4,942,080,391  cycles:u
    20,300,388,876  instructions:u   # 4.11 insn per cycle
     4,000,081,249  branches:u
             4,948  branch-misses:u  # 0.00% of all branches
origin/smoke-me/khw-grok
     3,803,232,576  cycles:u
    14,400,389,502  instructions:u   # 3.79 insn per cycle
     1,900,081,213  branches:u
             4,557  branch-misses:u  # 0.00% of all branches
char* foo = "-124578902";
blead
     4,903,991,977  cycles:u
    20,300,388,874  instructions:u   # 4.14 insn per cycle
     4,000,081,247  branches:u
             4,939  branch-misses:u  # 0.00% of all branches
origin/smoke-me/khw-grok
     4,002,571,689  cycles:u
    15,300,389,516  instructions:u   # 3.82 insn per cycle
     2,100,081,227  branches:u
             4,381  branch-misses:u  # 0.00% of all branches
char* foo = "123foo";
blead
     5,103,343,313  cycles:u
    19,500,390,463  instructions:u   # 3.82 insn per cycle
     4,400,081,559  branches:u
             5,429  branch-misses:u  # 0.00% of all branches
origin/smoke-me/khw-grok
     4,704,290,235  cycles:u
    18,600,391,167  instructions:u   # 3.95 insn per cycle
     3,700,081,585  branches:u
            14,056  branch-misses:u  # 0.00% of all branches
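To compare the perf runs above at a glance, two derived figures help: instructions per cycle (the "insn per cycle" perf already prints) and the ratio of old cycles to new cycles. The following is a small sketch of that arithmetic; the struct and function names are made up for illustration and are not part of perf or Perl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record holding the counters from one perf run. */
struct perf_run {
    uint64_t cycles;
    uint64_t instructions;
    uint64_t branches;
};

/* Instructions retired per cycle, as in perf's "insn per cycle" comment. */
static double insn_per_cycle(const struct perf_run *r)
{
    assert(r != NULL && r->cycles != 0);
    return (double)r->instructions / (double)r->cycles;
}

/* Old cycles over new cycles; above 1.0 means the new run took fewer. */
static double cycle_speedup(const struct perf_run *old_run,
                            const struct perf_run *new_run)
{
    assert(old_run != NULL && new_run != NULL && new_run->cycles != 0);
    return (double)old_run->cycles / (double)new_run->cycles;
}
```

Fed the "124578902" counters (4,942,080,391 cycles on blead against 3,803,232,576 on the branch), cycle_speedup comes out at roughly 1.3. The instructions-per-cycle figure drops slightly on the branch in that case even though total cycles fall, because far fewer instructions are executed overall.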