
Benchmarking Pure Perl Trim Functions.

From: demerphq
Date: June 1, 2021 06:59
Subject: Benchmarking Pure Perl Trim Functions.
Message ID: CANgJU+U9npxHCtSH+HzHNYTi=16cL6+tWpEbSVmkDBZV5V=pSw@mail.gmail.com

On Sat, 29 May 2021 at 09:56, demerphq <demerphq@gmail.com> wrote:
>
> On Sat, 29 May 2021 at 00:52, Joseph Brenner <doomvox@gmail.com> wrote:
> >
> > Some quick-and-dirty benchmarking, trimming 100,000 short strings:
...
> > However: I took it very easy on this case using short lines... it's
> > very sensitive to line length (that \g is checking every point in the
> > string)  and it slows down by a factor of ten with lines that are only
> > around 80 chars long.
>
> THIS is the key point here. Run your benchmarks over strings of length
> 1, 10, 100, 1000, and include the examples I posted in another mail:

I put the following together. It benchmarks various techniques for trimming
strings at different string lengths and composition.

$ cat trim.pl
use strict;
use warnings;
use Benchmark qw(cmpthese timethese :hireswallclock);
use Test::More;
my $base= 'naive';
my @extra= qw(separate loop loop_chop loop2);
my @keys= ($base,@extra);
$|++;
printf "%4s %3s %3s %10s|%10s|".("%10s %10s|" x @extra)."\n",
    "reps","ns","sp","len", $base, map { $_ , "as pct" } @extra;
printf "%.4s-%.3s-%.3s-%.10s+%.10s+".("%.10s-%.10s+" x @extra)."\n",
    ("-" x 10) x (5+2*@extra);
foreach my $segments (1,10,100,1000) {
    foreach my $ns_len (1,2,5,10,25,100) {
        foreach my $sp_len (1,2,5,10,25,100) {
            my $descr= "segments: $segments non-space length: $ns_len space length: $sp_len";
            my $string= (" " x $sp_len) . ((("x" x $ns_len) . (" " x $sp_len)) x $segments);
            my ($naive,$separate,$loop,$loop2,$loop_chop);
            #diag "timing $descr\n";
            my $str_len= length $string;
            my $r= timethese -1, {
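                # one substitution with an alternation, applied globally
                naive => sub { $naive= $string; $naive=~s/\A\s+|\s+\z//g; },
                # two anchored substitutions, one per end
                separate => sub { $separate= $string; $separate=~s/\A\s+//; $separate=~s/\s+\z//; },
                # strip trailing whitespace one character at a time
                loop => sub { $loop= $string; $loop=~s/\A\s+//; 1 while $loop=~s/\s\z//; },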
                loop2 => sub {
                    $loop2= $string;
                    $loop2=~s/\A\s+//;
                    1 while
                    $loop2=~s/\s{16}\z//;
                    $loop2=~s/\s{8}\z//;
                    $loop2=~s/\s{4}\z//;
                    $loop2=~s/\s{2}\z//;
                    $loop2=~s/\s{1}\z//;
                },
                loop_chop => sub {
                    $loop_chop= $string;
                    $loop_chop=~s/\A\s+//;
                    chop($loop_chop) while $loop_chop=~m/\s\z/;
                },
            }, "none";
            my @key= ($segments,$ns_len,$sp_len,$str_len);
            my %rps;
            my $max;
            my $max_name;
            foreach my $name (@keys) {
                $rps{$name}= $r->{$name}->iters/$r->{$name}->real;
                if (!defined $max or $max < $rps{$name}) {
                    $max= $rps{$name};
                    $max_name= $name;
                }
            }
            my @data;
            foreach my $name (@keys) {
                my $fmt= $name eq $max_name ? "+" : "";
                push @data, sprintf "%$fmt.1f",$rps{$name};
                push @data, sprintf "%$fmt.1f",$rps{$name}/$rps{naive}*100
                    if $name ne "naive";
            }
            printf "%4d %3d %3d %10d|%10s|". ("%10s %10s|" x @extra) . "\n", @key, @data;
            $ENV{CHECK} and ok(
                $naive &&
                $naive eq $separate &&
                $naive eq $loop &&
                $naive eq $loop2 &&
                $naive eq $loop_chop, "all results the same - $descr"
            );
        }
    }
}
# Test::More needs a plan when CHECK=1 has caused ok() to run.
done_testing() if $ENV{CHECK};

It produces the following report. The left side shows the composition of
the string ("reps" is the number of segments, "ns" the length of each
non-space run, "sp" the length of each space run) and its total length.
The next column is the rate, in iterations per second, for the naive
s/\A\s+|\s+\z//g approach. The following four pairs of columns give the
rate for each other strategy, and that rate as a percentage of the naive
rate. "separate" is the standard recommended "less worse way" of using two
substitutions, s/\A\s+// and s/\s+\z//. The rest are solutions I ginned
up, knowing how the regex engine works, to optimize the separate strategy.
The strategy that is the +best+ for any row is prefixed with a + sign.
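
As a quick sanity check on how to read the "as pct" columns, the snippet
below recomputes one of them by hand. It is only an illustration: the two
rates are copied from the first row of the report, and the expression
mirrors the $rps{$name}/$rps{naive}*100 calculation in trim.pl.

```perl
use strict;
use warnings;

# Rates (iterations per second) copied from the first row of the report.
my %rps = (naive => 1757705.9, separate => 1722645.2);

# Same expression trim.pl uses for the "as pct" columns.
my $pct = sprintf "%.1f", $rps{separate} / $rps{naive} * 100;
print "separate runs at $pct% of the naive rate\n";
```

A value above 100 means the strategy is faster than naive for that row; a
value below 100 means it is slower.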


reps  ns  sp        len|     naive|  separate     as pct|      loop     as pct| loop_chop     as pct|     loop2     as pct|
-----------------------+----------+---------------------+---------------------+---------------------+---------------------+
   1   1   1          3|+1757705.9| 1722645.2       98.0| 1346754.4       76.6| 1460137.6       83.1| 1234795.8       70.3|
   1   1   2          5|+1767954.3| 1727402.3       97.7|  938354.5       53.1| 1069422.3       60.5| 1220935.1       69.1|
   1   1   5         11|+1795476.6| 1728537.8       96.3|  500068.3       27.9|  594498.6       33.1|  914134.5       50.9|
   1   1  10         21|+1802973.3| 1668560.7       92.5|  276854.3       15.4|  344098.8       19.1|  902688.3       50.1|
   1   1  25         51|+1720894.0| 1547221.8       89.9|  119435.9        6.9|  150625.4        8.8|  685213.9       39.8|
   1   1 100        201|+1418034.7| 1302023.4       91.8|   31939.3        2.3|   41388.1        2.9|  340764.2       24.0|
   1   2   1          4| 1575700.7|+1695328.2     +107.6| 1336772.1       84.8| 1467319.5       93.1| 1230853.2       78.1|
   1   2   2          6| 1581974.8|+1705458.9     +107.8|  928325.3       58.7| 1067259.2       67.5| 1058661.8       66.9|
   1   2   5         12| 1624738.2|+1724010.4     +106.1|  499957.4       30.8|  573393.5       35.3|  922924.4       56.8|
   1   2  10         22| 1628322.0|+1645636.7     +101.1|  276198.2       17.0|  342247.1       21.0|  821484.7       50.4|
   1   2  25         52|+1549776.4| 1538065.2       99.2|  119481.1        7.7|  148029.2        9.6|  691068.5       44.6|
   1   2 100        202|+1306857.0| 1301676.6       99.6|   31817.4        2.4|   41587.5        3.2|  323718.2       24.8|
   1   5   1          7| 1185669.6|+1691598.2     +142.7| 1330096.1      112.2| 1452426.8      122.5| 1097320.8       92.5|
   1   5   2          9| 1453834.2|+2041745.4     +140.4| 1111872.3       76.5| 1140493.7       78.4| 1209036.7       83.2|
   1   5   5         15| 1115944.2|+1646570.6     +147.5|  493371.7       44.2|  567321.4       50.8|  824512.2       73.9|
   1   5  10         25| 1206617.8|+1643729.1     +136.2|  279888.0       23.2|  346755.6       28.7|  845376.3       70.1|
   1   5  25         55| 1184689.4|+1510205.0     +127.5|  118686.9       10.0|  149178.5       12.6|  616426.3       52.0|
   1   5 100        205| 1039192.6|+1270575.0     +122.3|   31737.8        3.1|   41495.2        4.0|  316226.2       30.4|
   1  10   1         12|  820120.8|+1639591.4     +199.9| 1318749.6      160.8| 1555006.0      189.6| 1005052.9      122.5|
   1  10   2         14|  851612.2|+1619177.8     +190.1|  927353.2      108.9| 1067711.2      125.4|  979289.7      115.0|
   1  10   5         20|  850770.4|+1573741.5     +185.0|  501548.6       59.0|  596597.4       70.1|  821646.5       96.6|
   1  10  10         30|  861287.6|+1595439.1     +185.2|  275914.5       32.0|  345409.0       40.1|  777486.8       90.3|
   1  10  25         60|  837027.1|+1491381.2     +178.2|  119355.4       14.3|  150298.2       18.0|  586256.7       70.0|
   1  10 100        210|  772620.9|+1251234.9     +161.9|   31937.0        4.1|   41554.4        5.4|  316189.0       40.9|
   1  25   1         27|  458169.3| 1503483.5      328.2| 1298094.3      283.3|+1539391.2     +336.0|  903508.6      197.2|
   1  25   2         29|  458989.2|+1490671.5     +324.8|  915198.1      199.4| 1034142.0      225.3|  861390.4      187.7|
   1  25   5         35|  458773.1|+1471058.8     +320.7|  501442.3      109.3|  587883.6      128.1|  743133.4      162.0|
   1  25  10         45|  462414.3|+1479716.8     +320.0|  283057.4       61.2|  348896.5       75.5|  765866.2      165.6|
   1  25  25         75|  457351.6|+1429907.3     +312.6|  121225.8       26.5|  156807.2       34.3|  588809.1      128.7|
   1  25 100        225|  432971.5|+1175534.2     +271.5|   32177.0        7.4|   41895.5        9.7|  301853.6       69.7|
   1 100   1        102|  144919.5| 1130526.3      780.1| 1288674.2      889.2|+1543481.1    +1065.1|  892586.8      615.9|
   1 100   2        104|  145676.2|+1161293.2     +797.2|  904949.9      621.2| 1039056.1      713.3|  860907.1      591.0|
   1 100   5        110|  144050.6|+1127321.0     +782.6|  493589.1      342.6|  591349.8      410.5|  740179.2      513.8|
   1 100  10        120|  146268.0|+1129848.2     +772.5|  280432.7      191.7|  348159.2      238.0|  765258.2      523.2|
   1 100  25        150|  143834.8|+1056520.3     +734.5|  118337.4       82.3|  155110.3      107.8|  567786.9      394.7|
   1 100 100        300|  143952.8| +922860.2     +641.1|   32202.6       22.4|   41905.6       29.1|  302929.3      210.4|
  10   1   1         21|  528782.4|  893183.8      168.9| 1321008.8      249.8|+1553550.9     +293.8|  913725.5      172.8|
  10   1   2         32|  394258.6|  852738.2      216.3|  913856.5      231.8|+1050503.7     +266.5|  859744.2      218.1|
  10   1   5         65|  215308.3| +787900.3     +365.9|  494882.7      229.8|  603500.5      280.3|  742474.9      344.8|
  10   1  10        120|  124251.0|  700292.8      563.6|  281195.7      226.3|  351778.7      283.1| +734899.3     +591.5|
  10   1  25        285|   45218.1|  519323.5     1148.5|  122100.1      270.0|  155246.9      343.3| +562583.7    +1244.2|
  10   1 100       1110|    9388.1|  223043.9     2375.8|   30162.0      321.3|   38781.0      413.1| +281476.3    +2998.2|
  10   2   1         31|  424109.1|  869318.4      205.0| 1299544.4      306.4|+1533392.1     +361.6|  882680.4      208.1|
  10   2   2         42|  307274.3|  839474.7      273.2|  917371.6      298.6|+1051529.4     +342.2|  863403.8      281.0|
  10   2   5         75|  185983.8| +775564.3     +417.0|  497265.3      267.4|  603835.5      324.7|  743419.4      399.7|
  10   2  10        130|  112609.7|  675451.1      599.8|  278571.4      247.4|  348152.3      309.2| +735167.5     +652.8|
  10   2  25        295|   45285.2|  518512.4     1145.0|  119989.6      265.0|  151824.4      335.3| +579544.6    +1279.8|
  10   2 100       1120|    9341.7|  221246.0     2368.4|   30107.0      322.3|   37328.6      399.6| +283455.4    +3034.3|
  10   5   1         61|  224251.4|  818246.7      364.9| 1298200.5      578.9|+1530159.2     +682.3|  904004.8      403.1|
  10   5   2         72|  196735.0|  790434.4      401.8|  846229.2      430.1|+1031049.9     +524.1|  780884.3      396.9|
  10   5   5        105|  138580.8|  732656.0      528.7|  492956.3      355.7|  601118.7      433.8| +742693.5     +535.9|
  10   5  10        160|   93825.9|  628881.6      670.3|  280007.8      298.4|  354010.7      377.3| +736628.3     +785.1|
  10   5  25        325|   42383.1|  494662.2     1167.1|  117590.2      277.4|  147048.1      346.9| +557730.6    +1315.9|
  10   5 100       1150|    9058.8|  219280.3     2420.6|   28384.1      313.3|   35916.9      396.5| +279448.2    +3084.8|
  10  10   1        111|  137779.9|  720965.4      523.3| 1235600.9      896.8|+1486771.6    +1079.1|  898092.1      651.8|
  10  10   2        122|  123282.4|  709133.9      575.2|  893166.0      724.5| +987447.1     +801.0|  838158.1      679.9|
  10  10   5        155|   96949.7|  661054.1      681.9|  489928.3      505.3|  581166.4      599.5| +716649.8     +739.2|
  10  10  10        210|   72653.1|  605407.5      833.3|  274979.8      378.5|  348327.4      479.4| +731943.1    +1007.4|
  10  10  25        375|   37519.4|  466174.1     1242.5|  117381.8      312.9|  149268.3      397.8| +559221.6    +1490.5|
  10  10 100       1200|    8852.0|  214408.8     2422.2|   23658.6      267.3|   28419.6      321.1| +246862.1    +2788.8|
  10  25   1        261|   58768.2|  553643.6      942.1| 1258583.8     2141.6|+1462758.0    +2489.0|  845871.3     1439.3|
  10  25   2        272|   57826.9|  549103.2      949.6|  849987.3     1469.9|+1003831.8    +1735.9|  815206.9     1409.7|
  10  25   5        305|   51374.5|  517744.5     1007.8|  468698.6      912.3|  553291.0     1077.0| +704154.2    +1370.6|
  10  25  10        360|   43997.3|  488085.7     1109.4|  271530.1      617.2|  335804.3      763.2| +722621.8    +1642.4|
  10  25  25        525|   27927.7|  392185.6     1404.3|  118022.6      422.6|  147582.1      528.4| +565493.7    +2024.8|
  10  25 100       1350|    8153.8|  190852.5     2340.7|   22239.0      272.7|   26769.3      328.3| +233509.2    +2863.8|
  10 100   1       1011|   15695.9|  259241.0     1651.6| 1077068.5     6862.1|+1338756.8    +8529.3|  822100.9     5237.7|
  10 100   2       1022|   15902.9|  252625.0     1588.6|  786361.5     4944.8| +885905.0    +5570.7|  807424.6     5077.2|
  10 100   5       1055|   15313.0|  247028.6     1613.2|  368047.8     2403.5|  422452.3     2758.8| +657416.6    +4293.2|
  10 100  10       1110|   14745.9|  241046.0     1634.7|  204743.0     1388.5|  239400.3     1623.5| +664549.1    +4506.7|
  10 100  25       1275|   12245.3|  215672.4     1761.3|   91137.3      744.3|  101068.3      825.4| +482580.0    +3940.9|
  10 100 100       2100|    5973.3|  140923.5     2359.2|   22359.4      374.3|   25842.7      432.6| +230186.3    +3853.6|
 100   1   1        201|   71160.9|  165570.8      232.7| 1209681.7     1699.9|+1474622.5    +2072.2|  866177.7     1217.2|
 100   1   2        302|   47319.1|  157226.6      332.3|  849791.6     1795.9| +990606.1    +2093.5|  825137.9     1743.8|
 100   1   5        605|   22567.3|  136572.5      605.2|  465726.6     2063.7|  554461.5     2456.9| +699124.2    +3098.0|
 100   1  10       1110|   12241.0|  111043.3      907.1|  200570.1     1638.5|  233323.9     1906.1| +642948.1    +5252.4|
 100   1  25       2625|    4464.0|   72398.5     1621.8|   82920.7     1857.6|   97895.8     2193.0| +457639.0   +10251.8|
 100   1 100      10200|     860.1|   24847.9     2889.0|   18152.2     2110.5|   18367.4     2135.5| +180327.4   +20966.2|
 100   2   1        301|   46815.9|  160696.6      343.3| 1260595.5     2692.7|+1455222.9    +3108.4|  863145.8     1843.7|
 100   2   2        402|   35447.5|  150768.2      425.3|  872033.3     2460.1|+1006330.9    +2838.9|  827931.8     2335.7|
 100   2   5        705|   19587.2|  131168.8      669.7|  464567.6     2371.8|  545684.9     2785.9| +688526.5    +3515.2|
 100   2  10       1210|   11037.0|  108582.0      983.8|  215955.9     1956.7|  240021.9     2174.7| +638698.9    +5786.9|
 100   2  25       2725|    4232.9|   71036.1     1678.2|   82735.3     1954.6|   98670.5     2331.0| +449662.3   +10623.1|
 100   2 100      10300|     851.9|   24643.1     2892.6|   17611.6     2067.2|   18479.8     2169.2| +175928.2   +20650.4|
 100   5   1        601|   25265.2|  144441.8      571.7| 1163740.5     4606.1|+1436733.4    +5686.6|  843670.5     3339.3|
 100   5   2        702|   21464.1|  135706.9      632.3|  833619.4     3883.8| +945707.9    +4406.0|  787369.9     3668.3|
 100   5   5       1005|   14347.0|  119614.7      833.7|  428651.5     2987.7|  523296.4     3647.4| +682096.7    +4754.3|
 100   5  10       1510|    9354.5|   99791.9     1066.8|  201244.3     2151.3|  235072.9     2512.9| +637732.7    +6817.4|
 100   5  25       3025|    3786.6|   67911.3     1793.5|   80066.9     2114.5|   93937.6     2480.8| +437053.6   +11542.1|
 100   5 100      10600|     827.4|   24141.8     2917.7|   17014.4     2056.3|   19307.6     2333.4| +173173.4   +20929.0|
 100  10   1       1101|   14297.8|  120438.9      842.4| 1023704.6     7159.9|+1165436.1    +8151.1|  810992.0     5672.1|
 100  10   2       1202|   13069.0|  115188.5      881.4|  707840.9     5416.2| +813981.7    +6228.4|  763229.5     5840.0|
 100  10   5       1505|   10118.7|  102630.8     1014.3|  356061.3     3518.9|  387553.8     3830.1| +641697.9    +6341.7|
 100  10  10       2010|    7321.0|   89036.4     1216.2|  199950.9     2731.2|  223099.1     3047.4| +633412.2    +8652.0|
 100  10  25       3525|    3549.8|   61546.7     1733.8|   79029.9     2226.3|   90370.9     2545.8| +429170.4   +12090.0|
 100  10 100      11100|     748.7|   23472.5     3134.9|   15912.5     2125.2|   19193.9     2563.5| +170810.7   +22812.9|
 100  25   1       2601|    6134.5|   83747.1     1365.2| 1002307.0    16338.7|+1048943.8   +17099.0|  773842.2    12614.5|
 100  25   2       2702|    6022.5|   80812.2     1341.8|  690373.8    11463.3| +789307.9   +13106.0|  734616.4    12197.9|
 100  25   5       3005|    5290.0|   74144.5     1401.6|  347291.3     6565.1|  388696.0     7347.8| +590149.9   +11156.0|
 100  25  10       3510|    4377.8|   67076.1     1532.2|  186616.1     4262.8|  217718.5     4973.2| +594861.4   +13588.1|
 100  25  25       5025|    2417.7|   49032.7     2028.1|   67341.0     2785.3|   84457.0     3493.3| +404227.5   +16719.4|
 100  25 100      12600|     760.2|   20921.5     2752.0|   14262.1     1876.0|   18742.6     2465.4| +150987.3   +19860.9|
 100 100   1      10101|    1540.9|   31347.3     2034.4|  686419.2    44547.7| +812736.2   +52745.5|  563080.5    36543.2|
 100 100   2      10202|    1599.6|   30255.6     1891.4|  512547.8    32041.4|  568231.7    35522.4| +573178.2   +35831.7|
 100 100   5      10505|    1556.1|   30008.5     1928.4|  246167.4    15819.0|  295721.0    19003.4| +438022.6   +28147.9|
 100 100  10      11010|    1473.0|   28154.9     1911.3|  123781.7     8403.1|  158667.0    10771.4| +422046.3   +28651.3|
 100 100  25      12525|    1214.5|   24826.4     2044.1|   57454.1     4730.6|   53768.6     4427.1| +285759.8   +23528.5|
 100 100 100      20100|     562.8|   14887.9     2645.4|    9575.2     1701.4|   10747.1     1909.6| +112829.4   +20048.4|
1000   1   1       2001|    7569.6|   18967.7      250.6| 1000978.7    13223.6|+1080332.6   +14271.9|  771630.1    10193.7|
1000   1   2       3002|    4826.5|   17037.4      353.0|  666101.8    13800.8| +757023.3   +15684.6|  724266.7    15005.9|
1000   1   5       6005|    2336.2|   15062.9      644.8|  303254.6    12980.7|  334139.1    14302.7| +538101.4   +23033.2|
1000   1  10      11010|    1220.0|   11970.2      981.2|  138316.2    11337.5|  148179.3    12145.9| +408164.4   +33456.3|
1000   1  25      26025|     443.7|    7623.2     1718.3|   34113.0     7689.1|   30913.0     6967.8| +177876.3   +40093.4|
1000   1 100     101100|      84.3|    2468.9     2928.5|    2571.5     3050.1|    2859.8     3392.1|  +13884.3   +16468.7|
1000   2   1       3001|    4905.2|   18267.7      372.4|  993230.1    20248.7|+1029848.9   +20995.2|  759556.4    15484.8|
1000   2   2       4002|    3555.7|   16763.4      471.4|  629670.4    17708.6| +690171.9   +19410.1|  676033.7    19012.5|
1000   2   5       7005|    1991.1|   14572.6      731.9|  301237.7    15129.0|  307670.9    15452.1| +490246.8   +24621.6|
1000   2  10      12010|    1108.0|   11563.9     1043.7|  126427.0    11410.7|  136701.5    12338.0| +414599.8   +37419.7|
1000   2  25      27025|     432.4|    7487.6     1731.8|   33016.2     7636.3|   33746.7     7805.3| +179156.7   +41437.1|
1000   2 100     102100|      83.4|    2461.1     2950.2|    2452.5     2939.8|    2422.0     2903.2|  +13665.8   +16381.2|
1000   5   1       6001|    2582.6|   16092.1      623.1|  834306.9    32304.6| +893998.4   +34615.9|  684801.2    26515.7|
1000   5   2       7002|    2190.9|   14064.5      642.0|  566604.2    25862.0|  605906.5    27655.9| +622503.9   +28413.5|
1000   5   5      10005|    1468.3|   13074.6      890.5|  258037.6    17574.3|  308088.0    20983.1| +444225.3   +30255.1|
1000   5  10      15010|     929.6|   10634.7     1144.0|  114688.9    12337.4|  127073.2    13669.6| +347484.4   +37379.7|
1000   5  25      30025|     402.6|    7036.0     1747.5|   29048.0     7214.5|   30214.3     7504.2| +162313.3   +40313.1|
1000   5 100     105100|      82.3|    2402.8     2919.4|    2360.9     2868.4|    2589.4     3146.1|  +12952.8   +15737.6|
1000  10   1      11001|    1445.0|   13367.3      925.1| +683761.0   +47318.3|  675209.4    46726.5|  562329.9    38914.9|
1000  10   2      12002|    1309.7|   12439.2      949.7|  483623.4    36925.1|  468215.2    35748.7| +493636.6   +37689.6|
1000  10   5      15005|    1020.7|   11027.6     1080.4|  202199.3    19809.6|  213994.4    20965.1| +343494.6   +33652.3|
1000  10  10      20010|     709.5|    8101.9     1142.0|   84429.2    11900.5|   89562.1    12624.0| +298947.9   +42137.5|
1000  10  25      35025|     359.5|    6504.4     1809.3|   23864.3     6638.3|   26191.8     7285.7| +148658.8   +41352.0|
1000  10 100     110100|      77.4|    2339.1     3023.4|    2384.8     3082.5|    2237.6     2892.2|  +12432.5   +16069.4|
1000  25   1      26001|     623.4|    8975.4     1439.7| +414136.0   +66428.3|  392230.0    62914.6|  331792.3    53220.2|
1000  25   2      27002|     602.3|    8387.7     1392.6|  269179.2    44690.8|  273939.6    45481.1| +324980.3   +53955.2|
1000  25   5      30005|     532.8|    7839.3     1471.5|  126420.4    23729.5|  135093.8    25357.5| +221663.1   +41606.8|
1000  25  10      35010|     440.1|    6948.5     1578.8|   63235.6    14367.5|   64159.0    14577.4| +202311.2   +45966.5|
1000  25  25      50025|     271.1|    5203.6     1919.4|   21312.3     7861.3|   21140.1     7797.8| +110040.4   +40589.7|
1000  25 100     125100|      75.1|    2111.2     2811.5|    1952.8     2600.5|    2060.8     2744.4|  +12081.9   +16089.5|
1000 100   1     101001|     162.0|    3203.9     1977.3|  116682.2    72011.6| +125619.4   +77527.2|  118859.9    73355.6|
1000 100   2     102002|     159.8|    3077.3     1925.9|   79104.9    49505.6|   85925.3    53773.9| +120926.8   +75678.6|
1000 100   5     105005|     156.6|    3012.6     1924.3|   40502.8    25871.7|   39483.2    25220.4|  +75288.6   +48091.5|
1000 100  10     110010|     146.2|    2839.9     1942.2|   22170.3    15161.9|   23012.8    15738.1|  +70169.8   +47988.0|
1000 100  25     125025|     122.0|    2513.4     2059.8|    7832.6     6419.0|    8531.2     6991.5|  +44494.1   +36463.8|
1000 100 100     200100|      55.9|    1501.8     2687.2|    1260.1     2254.7|    1274.9     2281.3|   +6722.1   +12028.3|

What you can see is that the naive strategy is only the best strategy when
the string is very short and has few space/non-space sequences. As the
string length rises, the strategies that avoid doing \s+ and instead remove
a fixed number of whitespace characters from the end of the string are
actually faster, even though they have a high overhead. You can see that
the speed of the "loop" strategies is effectively dominated by the number
of spaces at the right, and that they run independently of the length of
the string (loop and loop_chop are heavily dependent on the number of
spaces on the right; loop2 is smarter and reduces that cost significantly).
Both the naive and separate approaches have run time proportional to the
length of the string: for naive, the use of an alternation gives terrible
performance on sequences of space/non-space, and the s/\s+\z// forces the
"separate" function to be run-time proportional to the length as well.

What all this shows is that a properly implemented trim/rtrim function in
XS would be *massively* faster than all of this. It would be able to do a
very low overhead walk from the right, ensuring that the performance is
completely dominated by the number of spaces it needs to remove, which I
guess is as efficient as it can be made to be.
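
To make the "walk from the right" idea concrete, here is a minimal
pure-Perl sketch of the algorithm. It is not from the benchmark above and
not an XS implementation: the name rtrim_walk is hypothetical, and in pure
Perl the per-character op overhead may well make it slower than the regex
strategies. It only illustrates that the work done depends on the number
of trailing whitespace characters, not on the length of the string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of trim.pl): trim trailing whitespace by
# walking backwards from the end of the string.
sub rtrim_walk {
    my ($str) = @_;
    my $end = length $str;
    # Step left while the character just before $end is whitespace.
    $end-- while $end > 0 && substr($str, $end - 1, 1) =~ /\s/;
    # Keep only the first $end characters.
    return substr($str, 0, $end);
}

print "[", rtrim_walk("hello world   \t\n"), "]\n";   # prints "[hello world]"
```

The loop stops at the first non-whitespace character it meets, so interior
spaces are never examined.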

I note that if you consider rtrimming sentences, most of the sentences in
this mail are of a type and structure where most users would benefit from
using the loop2 strategy, not the commonly recommended separate strategy.
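
For reference, the trailing-whitespace part of the loop2 strategy can be
packaged as a small reusable sub. This is only a sketch: the name
rtrim_loop2 and the sample sentence are made up for illustration, while
the substitutions themselves are the ones used in the loop2 benchmark.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the loop2 substitutions from trim.pl.
sub rtrim_loop2 {
    my ($str) = @_;
    # Remove trailing whitespace in chunks of 16 while possible ...
    1 while $str =~ s/\s{16}\z//;
    # ... then mop up the remaining 0..15 characters in at most four steps.
    $str =~ s/\s{8}\z//;
    $str =~ s/\s{4}\z//;
    $str =~ s/\s{2}\z//;
    $str =~ s/\s\z//;
    return $str;
}

my $sentence = "most of the sentences in this mail" . (" " x 37);
print "[", rtrim_loop2($sentence), "]\n";   # prints "[most of the sentences in this mail]"
```

Every pattern has a fixed width and is anchored at \z, which is what lets
the regex engine check only the end of the string instead of scanning it.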

Which just proves the point: most people, even experts like myself and
people on this list, do not implement rtrim efficiently. In fact, it's
quite hard to do so.

cheers,
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
