
Re: Revisiting trim

From: Joseph Brenner
Date: May 28, 2021 22:52
Subject: Re: Revisiting trim
Message ID: CAFfgvXV179+YLmNhuPN9_TY3G5KsvbCY87iJEinUKKc8QiNCmg@mail.gmail.com

Some quick-and-dirty benchmarking, trimming 100,000 short strings:

case 1:
$line =~ s/^\s+//;
$line =~ s/\s+$//;
# real	0m1.427s

case 2:
$line =~ s/^\s*(.+?)\s*$/$1/;
# real	0m1.853s

case 3:
$line =~ s/^\s*|\s*$//g;
# real	0m2.864s

So case 2 is about 30% slower than case 1, and case 3 is about 100%
slower (it takes twice as long).

There's a simple fix that improves case 3 quite a bit:

case 4:
$line =~ s/^\s+|\s+$//g;
# real	0m1.704s

However, I went very easy on this case by using short lines. It's
very sensitive to line length (with /g, the pattern is tried at every
point in the string), and it slows down by a factor of ten with lines
that are only around 80 chars long.
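
For anyone who wants to repeat this, here is a rough sketch using the core
Benchmark module rather than the wall-clock timings quoted above. The test
strings, the hash names (%input, %trim) and the harness are my own
assumptions, not the script that produced the numbers in this message, so
expect different figures. It first checks that the four cases agree, then
times each one on a short line and on a line of about 80 chars.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical inputs: identical padding, different amounts of text between.
my %input = (
    short => "  \t" . join(' ', ('word') x 2)  . "  \n",   #  9 chars of text
    long  => "  \t" . join(' ', ('word') x 16) . "  \n",   # 79 chars of text
);

# The four cases from this message, each working on a copy of its argument.
my %trim = (
    case1 => sub { my $s = shift; $s =~ s/^\s+//; $s =~ s/\s+$//; $s },
    case2 => sub { my $s = shift; $s =~ s/^\s*(.+?)\s*$/$1/;      $s },
    case3 => sub { my $s = shift; $s =~ s/^\s*|\s*$//g;           $s },
    case4 => sub { my $s = shift; $s =~ s/^\s+|\s+$//g;           $s },
);

# Check that all four agree with case 1 before timing anything.
for my $len (sort keys %input) {
    my $want = $trim{case1}->($input{$len});
    for my $name (sort keys %trim) {
        my $got = $trim{$name}->($input{$len});
        die "$name disagrees on $len input: [$got]" unless $got eq $want;
    }
}

# 100,000 trims per case, once for each line length.
for my $len (sort keys %input) {
    print "--- $len lines ---\n";
    my $str = $input{$len};
    my %timed;
    for my $name (keys %trim) {
        my $f = $trim{$name};
        $timed{$name} = sub { $f->($str) };
    }
    cmpthese(100_000, \%timed);
}
```

Comparing the two tables shows how much each case degrades as the line
gets longer. The absolute rates will depend on the machine and the perl
version.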

Anyway, these speed penalties are Not Good, but they're also not
(usually) a reason to care.

Granted, I was exaggerating in calling these hairy and
unreadable, but I think they're all harder to read.

(For example, with "case 3", my first thought was it was
broken and wouldn't strip trailing whitespace if it
had stripped leading whitespace, but then I noticed the /g.
And further, it's using a * instead of a +, so without the /g
it *never* strips trailing space: so there were two things
I didn't understand.)
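
A small sketch of those two points, for anyone who wants to see them
happen. The sample strings and variable names are just ones I made up for
illustration.

```perl
use strict;
use warnings;

# With /g, both alternatives get a chance to match.
(my $with_g = "  hello  ") =~ s/^\s*|\s*$//g;

# Without /g, only the first match is replaced, and ^\s* wins at position 0.
(my $without_g = "  hello  ") =~ s/^\s*|\s*$//;

# With no leading whitespace, ^\s* still matches (the empty string),
# so the trailing alternative is never reached.
(my $star = "hello  ") =~ s/^\s*|\s*$//;

# With + the first alternative fails, so the trailing one gets its turn.
(my $plus = "hello  ") =~ s/^\s+|\s+$//;

printf "star, with /g:    [%s]\n", $with_g;      # [hello]
printf "star, without /g: [%s]\n", $without_g;   # [hello  ]
printf "star, no leading: [%s]\n", $star;        # [hello  ]
printf "plus, no leading: [%s]\n", $plus;        # [hello]
```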

The thing you should ask yourself as a Perl programmer is
"what did I think I would gain from doing this in one
line?".

The key point for the perl5-porters though is that there
is indeed a need for a built-in trim.
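
Until something like that exists, the usual workaround is a small helper
built on case 1. This is only a sketch: the name trim() and its
one-argument interface are my assumptions, not a proposal for what the
built-in should look like.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a trimmed copy and leaves its argument alone.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;   # strip leading whitespace
    $s =~ s/\s+$//;   # strip trailing whitespace
    return $s;
}

my @clean = map { trim($_) } ("  foo ", "\tbar\n", "baz");
print join(",", @clean), "\n";   # foo,bar,baz
```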
