develooper Front page | perl.perl5.porters | Postings from May 2021

Re: Revisiting trim

From: demerphq
Date: May 29, 2021 08:01
Subject: Re: Revisiting trim
Message ID: CANgJU+XBA7hGXE=2OmfboUNVHqZ8oQCngK7Kxd6f5wpuqMB4Kg@mail.gmail.com
On Sat, 29 May 2021 at 09:56, demerphq <demerphq@gmail.com> wrote:
>
> On Sat, 29 May 2021 at 00:52, Joseph Brenner <doomvox@gmail.com> wrote:
> >
> > Some quick-and-dirty benchmarking, trimming 100,000 short strings:
> >
> > case 1:
> > $line =~ s/^\s+//;
> > $line =~ s/\s+$//;
> > # real  0m1.427s
> >
> > case 2:
> > $line =~ s/^\s*(.+?)\s*$/$1/;
> > # real  0m1.853s
> >
> > case 3:
> > $line =~ s/^\s*|\s*$//g;
> > # real  0m2.864s
> >
> > So, case 2 is 30% slower, case 3 is 100% slower.
> >
> > There's a simple fix that improves case 3 quite a bit:
> >
> > case 4:
> > $line =~ s/^\s+|\s+$//g;
> > # real  0m1.704s
> >
> > However: I took it very easy on this case using short lines... it's
> > very sensitive to line length (that /g is checking every point in the
> > string) and it slows down by a factor of ten with lines that are only
> > around 80 chars long.
>
> THIS is the key point here. Run your benchmarks over strings of length
> 1, 10, 100, 1000, and include the examples I posted in another mail:
>
> 1 while $str=~s/\s\z//;
> chop($str) while $str=~m/\s\z/;
>
> Also do it on strings like this:

Here I meant to say: do it on alternating runs of space and non-space,
with both the space and non-space runs getting longer and longer. E.g.:

" ". (((" " x $l1) . ("Q" x $l2)) x $l3) . " ";
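To make the shape of those strings concrete, here is a minimal sketch that wraps the expression above in a helper. The name make_test_string and the sizes in the loop are illustrative assumptions, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): one leading space, then
# $l3 repetitions of ($l1 spaces followed by $l2 "Q"s), then one
# trailing space.
sub make_test_string {
    my ($l1, $l2, $l3) = @_;
    return " " . (((" " x $l1) . ("Q" x $l2)) x $l3) . " ";
}

# Arbitrary sizes, just to show how quickly the strings grow.
for my $n (1, 10, 100) {
    my $str = make_test_string($n, $n, $n);
    printf "l1 = l2 = l3 = %d gives length %d\n", $n, length $str;
}
```

The length is always 2 + ($l1 + $l2) * $l3, so the three parameters let you vary the width of the space runs, the width of the non-space runs, and the number of runs independently.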

You will see that most of the regex versions degrade terribly. I'm on
the wrong computer or I'd post the results, but I bet my hacks above
beat them all, if not hands down then at least once the string gets
over a certain size.
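For anyone who wants to try this, a rough sketch of such a comparison using the core Benchmark module might look like the following. It only compares trailing-whitespace trimming, since the two loop hacks quoted above only touch the end of the string. The labels, the chosen sizes, and the half-second time budget are all assumptions, and no results are claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Trailing-whitespace trimmers only. Each one works on a copy and
# returns it, so the benchmarked string is never modified.
my %trimmers = (
    regex_dollar => sub { my $s = shift; $s =~ s/\s+$//;              $s },
    loop_subst   => sub { my $s = shift; 1 while $s =~ s/\s\z//;      $s },
    loop_chop    => sub { my $s = shift; chop($s) while $s =~ m/\s\z/; $s },
);

my $cpu_secs = 0.5;    # assumed budget per candidate; raise for steadier numbers

# Strings with the same shape as the expression in this message.
for my $n (1, 10, 100) {
    my $str = " " . (((" " x $n) . ("Q" x $n)) x $n) . " ";
    print "\nl1 = l2 = l3 = $n (length ", length($str), ")\n";
    cmpthese(
        -$cpu_secs,
        { map { my $f = $trimmers{$_}; ($_ => sub { $f->($str) }) } keys %trimmers },
    );
}
```

Copying the string inside each candidate adds the same overhead to all of them, which keeps the comparison fair but does flatten the ratios a little on long strings.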

Yves
