Re: Revisiting trim

From: André Warnier (tomcat/perl)
Date: May 28, 2021 21:05
Subject: Re: Revisiting trim
Message ID: e24771c7-8e80-8c54-8109-cb60ce459832@ice-sa.com

On 28.05.2021 22:33, Alberto Simões wrote:
> 
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Friday, May 28th, 2021 at 21:30, David Nicol <davidnicol@gmail.com> wrote:
> 
>>
>>
>> On Fri, May 28, 2021 at 11:25 AM Joseph Brenner <doomvox@gmail.com> wrote:
>>
>>     André Warnier (tomcat/perl) <aw@ice-sa.com> wrote:
>>
>>     > $stripped_line =~ s/^\s+//; $stripped_line =~ /\s+$//; # or only one of those, depends
>>
>>     > Is /that/ the worst possible way ? or if not *the* worst, was there a better way
>>     all along ? (*)
>>
>>     That's a very reasonable way of doing it which may very well be the
>>     best way (though you dropped an "s" on the second "s///").
>>
>>     They were probably referring to a tendency of many programmers to
>>     obsess with trimming the left and right with a single s/// operation,
>>     which will result in a hairy, unreadable solution that won't perform
>>     any better than just doing it in two steps.
>>
>>
>> Is this really slower? Is this really hairier and less readable than the two-step approach?
>>
>>      $reference_identifier =~ s/^\s*(.+?)\s*$/$1/;  # how I usually full-trim a reference identifier
>>
> 
> Probably still slower, but usually I write   $foo =~ s/^\s*|\s*$//g;
> 
> 
These will probably give a field day to whoever previously wrote about "the worst way 
possible" .. :-)
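
Speed aside, the three variants quoted above do not behave identically on every input. 
The following is a minimal sketch to compare them; the sub names (trim_two_step, 
trim_capture, trim_alternation) and the sample inputs are made up for illustration, and 
the two-step version has the missing "s" restored.

```perl
use strict;
use warnings;

# Two-step trim (with the "s" restored on the second substitution).
sub trim_two_step {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

# One substitution using a capture group.
sub trim_capture {
    my ($s) = @_;
    $s =~ s/^\s*(.+?)\s*$/$1/;
    return $s;
}

# One substitution using an alternation and /g.
sub trim_alternation {
    my ($s) = @_;
    $s =~ s/^\s*|\s*$//g;
    return $s;
}

# Hypothetical sample inputs: ordinary padding, whitespace only, embedded newline.
my @inputs = ("  foo bar  ", "   ", "  two\nlines  ");

for my $input (@inputs) {
    for my $variant (\&trim_two_step, \&trim_capture, \&trim_alternation) {
        my $result = $variant->($input);
        $result =~ s/\n/\\n/g;    # make newlines visible in the output
        print "[$result]\n";
    }
    print "\n";
}
```

On the whitespace-only input, the capture variant leaves one space behind, because 
(.+?) has to match at least one character. On the input with an embedded newline it 
leaves the string untouched, because "." does not match a newline without /s. The other 
two variants trim both inputs as expected.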

The Perl regex engine is a wonderful thing, and my interpretation may be wrong, but I 
would tend to intuit that if you use captures (as in the first above), alternatives (as 
in the second) or \s* (which can match "nothing or something", in both), it is bound to be 
somewhat less efficient than if you give the regex engine something definite to look for, 
like "^\s+" and "\s+$" (although the latter, if the target is UTF-8 and the engine works 
backward from the end, may be quite hairy too).
But whether that compensates for one substitution instead of two, I don't have a clue.
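
Rather than guess, one could measure. Here is a rough sketch using the core Benchmark 
module; the padded test string and the labels are assumptions chosen for illustration, 
and the numbers will depend on the perl version, the input length and the amount of 
whitespace, so I am not quoting any results.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test input: 20 spaces on each side of a short identifier.
my $padded = (" " x 20) . "some reference identifier" . (" " x 20);

# Each variant works on a fresh copy so every iteration does the same work.
my %variants = (
    two_step    => sub { my $s = $padded; $s =~ s/^\s+//; $s =~ s/\s+$//; $s },
    capture     => sub { my $s = $padded; $s =~ s/^\s*(.+?)\s*$/$1/;      $s },
    alternation => sub { my $s = $padded; $s =~ s/^\s*|\s*$//g;           $s },
);

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, \%variants);
```

A single input says little, of course; strings with no padding at all, or with very long 
runs of inner whitespace, may well rank the variants differently.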

Anyway, it kind of makes the case for optimal (l|r|)trimmed() functions, to help us all 
poor mere perl programmers.
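
As a stopgap in plain Perl, such functions could look something like the sketch below. 
The names ltrimmed, rtrimmed and trimmed simply follow the "(l|r|)trimmed()" spelling 
above and are hypothetical; this is not a proposed core implementation, and it makes no 
claim to being optimal.

```perl
use strict;
use warnings;

# Each function returns a trimmed copy and leaves its argument untouched.
sub ltrimmed { my ($s) = @_; $s =~ s/^\s+//; return $s; }
sub rtrimmed { my ($s) = @_; $s =~ s/\s+$//; return $s; }
sub trimmed  { my ($s) = @_; return rtrimmed(ltrimmed($s)); }

my $line          = "   some text \t";
my $stripped_line = trimmed($line);
print "[$stripped_line]\n";    # prints "[some text]"
```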

I'm interested in a guru comment though.

