perl.perl5.porters
Postings from May 2021
Re: Revisiting trim
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, May 28, 2021 11:25 AM, Joseph Brenner <firstname.lastname@example.org> wrote:
> André Warnier (tomcat/perl) email@example.com wrote:
> > $stripped_line =~ s/^\s+//; $stripped_line =~ /\s+$//; # or only one of those, depends
> > Is /that/ the worst possible way ? or if not the worst, was there a better way all along ? (*)
> That's a very reasonable way of doing it which may very well be the
> best way (though you dropped an "s" on the second "s///").
> They were probably referring to a tendency of many programmers to
> obsess with trimming the left and right with a single s/// operation,
> which will result in a hairy, unreadable solution that won't perform
> any better than just doing it in two steps.
This is a good and generally applicable point for a lot of things; it strikes at the heart of "premature optimization is the root of all evil*"...
* except for 3% of the time when it's a trivial optimization
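For concreteness, here is a minimal sketch of the two approaches being compared. The sub names trim_two_step and trim_one_step are made up for illustration; neither is an existing or proposed builtin.

```perl
use strict;
use warnings;

# Two-step trim: each substitution is anchored at one end of the string.
# (trim_two_step is a hypothetical name, used only for this example.)
sub trim_two_step {
    my ($s) = @_;
    $s =~ s/^\s+//;    # strip leading whitespace
    $s =~ s/\s+$//;    # strip trailing whitespace
    return $s;
}

# Single-substitution variant, shown only for comparison.
# The alternation plus /g handles both ends in one s/// call.
sub trim_one_step {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

my $two = trim_two_step("  hello world \n");
my $one = trim_one_step("  hello world \n");
```

Both subs work on a copy and return it, so the caller's string is untouched. The two-step form keeps each pattern trivial to read, which is the point made above.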
> I've no strong feelings on the "trim" discussion, but I think you
> argue well that the "rtrim" case is pretty common.
> I think tchrist probably has a point about the clarity of "trimmed",
> but I suspect if it'd been up to Larry Wall, he'd have gone with the
> shortest form. For some reason "trim", "trim('R')" and "trim('L')"
> seem perlish to me (though I gather "parameterization" is supposed to
> be off the table at this point, so an R/L argument would be
> controversial, too).
The length of a function's name should be inversely related to its frequency of use: the more often it is used, the shorter the name. This is the "Huffman coding" aspect of "WWLD" (What Would Larry Do?).
Related to this discussion, something that might not have been brought up; just for more context and information:
* 2 chars - uc [used primarily to normalize input from what I've seen]
* 7 chars - ucfirst [pretty sure I have *never* used this on purpose]
* 2 chars - lc [used same way generally as uc]
It's worth noting that they return the modified value and are non-destructive. But since 'trim' has most often been couched in terms of 'chomp', that is what's defining that whole part of the discussion.
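A small sketch of the difference, in case it helps: uc returns a new string and leaves its argument alone, while chomp modifies its argument in place and returns a count. The sub trimmed is a hypothetical helper written in the uc/lc style, purely for illustration; it is not the proposed builtin.

```perl
use strict;
use warnings;

my $input = "Hello World\n";

# uc (like lc and ucfirst) returns a new string; $input is left untouched.
my $upper = uc $input;

# chomp edits its argument in place and returns the number of
# characters it removed, not the modified string.
my $line    = $input;
my $removed = chomp $line;

# Hypothetical non-destructive trim in the style of uc/lc:
# it works on a copy and returns the result.
sub trimmed {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

my $padded = "  padded  ";
my $clean  = trimmed($padded);
```

Which of these two models 'trim' follows decides whether something like "my $clean = trim($padded);" does what a reader expects.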
> I see that in Raku, the routines are called "trim", "trim-leading" and
> "trim-trailing". (None of these trim in-place, to do that you'd use
> this idiom: "$line.=trim;").
> My apologies if it seems like we're re-opening old discussions at this
> point, but it's a problem in these debates that there's no easy way to
> review what's already been talked to death.
This horse is not dead. For me the most important aspect, as I've stated, is the precedent this can set (for good or ill) regarding, but not limited to:
* a coherent and consistent strategy for DWIM string functions (which has been recognized by the PRC, tyvm)
* the question of *where* to put things (core vs CPAN/dual-life, namespaces, etc)
* and refining how "features" or "experiments" are handled wrt, among other things, backward and future compatibility (also seems to have been recognized by the PRC; again tyvm)
So this is not about 'trim'; it truly is about what comes after. And since we have this opportunity now to take a step back, it's worth discussing. The issue of trim being efficacious is a part of this discussion, but not the "real" discussion IMO.