Re: Revisiting trim
May 28, 2021 19:31
Message ID: firstname.lastname@example.org
On 28.05.2021 18:31, Dan Book wrote:
> My two cents on the parameterized trims:
> 1) trim-right and trim-left are certainly reasonable use cases, *however* they are not as
> common a need across CPAN and general code.
That's one way of looking at it.
I understand that you need a criterion to estimate the usefulness and/or appeal of a
proposed new keyword/function in the language. But maybe counting how often it appears in a
(even large) set of code does not always tell the whole story?
Another way would be to wonder at how often such code might be *executed*.
As a trivial and circumstantial example, if I may:
Earlier this week I exported an SQL Server table of 157 million rows at 25 columns per
row, initially as a 14 GB CSV file. For reasons I shall not get into here, all the columns
came out as fixed-length, with values right-padded with spaces. The ultimate goal was to
convert this to JSON, so to avoid a lot of unnecessary volume (JSON is already a lot more
verbose than CSV), I chose to individually right-trim every column in every CSV line
first. The program thus ran "s/\s+$//" 157 M x 25 = 3,925,000,000 times.
However, the "s/\s+$//" expression appears only once in the source of the program.
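
For illustration only, here is a minimal sketch of what such a per-column right-trim might look like. The actual program is not shown in this message, so the helper name rtrim_fields and the naive comma split (no quoted fields, no embedded commas) are assumptions; real CSV data would more likely go through a proper parser such as Text::CSV.

```perl
use strict;
use warnings;

# Hypothetical helper: right-trim every field of one CSV line.
# Assumes a naive comma-separated format without quoting.
sub rtrim_fields {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line, -1;   # -1 keeps trailing empty fields
    s/\s+$// for @fields;                # appears once, runs once per field
    return @fields;
}

my @fields = rtrim_fields("alice     ,42   ,     \n");
print join('|', @fields), "\n";          # prints "alice|42|"
```

The substitution is written once but executes for every field of every line, which is the point about source frequency versus execution frequency.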
Understand that I am certainly not complaining about the efficiency of perl and
"s/\s+$//". They both did their job perfectly, and pretty fast too (close to the time it
took to just read that file with "wc -l", and much less time than it took to export the
CSV file in the first place).
But if a dedicated rtrimmed() function, in addition to being slightly more elegant, also
happened to be 25% faster than the regex above, I wouldn't say no to it. I might even
write a perl program to look into all our data-intensive programs and flag all its