Excellent research, Mr. Bullock!

You bring up a very good point, one that Grinnz came across
<https://www.reddit.com/r/perl/comments/hf3jlx/announcing_perl_7/fvwp1zt/?utm_source=reddit&utm_medium=web2x&context=3>
when we initially started implementing trim(). One of the main reasons
we want it done in core is that it's implemented so many times in
other places, and *often* implemented *incorrectly*. By putting it in
core, we can implement it correctly and stop developers from having to
reinvent the wheel.

In your research, did you find out whether people implementing trim()
as a sub do it in-place or as a return value? That seems to be hotly
debated right now.

- Scott

On 3/31/21 6:01 AM, Ben Bullock wrote:
> On Wed, 31 Mar 2021 at 08:59, <neilb@neilb.org> wrote:
>
>> Trimming is something that is frequently wanted, but though you
>> think it a no-brainer, people don’t always get it right. I found
>> about 7500 distributions with an "inline trim". Here are some of the
>> ones I found:
>>
>>     s/(^\s+)|(\s+$)//;
>>     s/(^\s+)|\n//gm;
>>     s/(^\s+|\s+$)//g;
>>     s/(^\s+)|(\s+$)//g;
>>     s/(^\s+|\s+$)//os;
>>     s/(^\s+|\s+$)//gs;
>>     s/^\s*//; s/\s+$//;
>>
>> Not all of those work.
> There are any number of "gotcha" failures using regex trim on CPAN and
> even within Perl core modules. Further, at least two core module
> authors have duplicated "trim".
>
> A search for "trim string" on metacpan.org finds
> https://metacpan.org/pod/POOF:
>
>     # trim leading and trailing white spaces
>     $val =~ s/^\s*|\s*$//;
>
> This substitution will return true even if it matches nothing due to
> the asterisk, and it will fail to remove trailing whitespace if there
> is also leading whitespace due to the lack of a /g flag.
>
> $ perl -e 'my $g=" x ";$g=~s/^\s+|\s+$//;print "!$g!\n";'
> !x !
>
> We can find many more examples of the "omitted /g" error on CPAN:
>
> https://grep.metacpan.org/search?q=%5CQs%2F%5E%5Cs%2B%7C%5Cs%2B%24%2F%2F%5CE%5B%5Eg%5D*%24&qd=&qft=
>
> Using * instead of + after \s causes the substitution to always return
> a true value even if nothing changed. This is also fairly common:
>
> https://grep.metacpan.org/search?q=%5CQs%2F%5E%5Cs*%7C%5Cs*%24%2F%2F&qd=&qft=
>
> It says "80 distributions". I looked through all of them but I didn't
> find anywhere where the return value of the substitution was being
> used, perhaps because that bug would have been caught quickly, except
> for here:
>
> https://grep.metacpan.org/search?qci=&q=%5CQs%2F%5E%5Cs*%7C%5Cs*%24%2F%2F&qft=&qd=CohortExplorer
>
> where the programmer seems actually to be using the fact that it
> always returns a true value.
>
> Furthermore, there are several examples in Perl core modules.
>
> Mistaken use of the /s flag (make . match \n) to mean /m (make ^ and $
> match new lines) is seen in such modules as Pod::Simple, CPAN::Module,
> Net::SMTP, Pod::Checker, Locale::Maketext, and I18N::LangTags.
>
> Mistaken use of s/^\s*// for trimming is seen in core modules like
> Win32, bigint.pm, and CPAN::Complete.
>
> I also found one example of s/^\s+|\s+$// (omitted /g flag means it
> fails to remove the end space from " this ") in the core modules, in
> ExtUtils::CBuilder::Platform::Windows:
>
>     map {$a=$_;$a=~s/\t/ /g;$a=~s/^\s+|\s+$//;$a}
>
> Individual core modules which implement their own "trim" function
> include ExtUtils::ParseXS (trim_whitespace) and TAP::Parser (_trim).
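
For reference, the two pitfalls described in the quoted message (the
omitted /g flag, and \s* always reporting success) can be reproduced
with a short script. This is only a minimal sketch: the helper name
trim_copy is hypothetical, it is not the proposed core trim(), and
returning a copy rather than trimming in place is an arbitrary choice
made for the example, not a position on that debate.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, not the core API):
# returns a trimmed copy and leaves the caller's string untouched.
sub trim_copy {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;    # /g lets both alternatives match in one pass
    return $s;
}

# Pitfall 1: without /g the substitution stops after the first match,
# so only the leading whitespace is removed.
my $buggy = " x ";
$buggy =~ s/^\s+|\s+$//;
print "without /g: !$buggy!\n";    # prints: without /g: !x !

my $fixed = trim_copy(" x ");
print "with /g:    !$fixed!\n";    # prints: with /g:    !x!

# Pitfall 2: \s* can match the empty string, so the substitution
# reports success even when there was nothing to trim.
my $clean = "x";
my $star  = ( $clean =~ s/^\s*|\s*$// ) ? 1 : 0;    # 1: "matched" nothing
my $plus  = ( $clean =~ s/^\s+|\s+$// ) ? 1 : 0;    # 0: nothing to trim
print "star=$star plus=$plus\n";                    # prints: star=1 plus=0
```

Code that only cares about the resulting string never notices the
second pitfall; it matters only when the return value of s/// is
tested.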