On Wed, 31 Mar 2021 at 08:59, <neilb@neilb.org> wrote:

> Trimming is something that is frequently wanted, but though you
> think it a no-brainer, people don’t always get it right. I found
> about 7500 distributions with an "inline trim". Here are some of the
> ones I found:
>
>     s/(^\s+)|(\s+$)//;
>     s/(^\s+)|\n//gm;
>     s/(^\s+|\s+$)//g;
>     s/(^\s+)|(\s+$)//g;
>     s/(^\s+|\s+$)//os;
>     s/(^\s+|\s+$)//gs;
>     s/^\s*//; s/\s+$//;
>
> Not all of those work.

There are any number of "gotcha" failures using regex trim on CPAN and
even within Perl core modules. Further, at least two core module
authors have duplicated "trim".

A search for "trim string" on metacpan.org finds
https://metacpan.org/pod/POOF:

    # trim leading and trailing white spaces
    $val =~ s/^\s*|\s*$//;

This substitution will return true even if it matches nothing due to
the asterisk, and it will fail to remove trailing whitespace if there
is also leading whitespace due to the lack of a /g flag.

    $ perl -e 'my $g=" x ";$g=~s/^\s+|\s+$//;print "!$g!\n";'
    !x !

We can find many more examples of the "omitted /g" error on CPAN:

https://grep.metacpan.org/search?q=%5CQs%2F%5E%5Cs%2B%7C%5Cs%2B%24%2F%2F%5CE%5B%5Eg%5D*%24&qd=&qft=

Using * instead of + after \s causes the substitution to always return
a true value even if nothing changed. This is also fairly common:

https://grep.metacpan.org/search?q=%5CQs%2F%5E%5Cs*%7C%5Cs*%24%2F%2F&qd=&qft=

It says "80 distributions". I looked through all of them but I didn't
find anywhere where the return value of the substitution was being
used, perhaps because that bug would have been caught quickly, except
for here:

https://grep.metacpan.org/search?qci=&q=%5CQs%2F%5E%5Cs*%7C%5Cs*%24%2F%2F&qft=&qd=CohortExplorer

where the programmer seems actually to be using the fact that it
always returns a true value.

Furthermore, there are several examples in Perl core modules. Mistaken
use of the /s flag (make .
match \n) to mean /m (make ^ and $ match new lines) is seen in such
modules as Pod::Simple, CPAN::Module, Net::SMTP, Pod::Checker,
Locale::Maketext, and I18N::LangTags. Mistaken use of s/^\s*// for
trimming is seen in core modules like Win32, bigint.pm, and
CPAN::Complete.

I also found one example of s/^\s+|\s+$// (omitted /g flag means it
fails to remove the end space from " this ") in the core modules, in
ExtUtils::CBuilder::Platform::Windows:

    map {$a=$_;$a=~s/\t/ /g;$a=~s/^\s+|\s+$//;$a}

Individual core modules which implement their own "trim" function
include ExtUtils::ParseXS (trim_whitespace) and TAP::Parser (_trim).
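
For anyone who wants to reproduce the three failure modes described
above (omitted /g, * instead of +, and /s used where /m was meant),
here is a minimal sketch. The trim() and show() helpers, the variable
names and the test strings are all hypothetical, made up for
illustration; none of them is taken from the modules named above.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: two anchored substitutions,
# so neither alternation nor /g is needed.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;    # leading whitespace
    $s =~ s/\s+$//;    # trailing whitespace
    return $s;
}

# Hypothetical helper: make newlines visible when printing.
sub show {
    my ($s) = @_;
    $s =~ s/\n/\\n/g;
    return "!$s!";
}

# 1. Omitted /g: the substitution stops after the first match.
(my $no_g   = " x ") =~ s/^\s+|\s+$//;
(my $with_g = " x ") =~ s/^\s+|\s+$//g;

# 2. * instead of +: ^\s* matches the empty string at position 0, so the
#    substitution reports success on a string with nothing to trim, and
#    without /g it never gets as far as the trailing whitespace.
my $clean    = "x";
my $star_ret = ($clean =~ s/^\s*|\s*$//);     # true
my $plus_ret = ($clean =~ s/^\s+|\s+$//g);    # false
(my $star = "x ") =~ s/^\s*|\s*$//;

# 3. /s versus /m: /s only changes what . matches, and this pattern has
#    no dot, so /gs trims the string as a whole rather than each line.
my $lines = "  a  \n  b  \n";
(my $gs = $lines) =~ s/^\s+|\s+$//gs;
(my $gm = $lines) =~ s/^\s+|\s+$//gm;

print "no /g:       ", show($no_g),   "\n";    # !x !
print "with /g:     ", show($with_g), "\n";    # !x!
print "* on 'x ':   ", show($star),   "\n";    # !x !
print "* returns:   ", ($star_ret ? "true" : "false"), "\n";    # true
print "+ returns:   ", ($plus_ret ? "true" : "false"), "\n";    # false
print "/gs:         ", show($gs), "\n";        # !a  \n  b!
print "/gm:         ", show($gm), "\n";        # !a\nb!
print "trim():      ", show(trim(" this ")), "\n";    # !this!
```

Note that under /gm the final \s+$ also swallows the last newline,
which may or may not be what a per-line trim wants. The two-statement
form in trim() sidesteps the alternation and /g questions entirely.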