perl.perl5.porters | Postings from September 2003

Re: optimizing /\s*,\s*/

H.Merijn Brand
September 8, 2003 11:03
On Mon 08 Sep 2003 19:36, "Jeff 'japhy' Pinyan" wrote:
> A lot of times, when people want to split a string into comma-separated
> fields, they use something like
>   @fields = split /\s*,\s*/, $string;

Yes, I do a lot of split m/\s*\|\s*/, $_, -1
What's naive? It's probably an 'Ahh, sure. Darn, I forgot.' moment, but enlighten me.

> Yes, naive, whatever, that's not the point.  The point is that the regex
> engine matches \s*, and then looks for it to be followed by a comma.
> Could the engine be optimized to search FIRST for the NON-OPTIONAL comma,
> and then match all immediately preceding whitespace?  That is, on a string
> like "abc  def , ghi,...", the engine would first find the , and then
> subtract one from the beginning index of the match while the preceding
> character is whitespace?
> I'm not sure I know enough to implement this, but I'd think there'd be an
> improvement, especially in cases where the optional piece (\s*) is found
> frequently in the string.
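
For illustration only, here is a rough pure-Perl sketch of the "find the
comma first, then back up over the whitespace" idea described above. It is
not how the regex engine works and not a patch to it; split_comma_first is
a hypothetical helper name, and the sketch assumes the same field
boundaries as split /\s*,\s*/, $string, -1 on a non-empty string.

```perl
use strict;
use warnings;

# Hypothetical helper: locate each comma with index(), then trim the
# whitespace on either side of it by hand.
sub split_comma_first {
    my ($string) = @_;
    my @fields;
    my $start = 0;    # offset where the current field begins

    while ((my $comma = index($string, ',', $start)) >= 0) {
        # Back up from the comma while the preceding character is whitespace.
        my $end = $comma;
        $end-- while $end > $start && substr($string, $end - 1, 1) =~ /\s/;
        push @fields, substr($string, $start, $end - $start);

        # Skip any whitespace that follows the comma.
        $start = $comma + 1;
        $start++ while $start < length($string)
                    && substr($string, $start, 1) =~ /\s/;
    }

    # Whatever is left after the last comma is the final field.
    push @fields, substr($string, $start);
    return @fields;
}

my @fields = split_comma_first("abc  def , ghi,jkl");
print join('|', @fields), "\n";    # prints "abc  def|ghi|jkl"
```

This only shows the order of operations being proposed (required comma
first, optional whitespace second); it says nothing about whether it would
be faster inside the engine.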

H.Merijn Brand        Amsterdam Perl Mongers (
using perl-5.6.1, 5.8.0, & 5.9.x, and 806 on  HP-UX 10.20 & 11.00, 11i,
   AIX 4.3, SuSE 8.2, and Win2k. 
send smoke reports to:, QA:
