
optimizing /\s*,\s*/

From: Jeff 'japhy' Pinyan
Date: September 8, 2003 10:37
Subject: optimizing /\s*,\s*/
Message ID: Pine.LNX.4.44.0309081333400.4482-100000@perlmonk.org

A lot of times, when people want to split a string into comma-separated
fields, they use something like

  @fields = split /\s*,\s*/, $string;

Yes, naive, whatever, that's not the point.  The point is that the regex
engine first matches \s*, and only then checks whether it is followed by a
comma.  Could the engine be optimized to search FIRST for the NON-OPTIONAL
comma, and then match all immediately preceding whitespace?  That is, on a
string like "abc  def , ghi,...", could the engine first find the comma and
then keep subtracting one from the beginning index of the match for as long
as the preceding character is whitespace?
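
To make the idea concrete, here is a rough sketch of the same strategy done
by hand in plain Perl.  This is only an illustration of the proposed match
order, not how the regex engine works or would implement it, and
split_comma_first is a made-up name.  It assumes the goal is to get the same
fields back as split /\s*,\s*/ (including split's habit of dropping trailing
empty fields).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl): find the mandatory comma first,
# then widen the match over the whitespace on either side of it.
sub split_comma_first {
    my ($string) = @_;
    my @fields;
    my $pos = 0;                  # start of the current field
    my $len = length $string;

    while ((my $comma = index($string, ',', $pos)) >= 0) {
        # Back up from the comma while the preceding character is
        # whitespace, never moving past the start of the current field.
        my $start = $comma;
        $start-- while $start > $pos
                    && substr($string, $start - 1, 1) =~ /\s/;

        # Extend forward over whitespace that follows the comma.
        my $end = $comma + 1;
        $end++ while $end < $len && substr($string, $end, 1) =~ /\s/;

        push @fields, substr($string, $pos, $start - $pos);
        $pos = $end;
    }
    push @fields, substr($string, $pos);

    # Mimic split's default behaviour of dropping trailing empty fields.
    pop @fields while @fields && $fields[-1] eq '';
    return @fields;
}

print join('|', split_comma_first("abc  def , ghi,jkl")), "\n";
# prints: abc  def|ghi|jkl
```

Note that the run of spaces inside "abc  def" is never examined at all here,
because the scan only looks at whitespace that touches a comma.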

I'm not sure I know enough to implement this, but I'd think there'd be a
speed improvement, especially in cases where the optional piece (\s*)
matches frequently in the string without a comma after it.
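
One way to check whether stray whitespace actually costs anything on a given
perl is to time the same split on two strings that have the same length and
the same number of commas, but differ in how much whitespace sits away from a
comma.  This is a minimal sketch using the core Benchmark module; the test
strings, the iteration count, and the name count_fields are arbitrary choices
made for illustration, and the numbers will vary by perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

my $n = 500;

# Both strings have 500 fields of 15 characters each.  Only the amount of
# whitespace that is NOT next to a comma differs.
my $stray = join ',', ('a b c d e f g h') x $n;   # 7 stray spaces per field
my $plain = join ',', ('abcdefghijklmno') x $n;   # no whitespace at all

# Hypothetical helper: run the split from the post and count the fields.
sub count_fields {
    my @fields = split /\s*,\s*/, $_[0];
    return scalar @fields;
}

timethese(2000, {
    stray_whitespace => sub { count_fields($stray) },
    no_whitespace    => sub { count_fields($plain) },
});
```

If the two timings come out about the same, the engine is presumably already
skipping ahead to the comma in some way; a large gap would suggest the \s*
attempts are where the time goes.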

-- 
Jeff "japhy" Pinyan      japhy@pobox.com      http://www.pobox.com/~japhy/
RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
<stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
[  I'm looking for programming work.  If you like my work, let me know.  ]

