develooper Front page | perl.perl5.porters | Postings from September 2003

Re: optimizing /\s*,\s*/

From: H.Merijn Brand
Date: September 8, 2003 11:03
Subject: Re: optimizing /\s*,\s*/
Message ID: 20030908200211.AD7A.H.M.BRAND@hccnet.nl

On Mon 08 Sep 2003 19:36, "Jeff 'japhy' Pinyan" <japhy@perlmonk.org> wrote:
> A lot of times, when people want to split a string into comma-separated
> fields, they use something like
> 
>   @fields = split /\s*,\s*/, $string;

Yes, I do a lot of

  split m/\s*\|\s*/, $_, -1;

What's naive about that? It's probably an 'Ahh, sure. Darn, I forgot.'
thing, but enlighten me.
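
For anyone following along, here is a minimal sketch of what the -1 limit
changes in a split like the one above. The sample record in $line is made
up for illustration; only the pattern and the -1 come from this message.

```perl
use strict;
use warnings;

# Hypothetical sample record; the trailing "||" leaves two empty fields
# at the end of the line.
my $line = "alpha | beta|gamma ||";

# Without a limit, split strips trailing empty fields.
my @default = split m/\s*\|\s*/, $line;

# With a limit of -1, trailing empty fields are kept.
my @keep_all = split m/\s*\|\s*/, $line, -1;

# Here @default holds 3 fields and @keep_all holds 5.
printf "default: %d fields, with -1: %d fields\n",
    scalar(@default), scalar(@keep_all);
```

The whitespace around each separator is consumed by the pattern either way;
the limit only decides whether the empty fields at the end survive.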

> Yes, naive, whatever, that's not the point.  The point is that the regex
> engine matches \s*, and then looks for it to be followed by a comma.
> Could the engine be optimized to search FIRST for the NON-OPTIONAL comma,
> and then match all immediately preceding whitespace?  That is, on a string
> like "abc  def , ghi,...", the engine would first find the , and then
> subtract one from the beginning index of the match while the preceding
> character is whitespace?
> 
> I'm not sure I know enough to implement this, but I'd think there'd be an
> improvement, especially in cases where the optional piece (\s*) is found
> frequently in the string.
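
To make the proposed matching order concrete, here is a rough userland
sketch of it in plain Perl: locate the mandatory comma first, then widen
the match backwards and forwards over adjacent whitespace. It only
illustrates the order of operations, it says nothing about how the regex
engine works internally or whether this would be faster there. The name
split_comma_first is made up for this example and is not part of perl.

```perl
use strict;
use warnings;

# Hypothetical helper: emulates "find the comma first, then extend the
# match over neighbouring whitespace". Intended to return the same fields
# as split /\s*,\s*/, $str, -1.
sub split_comma_first {
    my ($str) = @_;
    return () unless length $str;   # split yields no fields for ""

    my @fields;
    my $start = 0;                  # offset where the current field begins

    while ((my $comma = index($str, ',', $start)) >= 0) {
        # Back up over whitespace immediately preceding the comma,
        # but never past the start of the current field.
        my $left = $comma;
        $left-- while $left > $start
                   && substr($str, $left - 1, 1) =~ /\s/;
        push @fields, substr($str, $start, $left - $start);

        # Skip whitespace immediately following the comma.
        my $right = $comma + 1;
        $right++ while $right < length($str)
                    && substr($str, $right, 1) =~ /\s/;
        $start = $right;
    }

    # Whatever follows the last comma is the final field.
    push @fields, substr($str, $start);
    return @fields;
}

# The sample string from the quoted message.
my @got = split_comma_first("abc  def , ghi,...");
print join("|", @got), "\n";
```

On the sample string this gives the fields "abc  def", "ghi" and "...",
the same as the regex-based split. Note that the sketch keeps trailing
empty fields, so it corresponds to split with a limit of -1.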

-- 
H.Merijn Brand        Amsterdam Perl Mongers (http://amsterdam.pm.org/)
using perl-5.6.1, 5.8.0, & 5.9.x, and 806 on  HP-UX 10.20 & 11.00, 11i,
   AIX 4.3, SuSE 8.2, and Win2k.           http://www.cmve.net/~merijn/
http://archives.develooper.com/daily-build@perl.org/   perl-qa@perl.org
send smoke reports to: smokers-reports@perl.org, QA: http://qa.perl.org


