develooper Front page | perl.beginners | Postings from August 2009

Re: Regular expression help

From: Chas. Owens
Date: August 26, 2009 16:02
Subject: Re: Regular expression help
Message ID: 58ce48dc0908261601v5dd4d019rf2d9addbca0e7ace@mail.gmail.com
On Wed, Aug 26, 2009 at 03:46, Dave Tang<d.tang@imb.uq.edu.au> wrote:
snip
>> for my $token ($line =~ /([,"]|[^,"]+)/g) {
>
> I changed the single pipe (|) to double pipes (||) and $token also contained
> empty strings. Could you explain the difference between the pipes?
snip

The pipe character in a regex creates an alternation.  An alternation
matches if any one of its alternatives matches.  By adding another
pipe, you told the regex engine that it should match [,"] or nothing
or [^,"]+.  The empty strings you saw were the nothings being matched.
I can only assume you changed it to || out of some mistaken belief
that the pipe character in a regex is an or statement (like the ||
operator).  While it does operate in a similar fashion, it is not an
or operator.
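
If it helps to see the difference, here is a small sketch that runs
both patterns over the same string.  The sample string 'a,b' and the
variable names are only for illustration; they are not from your
program.

```perl
use strict;
use warnings;

# Illustrative input only; any comma-separated string would do.
my $line = 'a,b';

# Single pipe: two alternatives, a delimiter or a run of other characters.
my @single = $line =~ /([,"]|[^,"]+)/g;

# Double pipe: three alternatives, and the middle one is the empty pattern,
# so the engine is allowed to match nothing at all.
my @double = $line =~ /([,"]||[^,"]+)/g;

# Bracket each token so that empty strings are visible in the output.
print scalar(@single), " tokens with |:  ",
    join(" ", map { "[$_]" } @single), "\n";
print scalar(@double), " tokens with ||: ",
    join(" ", map { "[$_]" } @double), "\n";
```

The second list should contain the same non-empty tokens as the first,
with empty strings mixed in wherever the empty alternative matched.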

>>        if ($in_string) {
>>                if ($token eq q/"/) {
>>                        $in_string = 0;
>>                        push @rec, "";
>>                        next;
>>                }
>>        } elsif ($token eq q/,/) {
>>                push @rec, "";
>>                next;
>>        } elsif ($token eq q/"/) {
>>                $in_string = 1;
>>                next;
>>        }
>>        $rec[-1] .= $token;
>
> Is this a commonly used method where you push empty values into an array (if
> $token is a , or ") and append stuff to the last array element (which is an
> empty string)?

It is commonly used by me; I don't know about others.  The string at
the end of the record array is not necessarily empty.  In the case of
'"foo,bar"', the tokens are ('"', "foo", ",", "bar", '"').  This means
$rec[-1] starts empty (my @rec = ("");), then "foo" is concatenated
onto it, then ",", then "bar".  Finally, it sees the second '"' token
and pushes a new empty string onto the array.  It looks like there is
a bug, though: it should only push a new string onto the array when it
sees a comma outside of a quoted string.
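
Something like the following sketch should avoid the extra empty
field.  The sub name parse_line and the sample input are made up for
the example, and it makes no attempt to handle escaped quotes inside a
quoted field.

```perl
use strict;
use warnings;

# parse_line is a hypothetical wrapper; the earlier post showed only the loop.
sub parse_line {
    my ($line) = @_;
    my @rec       = ("");   # start with one empty field
    my $in_string = 0;

    for my $token ($line =~ /([,"]|[^,"]+)/g) {
        if ($in_string) {
            if ($token eq q/"/) {
                $in_string = 0;   # closing quote: do NOT start a new field
                next;
            }
        } elsif ($token eq q/,/) {
            push @rec, "";        # only an unquoted comma starts a new field
            next;
        } elsif ($token eq q/"/) {
            $in_string = 1;       # opening quote
            next;
        }
        $rec[-1] .= $token;       # everything else belongs to the current field
    }
    return @rec;
}

# Illustrative input: the middle field contains a quoted comma.
my @fields = parse_line('a,"foo,bar",c');
print join("|", @fields), "\n";   # prints a|foo,bar|c
```

With the push left in the closing-quote branch, the same input would
come back with an extra empty field after "foo,bar".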


-- 
Chas. Owens
wonkden.net
The most important skill a programmer can have is the ability to read.
