perl.perl5.porters | Postings from August 2009

Re: [perl #68804] underscore regex delimiters

From:
demerphq
Date:
August 27, 2009 05:04
Subject:
Re: [perl #68804] underscore regex delimiters
Message ID:
9b18b3110908270503n4caa3278i7a320e95d79cff9b@mail.gmail.com
2009/8/27 Abigail <abigail@abigail.be>:
> On Thu, Aug 27, 2009 at 09:33:13AM +0100, Zefram wrote:
>> chip@seas.upenn.edu (via RT) wrote:
>> >$ perl -p -e 's_/32__;'
>>
>> The underscore is not perceived as a delimiter there, but as part of
>> an identifier.  Observe how it was parsed:
>>
>> $ perl -MO=Deparse -e 's#/32##;'
>> s[/32][];
>> -e syntax OK
>> $ perl -MO=Deparse -e 's_/32__;'
>> 's_' / 32;
>> -e syntax OK
>>
>> Other clues are available if you turn on warnings:
>>
>> $ perl -wce 's_/32__;'
>> Misplaced _ in number at -e line 1.
>> Misplaced _ in number at -e line 1.
>> Useless use of division (/) in void context at -e line 1.
>> -e syntax OK
>>
>> And if the substitution had contained other text then it would have
>> blown up earlier:
>>
>> $ perl -wce 's_\\__;'
>> Backslash found where operator expected at -e line 1, near "s_\"
>> syntax error at -e line 1, near "s_\"
>> -e had compilation errors.
>>
>> I believe the documentation is at fault.  perlop(1) says:
>>
>>     Any non-alphanumeric, non-whitespace delimiter may replace the
>>     slashes.
>>
>> Underscore is not alphanumeric or whitespace, but is evidently
>> being treated the same way that an alphanumeric character would be.
>> The prohibition is really on identifier characters, not alphanumerics.

The documentation should read "non-perl-word, non-whitespace delimiter".

We use "alphanumeric" fairly regularly when we really mean
"perl-word-characters" or "identifier" (the latter cannot be expressed
by a single character class, as it cannot start with a number, yet can
end with one).

>
>
> But even then, the documentation isn't correct. You *can* use word characters
> as delimiters:
>
>   s _/32__
>
> and
>
>   s s/32ss
>
> are fine.
>
> The problem here lies in the tokenization part: the first token of
> C<< s_/32__ >> is C<< s_ >>, which isn't the substitution operator.
>
> In fact, this issue isn't any different from saying you cannot use C<< _ >>
> as a function argument because you wrote:
>
>    C<< func_ >>
>
> which isn't parsed as C<< func (_) >> either.

Right. This is not a bug.

Cheers,
Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
