Re: [perl #109798] '/e' regexp modifier is not recognized by re pragma

List: perl.perl5.porters
From: Tom Christiansen
Date: February 4, 2012 17:24
Message ID: 21953.1328405072@chthon

"Eric Brine via RT" <perlbug-followup@perl.org> wrote
   on Sat, 04 Feb 2012 16:19:41 PST: 

> Also note that "e" is not mentioned in perlre. The regex flags
> are (as listed in perlre): m, s, i, x, p, g, c, a, d, l, u.

That's incorrect, and potentially misleading.

With blank lines removed for brevity, these are 
the only pattern modifiers:

    =head2 Pattern Modifiers
    =begin table picture Regular expression modifiers
    =headrow
    =row
    =cell Modifier
    =cell Meaning
    =bodyrows
    =row
    =cell C</i>
    =cell Ignore alphabetic case distinctions (case insensitive).
    =row
    =cell C</s>
    =cell Let C<.> also match newline.
    =row
    =cell C</m>
    =cell Let C<^> and C<$> also match next to embedded C<\n>.
    =row
    =cell C</x>
    =cell Ignore (most) whitespace and permit comments in pattern.
    =row
    =cell
    =row
    =cell C</o>
    =cell Compile pattern once only.
    =row
    =cell C</p>
    =cell Preserve C<${^PREMATCH}>, C<${^MATCH}>, and C<${^POSTMATCH}> variables.
    =row
    =cell
    =row
    =cell C</d>
    =cell Dual ASCII–Unicode mode charset behavior (old default)
    =row
    =cell C</a>
    =cell ASCII charset behavior
    =row
    =cell C</u>
    =cell Unicode charset behavior (new default)
    =row
    =cell C</l>
    =cell the run-time locale’s charset behavior (default under R<use locale>)
    =end table

That's all.  Those apply to patterns.
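
To make "apply to patterns" concrete, here is a minimal sketch (the
variable names are made up for illustration).  A pattern modifier is
stored in the compiled pattern, so it can be given to qr// or embedded
with (?i), and it survives interpolation into another pattern:

```perl
use strict;
use warnings;

# Pattern modifiers are part of the compiled pattern itself.
my $word = qr/hello/i;       # /i given to qr//
my $same = qr/(?i)hello/;    # the same flag embedded in the pattern

my @hits = grep { $_ =~ $word && $_ =~ $same } ("Hello", "HELLO", "help");
print "@hits\n";             # Hello HELLO

# Interpolating the qr// object keeps its /i, even though the outer
# match carries no modifiers of its own.
my $outer = "say HELLO" =~ /say $word/ ? "matched" : "no match";
print "$outer\n";            # matched
```

Nothing comparable exists for /g or /e, since those describe what the
operator does rather than how the pattern matches.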

These flags, however, apply to the match operator, which is different:


    =head2 The m// Operator (Matching)
    =begin table picture m// Modifiers
    =headrow
    =row
    =cell Modifier
    =cell Meaning
    =bodyrows
    =row
    =cell C</i>
    =cell Ignore alphabetic case.
    =row
    =cell C</m>
    =cell Let C<^> and C<$> also match next to embedded C<\n>.
    =row
    =cell C</s>
    =cell Let C<.> also match newline.
    =row
    =cell C</x>
    =cell Ignore (most) whitespace and permit comments in pattern.
    =row
    =cell C</o>
    =cell Compile pattern once only.
    =row
    =cell C</p>
    =cell Preserve the matched string.
    =row
    =cell
    =row
    =cell C</d>
    =cell Dual ASCII–Unicode mode charset behavior (old default).
    =row
    =cell C</u>
    =cell Unicode charset behavior (new default).
    =row
    =cell C</a>
    =cell ASCII charset behavior
    =row
    =cell C</l>
    =cell The run-time locale’s charset behavior (default under R<use locale>).
    =row
    =cell
    =row
    =cell C</g>
    =cell Globally find all matches.
    =row
    =cell C</cg>
    =cell Allow continued search after failed C</g> match.
    =end table
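
A short sketch of the two operator-only flags in that table, assuming
nothing beyond core Perl (the sample string is invented).  /g controls
how the match operator iterates, and /c controls what happens to pos()
when a /g match fails:

```perl
use strict;
use warnings;

my $text = "aa1bb22";

# In list context, /g returns every match at once.
my @numbers = $text =~ /\d+/g;    # ("1", "22")

# In scalar context, /g walks the string; pos() remembers where.
$text =~ /\d+/g;
my $after_first = pos($text);     # 3, just past the "1"

# A failed /g match normally resets pos(); adding /c keeps it.
$text =~ /zzz/gc;
my $after_fail = pos($text);      # still 3

print "@numbers $after_first $after_fail\n";
```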

Here are the s/// flags:


    =head2 The s/// Operator (Substitution)
    =begin table picture s/// Modifiers
    =headrow
    =row
    =cell Modifier
    =cell Meaning
    =bodyrows
    =row
    =cell C</i>
    =cell Ignore alphabetic case (when matching).
    =row
    =cell C</m>
    =cell Let C<^> and C<$> also match next to embedded C<\n>.
    =row
    =cell C</s>
    =cell Let C<.> also match newline.
    =row
    =cell C</x>
    =cell Ignore (most) whitespace and permit comments in pattern.
    =row
    =cell C</o>
    =cell Compile pattern once only.
    =row
    =cell C</p>
    =cell Preserve the matched string.
    =row
    =cell
    =row
    =cell C</d>
    =cell Dual ASCII–Unicode mode charset behavior (old default).
    =row
    =cell C</u>
    =cell Unicode charset behavior (new default).
    =row
    =cell C</a>
    =cell ASCII charset behavior.
    =row
    =cell C</l>
    =cell The run-time locale’s charset behavior (default under R<use locale>).
    =row
    =cell
    =row
    =cell C</g>
    =cell Replace globally, that is, all occurrences.
    =row
    =cell C</r>
    =cell Return substitution and leave the original string untouched.
    =row
    =cell C</e>
    =cell Evaluate the right side as an expression.
    =end table
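
For instance, /e and /r only make sense on a substitution, because
both are about the replacement side.  A minimal sketch, assuming
perl 5.14 or later for /r (the data is invented):

```perl
use v5.14;
use warnings;

my $price = "cost: 21";

# /r returns the modified copy and leaves $price alone;
# /e runs the replacement as Perl code instead of as a string.
my $doubled = $price =~ s/(\d+)/$1 * 2/er;

say $price;      # cost: 21
say $doubled;    # cost: 42
```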

And tr/// has its own as well:

    =begin table picture tr/// Modifiers
    =headrow
    =row
    =cell Modifier
    =cell Meaning
    =bodyrows
    =row
    =cell C</c>
    =cell Complement R<SEARCHLIST>.X</c pattern modifier:c5 pattern>
    =row
    =cell C</d>
    =cell Delete found but unreplaced characters.X</d pattern modifier:d pattern>
    =row
    =cell C</s>
    =cell Squash duplicate replaced characters.X</s pattern modifier:s pattern>
    =row
    =cell C</r>
    =cell Return transliteration and leave the original string untouched.
    =end table
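
Note that tr/// reuses the letters d and s for something entirely
different from what they mean on a pattern.  A small sketch, again
assuming perl 5.14 or later for /r (the sample string is invented):

```perl
use v5.14;
use warnings;

my $raw = "bookkeeper, 1999!";

# With an empty replacement list and no flags, tr/// just counts.
my $count = ($raw =~ tr/a-z//);       # 10

# /c complements the search list, /d deletes what has no replacement.
my $letters = $raw =~ tr/a-z//cdr;    # "bookkeeper"

# /s squashes runs of the same transliterated character.
my $squashed = $raw =~ tr/a-z//sr;    # "bokeper, 1999!"

say "$count|$letters|$squashed";
```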

See?  Four distinct sets of flags.

> (I find this very odd that p, g and c are regex flags instead
> of match substitute operator flags, but they are.)

You should find it odd, because they aren't.

    $ perl -E 'use re "/e"; say "ok"'
    Unknown regular expression flag "e" at -e line 1
    ok

    $ perl -E 'use re "/g"; say "ok"'
    Unknown regular expression flag "g" at -e line 1
    ok

    $ perl -E 'use re "/c"; say "ok"'
    Unknown regular expression flag "c" at -e line 1
    ok

    $ perl -E 'use re "/p"; say "ok"'
    ok

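What's going on with the first three is that the pragma complains about
the unknown flag but doesn't properly die: each run still prints "ok".
Even fatal warnings don't stop it: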

    $ perl -Mwarnings=FATAL,all -E 'use re "/g"; say "ok"'
    Unknown regular expression flag "g" at -e line 1
    ok

That's a bug.

What's going on with the last one is more subtle.  /p really is a
pattern modifier, so the pragma accepts it: it can be embedded in a
pattern with (?p), but it cannot be turned off once turned on.
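
A minimal sketch of the embedded form (the sample string is invented;
on newer perls these variables may be populated even without /p, so
this shows only that (?p) is accepted, not that it is required):

```perl
use strict;
use warnings;

my $str = "hello world";
my ($pre, $match, $post);

# (?p) inside the pattern has the same effect as a trailing /p.
if ($str =~ /(?p)wor/) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "[$pre] [$match] [$post]\n";    # [hello ] [wor] [ld]
```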

Anyway, probably perlre should be clearer (read: more correct)
about the s/uper/man/mxyzptlk flags, and that jazz.

--tom
