develooper Front page | perl.perl5.porters | Postings from May 2012

Re: "\" does not escape meta chars if also delim

May 29, 2012 13:23
On 29 May, 08:39, "Eric Brine" wrote:
> About "\", perlre says:
> "Quote the next metacharacter."
> "So anything that looks like \\, \(, \), \<, \>, \{, or \} is always
> interpreted as a literal character, not a metacharacter."
> "Any single character matches itself, unless it is a metacharacter with a
> special meaning described here or above. You can cause characters that
> normally function as metacharacters to be interpreted literally by
> prefixing them with a "\" (e.g., "\." matches a ".", not any character;
> "\\" matches a "\"). This escape mechanism is also required for the
> character used as the pattern delimiter."
> Yet when "\" is used to escape a char that's both delimiter and meta, the
> escaped character doesn't cease being meta as documented.
> >perl -E" say 'a' =~ m/\./ ? 'XXX' : 'ok' "
> ok
> >perl -E" say 'a' =~ m.\.. ? 'XXX' : 'ok' "

I would consider this a bug.

As far as I can tell, this comes from over-simplified (or simply wrong)
handling of quoting when regular expressions are parsed.

I might be wrong, but here is an example of how I think regular
expressions get processed, using { } as the delimiters:

Imagine a regexp that is meant to match 2 a's (a{2}), followed by
an a, a literal {, a literal 2 and a literal } (the literal braces are
written \{ and \}):

    m{a{2}a\{2\}}

This regexp gets pre-processed (and here is where, in my opinion, the
damage occurs) so that every non-escaped { or } is replaced with an
escaped \{ or \}.

This results in the following:

    m{a\{2\}a\{2\}}

Now the delimiters m{...} get removed:

    a\{2\}a\{2\}

and finally all escaped delimiters are "unescaped":

    a{2}a{2}

The result is the regexp /a{2}a{2}/, which matches 4 consecutive a's.
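
The steps above can be modelled as a few plain string substitutions. This is only a sketch of the guess described in this message, not of what perl's tokenizer really does. The sub name guessed_stages is made up for illustration, and the starting pattern m{a{2}a\{2\}} is an assumption, chosen so that the end result is the a{2}a{2} given above.

```perl
use strict;
use warnings;

# Hypothetical model of the guessed processing steps, written as plain
# string substitutions. It does NOT reproduce perl's real tokenizer;
# the sub name and the stage split are made up for illustration.
sub guessed_stages {
    my ($source) = @_;

    # Stage 1: escape every { or } inside m{...} that is not already
    # preceded by a backslash. (A single lookbehind is a simplification:
    # it would mishandle an escaped backslash in front of a brace.)
    my ($body) = $source =~ /\Am\{(.*)\}\z/s
        or die "expected a pattern of the form m{...}\n";
    (my $escaped = $body) =~ s/(?<!\\)([{}])/\\$1/g;

    # Stage 2: the m{ } delimiters are gone, leaving only $escaped.
    # Stage 3: strip the backslash from every escaped delimiter.
    (my $final = $escaped) =~ s/\\([{}])/$1/g;

    return ($escaped, $final);
}

# Assumed starting pattern (see the note above the code).
my ($escaped, $final) = guessed_stages('m{a{2}a\{2\}}');
print "after escaping:   m{", $escaped, "}\n";
print "delimiters gone:  $escaped\n";
print "after unescaping: $final\n";

# The end result behaves like a quantified pattern, not like literal braces.
print 'aaaa'   =~ /\A$final\z/ ? "matches aaaa\n"   : "no match on aaaa\n";
print 'aaa{2}' =~ /\A$final\z/ ? "matches aaa{2}\n" : "no match on aaa{2}\n";
```

Under this model the literal braces are lost in the last step: the final string is a{2}a{2}, which matches 'aaaa' but not 'aaa{2}'.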

That's how I think regular expressions are handled, but I think this
is a bug.