Oh dear, maybe we have to rethink 'Unescaped left brace in regex is deprecated' warnings...
From:
demerphq
Date:
May 28, 2012 01:32
Subject:
Oh dear, maybe we have to rethink 'Unescaped left brace in regex is deprecated' warnings...
Message ID:
CANgJU+VUoYWT78v4e7nDpD2NfM3Dp41NpXY=soOxbP14jkV-tA@mail.gmail.com
On 28 May 2012 09:50, Andreas J. Koenig
<andreas.koenig.7os6VVqR@franz.ak.mind.de> wrote:
> Found in GRANTM/XML-Simple-2.18.tar.gz in lib/XML/Simple.pm:
>
> 995 $val =~ s{\$\{([\w.]+)\}}{ $self->get_var($1) }ge;
> 1031 $val =~ s{\$\{(\w+)\}}{ $self->get_var($1) }ge;
>
> % make test
> [...]
> t/0_Config.t .. Unescaped left brace in regex is deprecated, passed through in regex; marked by <-- HERE in m/\${ <-- HERE ([\w.]+)}/ at /tmp/tmp.4uMUAQPZaT/XML-Simple-2.18-cUi3UY/blib/lib/XML/Simple.pm line 995.
> Unescaped left brace in regex is deprecated, passed through in regex; marked by <-- HERE in m/\${ <-- HERE (\w+)}/ at /tmp/tmp.4uMUAQPZaT/XML-Simple-2.18-cUi3UY/blib/lib/XML/Simple.pm line 1031.
> # Package Version
> # perl 5.17.0
> # XML::Simple 2.18
> # Storable 2.35
> # XML::Parser 2.41
> # XML::SAX 0.99
> # XML::NamespaceSupport 1.11
> t/0_Config.t .. ok
>
>
> Looks like perl miscounts one backslash. I would expect that the regexp
> is accepted by perl because the brace is escaped.
I think some insight on this bug can be obtained from perlop in the
section titled "Gory details of parsing quoted constructs". Be warned
this section is extremely difficult to follow.
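Basically, since the pattern uses {} "quotes", the backslash in front
of each delimiter character is stripped while the quoted construct is
parsed, so the regex engine will never see an escaped '{'.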
The part that explains this is the following:
    The lack of processing of "\\" creates specific restrictions on
    the post-processed text. If the delimiter is "/", one cannot get
    the combination "\/" into the result of this step. "/" will
    finish the regular expression, "\/" will be stripped to "/" on
    the previous step, and "\\/" will be left as is. Because "/" is
    equivalent to "\/" inside a regular expression, this does not
    matter unless the delimiter happens to be character special to
    the RE engine, such as in "s*foo*bar*", "m[foo]", or "?foo?"; or
    an alphanumeric char, as in:
So it seems to me there is no way to avoid this warning when writing
an escaped '{' inside a regex that uses '{}' quotes, which is a pity.
IMO it means we have to rethink the warning, and probably get rid of
it, unless we decide to rework how perl "unescapes" the content of
regex patterns during the parse process.
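For code that trips the warning in the meantime, here is a minimal
sketch of two rewrites of the XML::Simple substitution that should
sidestep the problem. It assumes a hypothetical %vars hash and get_var()
function standing in for $self->get_var($1). It is not taken from
XML::Simple itself and is not a proposed fix for the parser.

```perl
use strict;
use warnings;

# Hypothetical stand-in for XML::Simple's $self->get_var($1).
my %vars = (name => 'world');
sub get_var { my ($key) = @_; return $vars{$key} }

my $input = 'hello ${name}';

# 1. Use a non-brace delimiter. "\{" is then not an escaped delimiter,
#    so the backslash is still there when the regex engine compiles
#    the pattern.
(my $with_slash = $input) =~ s/\$\{(\w+)\}/ get_var($1) /ge;

# 2. Keep the brace delimiters, but put each literal brace in a
#    character class so that no backslash is needed at all.
(my $with_class = $input) =~ s{\$[{](\w+)[}]}{ get_var($1) }ge;

print "$with_slash\n";
print "$with_class\n";
```

Neither form relies on a backslash surviving the delimiter-stripping
step, which is why they should behave the same whichever way the
warning question is settled.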
This is by the way yet another manifestation of a recurring problem in
Perl's "string" parsing logic. We have seen these time and again, with
for instance \Q and \E and with all the "fun" of "moving" parsing of
\x{...} from the lexer to the regex engine, issues with named
characters, etc. The issues are subtle and deep and not easily
resolved in any regard.
I personally think this is an area that would be well worth focusing
some dedicated effort on. I know for a fact that parsing strings has
non-linear performance with regard to the length of the string, and
have some reason to suspect that parsing strings is one of the slowest
aspects of the "eval" procedure. So I think there would be a lot of
benefit in making improvements to this code.
I have CC'ed a number of people to this mail because I think this is a
"deep core" issue that needs to be thought about at a level above our
usual "post a patch and scratch your itch".
cheers,
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"