[perl #128864] perlre(1) and paired double quote regex searches --

From: Father Chrysostomos via RT
Date: August 6, 2016 23:10
Subject: [perl #128864] perlre(1) and paired double quote regex searches --
Message ID: rt-4.0.24-13765-1470525014-186.128864-15-0@perl.org

On Sat Aug 06 14:10:37 2016, aab@purdue.edu wrote:
> This is a comment and a possible suggestion.
> 
> 
> I've been playing around (read a LOT of hacking) with the CPAN
> "JavaScript::Beautifier" module to increase its performance and
> completeness.  In one part of its 'get_next_token()' sub, it extracts
> quoted strings.  I swiped the (first encountered) perlre(1) example
> 
> /"(?:[^"\\]++|\\.)*+"/
> 
> as the basis for the regex pattern to use.  Unfortunately, during my
> testing, I found a javascript file,
> https://github.com/ternjs/acorn/blob/master/test/jquery-string.js,
> that causes that dread "Complex regular subexpression recursion limit"
> error message.

I am surprised.  I thought the ++ would avoid that.

>  I next tried variations of the second example
> 
> /"(?>(?:(?>[^"\\]+)|\\.)*)"/
> 
> with the same result.  FWIW - the opening quote in the 'jquery-
> string.js' file is on the first line, the closing quote is on last
> line, 10314 lines later, and the file is full of \" .
> 
> 
> I ended up using the regex expression
> 
> /\G((?>$peek.*?(?<!\\)$peek))/s
> 
> where $peek is the quote character.  It seems to work fine but I'll
> bet that there are probably a bunch of "gotchas" that I haven't
> encountered yet.

Such as this code snippet:

    alert("\\" + "\\");

In a browser, that pops up an alert message with two backslashes.  If I run your regular expression on it without the \G:

$peek = '"';
<<\ENd =~ /((?>$peek.*?(?<!\\)$peek))/s;
alert("\\" + "\\");
ENd
print "$1\n";
__END__

it gives me:

"\\" + "

FWIW, JE uses:

		/\G (?: '([^'\\]*(?:\\.[^'\\]*)*)'
		          |
		        "([^"\\]*(?:\\.[^"\\]*)*)"  )/xcgs

The basic pattern is:

    normal* ( special normal* )*
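
As a rough sketch of that pattern applied to the alert() snippet above: here "normal" is any character other than a quote or a backslash, and "special" is a backslash plus the character it escapes. The sub name quoted_strings is made up for illustration and is not part of JE; only double-quoted strings are handled.

```perl
use strict;
use warnings;

# Hypothetical helper (not from JE): return every double-quoted
# string literal in $src, with its surrounding quotes.
sub quoted_strings {
    my ($src) = @_;
    my @found;
    # normal  = [^"\\]  anything that is neither a quote nor a backslash
    # special = \\.     a backslash and the character it escapes
    while ($src =~ /( " [^"\\]* (?: \\. [^"\\]* )* " )/xgs) {
        push @found, $1;
    }
    return @found;
}

# Single-quoted heredoc, so the backslashes are taken literally.
my $src = <<'END';
alert("\\" + "\\");
END

print "$_\n" for quoted_strings($src);
```

Because an escape is always consumed as a two-character unit, an escaped backslash directly before the closing quote is not mistaken for an escaped quote, and the two literals come out separately.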

-- 

Father Chrysostomos


---
via perlbug:  queue: perl5 status: new
https://rt.perl.org/Ticket/Display.html?id=128864
