On Sat Aug 06 14:10:37 2016, aab@purdue.edu wrote:
> This is a comment and a possible suggestion.
>
>
> I've been playing around (read a LOT of hacking) with the CPAN
> "JavaScript::Beautifier" module to increase its performance and
> completeness. In one part of its 'get_next_token()' sub, it extracts
> quoted strings. I swiped the (first encountered) perlre(1) example
>
> /"(?:[^"\\]++|\\.)*+"/
>
> as the basis for the regex pattern to use. Unfortunately, during my
> testing, I found a javascript file,
> https://github.com/ternjs/acorn/blob/master/test/jquery-string.js,
> that causes that dread "Complex regular subexpression recursion limit"
> error message. I am surprised. I thought the ++ would avoid that.
> I next tried variations of the second example
>
> /"(?>(?:(?>[^"\\]+)|\\.)*)"/
>
> with the same result. FWIW - the opening quote in the 'jquery-
> string.js' file is on the first line, the closing quote is on last
> line, 10314 lines later, and the file is full of \" .
>
>
> I ended up using the regex expression
>
> /\G((?>$peek.*?(?<!\\)$peek))/s
>
> where $peek is the quote character. It seems to work fine but I'll
> bet that there are probably a bunch of "gotchas" that I haven't
> encountered yet.

Such as this code snippet:

    alert("\\" + "\\");

In a browser, that pops up an alert message with two backslashes.

If I run your regular expression on it without the \G:

    $peek = '"';
    <<\ENd =~ /((?>$peek.*?(?<!\\)$peek))/s;
    alert("\\" + "\\");
    ENd
    print "$1\n";
    __END__

it gives me:

    "\\" + "

FWIW, JE uses:

    /\G (?: '([^'\\]*(?:\\.[^'\\]*)*)'
              |
            "([^"\\]*(?:\\.[^"\\]*)*)"  )/xcgs

The basic pattern is:

    normal* ( special normal* )*

--

Father Chrysostomos

---
via perlbug: queue: perl5 status: new
https://rt.perl.org/Ticket/Display.html?id=128864
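
For illustration, the "normal* ( special normal* )*" pattern can be tried against the alert("\\" + "\\") snippet. This is only a minimal sketch, not the actual JE or JavaScript::Beautifier code: the sub name next_string_token and the way the scan position is advanced with index() are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of JE or JavaScript::Beautifier):
# return the quoted string literal that starts at pos($$src_ref),
# or nothing if no string literal starts there.
sub next_string_token {
    my ($src_ref) = @_;

    # normal* ( special normal* )*
    #   normal  = any character other than the quote or a backslash
    #   special = a backslash followed by any single character
    if ($$src_ref =~ /\G( '[^'\\]*(?:\\.[^'\\]*)*'
                        | "[^"\\]*(?:\\.[^"\\]*)*" )/xgcs) {
        return $1;
    }
    return;
}

# A single-quoted heredoc keeps the backslashes literal.
my $js = <<'END';
alert("\\" + "\\");
END

# Start at the first quote character and take one token.
pos($js) = index($js, '"');
my $first = next_string_token(\$js);

# Move to the next quote after the first token and take another.
pos($js) = index($js, '"', pos($js));
my $second = next_string_token(\$js);

print "$first\n";     # "\\"
print "$second\n";    # "\\"
```

Because each backslash is always consumed together with the character it escapes, the closing quote after \\ is recognised as the end of the string, so the two literals come out separately instead of as "\\" + ".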