On 25 June 2013 20:26, demerphq <demerphq@gmail.com> wrote:
> On 25 June 2013 16:55, Michael Schroeder <mls@suse.de> wrote:
>>
>> Hi Porters,
>>
>> commit #726ee55d breaks matching of \8 and \9 if they come after a
>> literal:
>>
>>     use re 'debug';
>>     my $a = '(((((((((x)))))))))foo\9';
>>     my $b = 'xfoox';
>>     $b =~ /$a/;
>>
>> Output:
>>
>>     Final program:
>>        1: OPEN1 (3)
>>        3:   OPEN2 (5)
>>        5:     OPEN3 (7)
>>        7:       OPEN4 (9)
>>        9:         OPEN5 (11)
>>       11:           OPEN6 (13)
>>       13:             OPEN7 (15)
>>       15:               OPEN8 (17)
>>       17:                 OPEN9 (19)
>>       19:                   EXACT <x> (21)
>>       21:                 CLOSE9 (23)
>>       23:               CLOSE8 (25)
>>       25:             CLOSE7 (27)
>>       27:           CLOSE6 (29)
>>       29:         CLOSE5 (31)
>>       31:       CLOSE4 (33)
>>       33:     CLOSE3 (35)
>>       35:   CLOSE2 (37)
>>       37: CLOSE1 (39)
>>       39: EXACT <foo9> (41)
>>       41: END (0)
>>
>> Note the "foo9" exact match. A workaround is to use \g9, of course,
>> but the perlre man page says: "C<\1> through C<\9> are always
>> interpreted as backreferences".
>>
>> (The change breaks the latex2html package, btw.)
>
> Thanks for the report. I agree this is a bug. I am looking into a fix.

I ended up pushing the following:

f1e1b256c5c1773d90e828cca6323c53fa23391b

which makes multidigit backslash escapes illegal when they start with 8
or 9 and are larger than the number of capture buffers in the string.

IOW, /\87/ is a fatal error and not /\x{00}87/ nor /87/ with a warning.

My rationale for this is we have two precedents to consider:

a) a case like /\9/ where we would die with an error about a
   backreference to a non-existent buffer.

b) a case like "\9" where we would warn, and then treat the escape as
   "9".

IMO the precedent for the regex wins over the precedent of the double
quoted string.

The rules for handling backreferences are pretty arcane. \118 could mean
the 118th capture buffer, if it exists, or it could mean "\x{09}8". In
other words not only do we change the base we interpret it in, we also
change the number of digits we consider part of the escape!
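For anyone who wants to poke at these cases, here is a minimal sketch. It assumes a perl that already contains the fix; older perls may behave differently. The helper name try_match is made up for this example and is not part of perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 on match, 0 on no match, and undef
# if the pattern fails to compile (the compile error is trapped by eval).
sub try_match {
    my ($string, $pat) = @_;
    my $re = eval { qr/$pat/ };
    return undef unless defined $re;
    return $string =~ $re ? 1 : 0;
}

# The pattern from the report: nine groups, a literal, then \9.
my $backref_ok = try_match('xfoox', '(((((((((x)))))))))foo\9');

# The \g9 workaround mentioned in the report.
my $gref_ok = try_match('xfoox', '(((((((((x)))))))))foo\g9');

# Only one group exists, so \118 cannot be group 118; it is read as
# octal \11 (a tab, "\x{09}") followed by a literal "8".
my $octal_ok = try_match("a\x{09}8", '(a)\118');

# Starts with 8 and there is no group 87: assumed to be a compile error
# on a perl with the fix, so try_match returns undef.
my $bad = try_match('87', '\87');

printf "%-10s %s\n", $_->[0], defined $_->[1] ? $_->[1] : 'compile error'
    for [ '\9' => $backref_ok ], [ '\g9' => $gref_ok ],
        [ '\118' => $octal_ok ], [ '\87' => $bad ];
```

On a perl with the fix, the first three cases should report a match and the last one a compile error. Running the same script on an affected perl is a quick way to see the "foo9" misparse from the report.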
This patch does not change this behavior, and affects only escapes
starting with an 8 or 9, as they have no reasonable interpretation as
octal but do have reasonable interpretations as backreferences.

I personally think maybe we should warn on something like \118, but I
leave that debate for another day.

cheers,
Yves

ps: Karl too worked on a fix for this, but I got mine wrapped up a bit
quicker. He may push follow-up patches.

--
perl -Mre=debug -e "/just|another|perl|hacker/"