perl.perl5.porters | Postings from November 2016

[perl #130010] v5.25.5-184-ga5540cf breaks texinfo

From:
James E Keenan via RT
Date:
November 8, 2016 03:36
Subject:
[perl #130010] v5.25.5-184-ga5540cf breaks texinfo
Message ID:
rt-4.0.24-19300-1478576153-1359.130010-15-0@perl.org
On Tue, 08 Nov 2016 01:12:25 GMT, jkeenan wrote:
> On Mon, 07 Nov 2016 16:59:37 GMT, jkeenan wrote:
> > On Mon, 07 Nov 2016 16:29:40 GMT, jkeenan wrote:
> > > On Mon, 07 Nov 2016 13:47:55 GMT, jkeenan wrote:
> > > >
> > > > The question is:  What is it about this pattern:
> > > >
> > > > #####
> > > > /([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/
> > > > #####
> > > >
> > > > ... that (a) as of commit C<a5540cf> but not previously; and (b) in
> > > > the context of this test suite but not in isolation, perceives
> > > > something to be a read-only value not subject to modification?
> > >
> > > My next brainstorm:  Add "use re 'debug';" to sub add_text() in
> > > tp/Texinfo/Convert/ParagraphNonXS.pm.
> > >
> > > When I did so and ran the debugging program found in one of my
> > > previous attachments, I got this output:
> > >
> > > #####
> > > Texinfo::Convert::ParagraphNonXS::add_text(../../tp/Texinfo/Convert/ParagraphNonXS.pm:329):
> > > 329:      my @segments = split
> > > 330:        /([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
> > > 331:        $text;
> > >   DB<6> n
> > > Matching REx "([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFull"... against "This is "
> > >    0 <> <This is >           |   0| 1:BRANCH(18)
> > >    0 <> <This is >           |   1|  2:OPEN1(4)
> > >    0 <> <This is >           |   1|  4:PLUS(16)
> > >                              |   1|  ANYOF[\t\n\x0B\f\r \x85][1680 2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
> > >                              |   1|  failed...
> > >    0 <> <This is >           |   0| 18:BRANCH(34)
> > >    0 <> <This is >           |   1|  19:OPEN2(21)
> > >    0 <> <This is >           |   1|  21:ANYOF[+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth](32)
> > >                              |   1|  failed...
> > >    0 <> <This is >           |   0| 34:BRANCH(68)
> > >    0 <> <This is >           |   1|  35:OPEN3(37)
> > >    0 <> <This is >           |   1|  37:CURLYM[0]{1,INFTY}(66)
> > >    0 <> <This is >           |   2|   39:BRANCH(51)
> > >    0 <> <This is >           |   3|    40:ANYOF[^\t\n\x0B\f\r \x85\xA0{+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth}1680 2000-200A 2028-2029 202F 205F 3000](64)
> > > Modification of a read-only value attempted at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
> > >  at ../../tp/Texinfo/Convert/ParagraphNonXS.pm line 329.
> > >         Texinfo::Convert::ParagraphNonXS::add_text(Texinfo::Convert::ParagraphNonXS=HASH(0x35ba938), "This is ") called at ../../tp/Texinfo/Convert/Info.pm line 308
> > >         Texinfo::Convert::Info::_info_header(Texinfo::Convert::Info=HASH(0x35b3038)) called at ../../tp/Texinfo/Convert/Info.pm line 81
> > >         Texinfo::Convert::Info::output(Texinfo::Convert::Info=HASH(0x35b3038), HASH(0x35ab7c0)) called at ../texi2any.pl line 1348
> > > panic: POPSTACK
> > > Debugged program terminated.
> > > #####
> > >
> > > Since I have never previously used the regex debugger, I have no idea
> > > if there are any clues to a solution in that output.
> > >
> >
> > Compiling perl at what has been identified as the last good commit,
> > and then running the test program through the debugger, I got much
> > better output.  It's quite long, so I'm posting it here:
> >
> > https://gist.github.com/jkeenan/184d2aaf914e4aa0410fe2ea1f36da91
> >
> > Thank you very much.
> 
> Yet another gist (actually, excerpts):
> 
> https://gist.github.com/jkeenan/faad48a7d3dfe0c40eab07684388edfb
> 
> At the first "bad" commit, I built perl with -DDEBUGGING.  I reduced
> the invocation of the Perl script from within the texinfo test suite
> that I had been using to the minimum number of command-line switches
> that would still generate the panic.  I got the output in the gist.
> 
> I got similar output when I used the Perl debugger and stepped through
> to the failure point and then, unlike previously, used 's' to step *into*
> the 'my @segments' line.  But, whatever debugging procedure I use, I
> always seem to end up with output like this:
> 
> #####
> Matching REx "([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)" against "This is "
>    0 <> <This is >           |   0| 1:BRANCH(18)
>    0 <> <This is >           |   1|  2:OPEN1(4)
>    0 <> <This is >           |   1|  4:PLUS(16)
>                              |   1|  ANYOF[\t\n\x0B\f\r \x85][1680 2000-200A 2028-2029 205F 3000] can match 0 times out of 2147483647...
>                              |   1|  failed...
>    0 <> <This is >           |   0| 18:BRANCH(34)
>    0 <> <This is >           |   1|  19:OPEN2(21)
>    0 <> <This is >           |   1|  21:ANYOF[+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth](32)
>                              |   1|  failed...
>    0 <> <This is >           |   0| 34:BRANCH(68)
>    0 <> <This is >           |   1|  35:OPEN3(37)
>    0 <> <This is >           |   1|  37:CURLYM[0]{1,INFTY}(66)
>    0 <> <This is >           |   2|   39:BRANCH(51)
>    0 <> <This is >           |   3|    40:ANYOF[^\t\n\x0B\f\r \x85\xA0{+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth}1680 2000-200A 2028-2029 202F 205F 3000](64)
> Modification of a read-only value attempted at tp/../tp/Texinfo/Convert/ParagraphNonXS.pm line 328.
> panic: POPSTACK
> #####
> 
> ... with no insight into what the read-only value is and where the
> 'modification' is being attempted.
> 
> Thank you very much.

One thing I forgot to mention earlier.  If you step through the program with the Perl debugger and, when you come to this critical line:

#####
my @segments = split
    /([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/,
    $text;
#####
... and type 's' rather than 'n' (which would immediately trigger the panic), you step into the mysterious world of utf8_heavy.pl and its subroutine SWASHNEW().  You eventually get to a point where you have this structure:
#####
  DB<24> x $SWASH
0  utf8=HASH(0x3b8e5f0)
   'BITS' => 1
   'EXTRAS' => '# comment
+utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth
'
   'LIST' => ''
   'NONE' => 0
   'TYPE' => ''
   'USER_DEFINED' => 1
   'utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth' => utf8=HASH(0x3b8dc90)
      'BITS' => 1
      'EXTRAS' => ''
      'LIST' => "1100\cI115F\cJ2329\cI232A\cJ2E80\cI2FFB\cJ3000\cI3000\cJ3001\cI303E\cJ3041\cI33FF\cJ3400\cI4DB5\cJ4E00\cI9FBB\cJA000\cIA4C6\cJAC00\cID7A3\cJF900\cIFAD9\cJFE10\cIFE19\cJFE30\cIFE6B\cJFF01\cIFF60\cJFFE0\cIFFE6\cJ20000\cI2A6D6\cJ2A6D7\cI2F7FF\cJ2F800\cI2FA1D\cJ2FA1E\cI2FFFD\cJ30000\cI3FFFD\cJ"
      'NONE' => 0
      'TYPE' => 'InFullwidth'
      'USER_DEFINED' => 1
#####
... and it is at the *2nd* time you arrive at 'return $SWASH' that the panic occurs.
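
For readers who have not used user-defined properties: the 'LIST' string in the dump above is what a property sub returns, one "START<tab>END" hex range per line. The following is only a minimal sketch of how such a property and the split pattern fit together. The sub segments() and the shortened range list are assumptions made for illustration, not the code in ParagraphNonXS.pm. Since the failure has not reproduced in isolation so far, this sketch is not expected to trigger the panic.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the property sub in ParagraphNonXS.pm.
# A user-defined property is a sub whose name begins with "In" or "Is"
# and which returns "START\tEND\n" hex ranges.  Only a few of the ranges
# from the LIST dump are reproduced here.
sub InFullwidth {
    return join '', map { "$_->[0]\t$_->[1]\n" } (
        [ '1100', '115F' ],
        [ '3000', '3000' ],
        [ '3001', '303E' ],
        [ 'FF01', 'FF60' ],
    );
}

# Same pattern as the failing line in add_text().
my $re = qr/([^\S\x{202f}\x{00a0}]+)|(\p{InFullwidth})|((?:[^\s\p{InFullwidth}]|[\x{202f}\x{00a0}])+)/;

# Hypothetical helper: split() also returns the capture groups, with
# unmatched groups as undef and empty fields between adjacent matches,
# so both are filtered out here.
sub segments {
    my ($text) = @_;
    return grep { defined($_) && length($_) } split $re, $text;
}

# The string from the trace yields "This", " ", "is", " ".
print join('|', segments("This is ")), "\n";
```

The property is looked up in the package where the pattern is compiled, which is why the trace shows the fully qualified name utf8::Texinfo::Convert::ParagraphNonXS::InFullwidth.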

-- 
James E Keenan (jkeenan@cpan.org)

---
via perlbug:  queue: perl5 status: open
https://rt.perl.org/Ticket/Display.html?id=130010
