Re: [perl #133756] //g flag on regex with UTF-8 source causes regexoptimiser to wrongly reject a match

From: Nicholas Clark
Date: January 9, 2019 17:14
Subject: Re: [perl #133756] //g flag on regex with UTF-8 source causes regexoptimiser to wrongly reject a match
Message ID: 20190109171421.xhiuayzorc64ea3p@ceres.etla.org

On Wed, Jan 09, 2019 at 09:49:59AM -0700, Karl Williamson wrote:

> My gvim syntax highlighter immediately showed that \x100 is \x10 followed by
> a "0".  Without that, I would have expected that $char contained a single
> character: \x{100}.  The /g would cause the second character, the "0"
> (U+0030) to be attempted to be matched.  I haven't investigated further,
> because my guess is that is what is going on here.  If you say there is more
> to it, then I'll investigate further.

Thanks for the rapid response. I think that you might be right, but will
investigate further tomorrow at work with a fresh head.

This would mean that I've failed to correctly reduce the original problem
to a representative test case. (The problem might *still* be PEBKAC, but
the original discrepancy in the much larger input and generated regex looked
like a bug.)
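
For reference, the escape parsing described in the quoted text can be checked
with a short sketch. This is not the test case from the ticket; the variable
names $unbraced and $braced are made up for illustration. It only shows that
an unbraced \x consumes at most two hex digits, so "\x100" is two characters
while "\x{100}" is one.

```perl
use strict;
use warnings;

# Unbraced \x takes at most two hex digits, so this string is
# U+0010 followed by a literal "0" (U+0030).
my $unbraced = "\x100";

# Braced \x{...} takes all the digits inside the braces, so this
# string is the single character U+0100.
my $braced = "\x{100}";

# Print the length and the code points of each string.
for my $pair ([unbraced => $unbraced], [braced => $braced]) {
    my ($name, $str) = @$pair;
    my @points = map { sprintf "U+%04X", ord($_) } split //, $str;
    printf "%-8s length %d: %s\n", $name, length($str), "@points";
}
```

If the reduced test case built its string from the unbraced form, a trailing
"0" would be present after the first character, which fits the suggestion
that /g then attempts a match at that second character.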

Nicholas Clark
