develooper Front page | perl.perl5.porters | Postings from March 2007

[PATCH] lib/Pod/ plus a funky UTF-8 regex bug

Jarkko Hietaniemi
March 20, 2007 20:05
The attached patch fixes (or at least papers over) the htmlview.t failure
that was recently introduced by change #30584 [1] (or rather, a bug that
was unearthed by #30584).  The failure was seen at least in Tru64 and
HP-UX and only under UTF-8 locales [2], [3].  The failure is (at least
in Tru64, I haven't seen the details in HP-UX) that two of the html item
anchors don't turn out as expected:

# ! <li><strong><a name="mat" class="item">Mat&lt;!&gt;</a></strong>
# ! <li><strong><a name="mat___" class="item">Mat&lt;!&gt;</a></strong>

# ! <li><strong><a name="mat2" class="item">Mat</a></strong>
# ! <li><strong><a name="mat" class="item">Mat</a></strong>

The patch doesn't fix the bug; it just changes the regex so that the bug
is not hit.  The bug itself requires demerphq :-)  I tried checking
whether [[:punct:]] differs between the locales "C" and "fi_FI.UTF-8",
but it is not that easy: the [[:punct:]] sets are identical.
(Also note how in the second case the '2' gets lost.)
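
One possible way to make such a [[:punct:]] comparison is sketched
below.  This is only an illustration, not necessarily how the check was
actually done: the helper name punct_set() is made up, the range 0..255
is an assumption, and the result can vary with the Perl version and
with which locales are installed.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL);

# Hypothetical helper: list the code points 0..255 that match
# [[:punct:]] while the given locale is active.
sub punct_set {
    my ($locale) = @_;
    # setlocale() returns undef if the locale is not available.
    defined setlocale(LC_ALL, $locale) or return;
    use locale;    # let POSIX classes follow the current locale
    return join ',', grep { chr($_) =~ /[[:punct:]]/ } 0 .. 255;
}

my $c    = punct_set('C');
my $utf8 = punct_set('fi_FI.UTF-8');

if (!defined $utf8) {
    print "fi_FI.UTF-8 is not installed here\n";
}
else {
    print $c eq $utf8 ? "identical\n" : "different\n";
}
```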

NOTE that perl -C is *not* used, and Pod::Html does no utf8 stuff,
and the $text in fragment_id_readable() does *not* have UTF8 bit on,
but still the locale being a UTF-8 locale changes how things match.

Attached are my best attempts at finding what is different, namely
the re debug traces from inside the fragment_id_readable() $text
substitute statements, one with locale "C" and one with a UTF-8 locale
when matching with $text as 'Mat<!>':
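
For anyone wanting to produce comparable traces themselves, a minimal
sketch follows.  The substitution is a hypothetical stand-in, not the
actual statement from fragment_id_readable(); run the script once with
LC_ALL=C and once with a UTF-8 locale in the environment and compare
the traces, which go to STDERR.

```perl
use strict;
use warnings;

my $text = 'Mat<!>';
my $id   = $text;
{
    # Regex debugging output (compile and match traces) goes to STDERR.
    use re 'debug';
    # Hypothetical stand-in for the real substitution.
    $id =~ s/[[:punct:]]+/_/g;
}
print "$id\n";
```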

