On 3/21/07, Jarkko Hietaniemi <jhi@iki.fi> wrote:
> The attached patch fixes (or at least papers over) the htmlview.t failure
> that was recently introduced by change #30584 [1] (or rather, a bug that
> was unearthed by #30584). The failure was seen at least in Tru64 and
> HP-UX and only under UTF-8 locales [2], [3]. The failure is (at least
> in Tru64, I haven't seen the details in HP-UX) that two of the html item
> anchors don't turn out as expected:
>
> # ! <li><strong><a name="mat" class="item">Mat<!></a></strong>
> # ! <li><strong><a name="mat___" class="item">Mat<!></a></strong>
>
> # ! <li><strong><a name="mat2" class="item">Mat</a></strong>
> # ! <li><strong><a name="mat" class="item">Mat</a></strong>
>
> The patch doesn't fix the bug, it just changes the regex so that the bug
> is not hit. The bug itself requires demerphq :-) I tried looking at
> whether the [[:punct:]] would be different under locales "C" and
> "fi_FI.UTF-8", but not that easy -- the [[:punct:]] are identical.
> (Also note how in the second case '2' gets lost.)
>
> NOTE that perl -C is *not* used, and Pod::Html does no utf8 stuff,
> and the $text in fragment_id_readable() does *not* have UTF8 bit on,
> but still the locale being a UTF-8 locale changes how things match.
>
> Attached are my best attempts at finding what is different, namely
> the re debug traces from inside the fragment_id_readable() $text
> substitute statements, one with locale "C" and one with a UTF-8 locale
> when matching with $text as 'Mat<!>':

I don't see how to replicate these results to investigate further.

Could you help me out by giving me the debug output from:

  'Mat<!>' =~ /[[:punct:]\s]+/

under both locales, please?

And/or from:

  $str = 'Mat<!>'; $str =~ s/[[:punct:]\s]+//g;

both with and without "use locale" as well?

I can't see any reason why this wouldn't work as expected. And when I try
it here, it does work as expected. :-()

Cheers,
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
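The substitution test requested above could be bundled into one small script, so that the same code runs under each locale. This is only a minimal sketch: the helper names strip_plain and strip_locale are hypothetical, and the locale names "C" and "fi_FI.UTF-8" are assumptions taken from the quoted report, so the second one may not be installed on a given machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL);

# Hypothetical helper: strip punctuation and whitespace with the
# pattern compiled outside the scope of "use locale".
sub strip_plain {
    my ($str) = @_;
    $str =~ s/[[:punct:]\s]+//g;
    return $str;
}

# Hypothetical helper: the same substitution, but compiled inside the
# lexical scope of "use locale", so matching consults the current locale.
sub strip_locale {
    my ($str) = @_;
    use locale;
    $str =~ s/[[:punct:]\s]+//g;
    return $str;
}

# Locale names are assumptions; substitute the ones used on the failing box.
for my $loc ('C', 'fi_FI.UTF-8') {
    my $got = setlocale(LC_ALL, $loc);
    unless (defined $got) {
        print "$loc: locale not available here, skipped\n";
        next;
    }
    printf "%-12s plain=%s locale=%s\n",
        $loc, strip_plain('Mat<!>'), strip_locale('Mat<!>');
}
```

If the two columns differ for any locale, that locale is where the match goes wrong. Adding "use re 'debug';" inside either helper should then produce the regex trace for just that substitution.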