[perl #111446] pod2html generates illegal UTF-8
From: tchrist1
Date: February 29, 2012 12:57
Subject: [perl #111446] pod2html generates illegal UTF-8
Message ID: rt-3.6.HEAD-4610-1330549022-831.111446-75-0@perl.org
# New Ticket Created by tchrist1
# Please include the string: [perl #111446]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=111446 >
pod2html generates illegal UTF-8. It creates HTML pages that
claim to be UTF-8:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
but then writes out strings afflicted with the Unicode bug. Code points 128-255
come out as single illegal bytes, unless the same string also contains a larger
code point.
The right fix is to binmode the output handle to :utf8.
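To make the symptom and the proposed fix concrete, here is a minimal sketch. It is not pod2html's actual code; the helper name bytes_written and the in-memory handles are assumptions made purely for illustration. It writes the same string once through a handle with no encoding layer and once after binmode to :utf8, then dumps the bytes produced.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of pod2html): write $text to an in-memory
# file and return the raw bytes that ended up in it. The handle is opened
# with an explicit :raw layer so PERL_UNICODE cannot change the baseline.
sub bytes_written {
    my ($text, $layer) = @_;
    my $buffer = '';
    open(my $fh, '>:raw', \$buffer) or die "cannot open in-memory handle: $!";
    binmode($fh, $layer) if defined $layer;   # the fix proposed above
    print {$fh} $text;
    close($fh) or die "cannot close in-memory handle: $!";
    return $buffer;
}

my $text = "caf\x{E9}";   # U+00E9 lies in the 128-255 range

my $raw  = bytes_written($text);            # no encoding layer
my $utf8 = bytes_written($text, ':utf8');   # encoded as UTF-8 on output

# Show the bytes in hex.
printf "no layer: %s\n", join ' ', map { sprintf '%02X', ord } split //, $raw;
printf ":utf8:    %s\n", join ' ', map { sprintf '%02X', ord } split //, $utf8;
```

Without the layer, U+00E9 comes out as the lone byte E9, which is not valid UTF-8, and no warning is issued. With :utf8 it comes out as the two-byte sequence C3 A9.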
Here's a list of pages to test. Note that you won't get a "Wide character"
warning if a page contains only code points 128-255; you'll simply get illegal
output.
perlebcdic.pod
perlgit.pod
perlhist.pod
perlpodspec.pod
perlthrtut.pod
perl588delta.pod
perl5100delta.pod
perl5120delta.pod
perl5121delta.pod
perl5122delta.pod
perl5123delta.pod
perl5124delta.pod
perl5140delta.pod
perl5141delta.pod
perl5142delta.pod
perl5150delta.pod
perl5151delta.pod
perl5152delta.pod
perl5153delta.pod
perl5154delta.pod
perl5156delta.pod
perl5157delta.pod
perl5158delta.pod
perlcn.pod
perljp.pod
perlko.pod
perltw.pod
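Since there is no warning to rely on, one way to test the generated pages is to check the output bytes directly. The following is a sketch only: is_valid_utf8 is a hypothetical helper, and it assumes the HTML has already been read in as raw bytes. It uses the core Encode module with FB_CROAK so that a malformed sequence raises an exception instead of being silently replaced.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: return 1 if $bytes is well-formed UTF-8, 0 otherwise.
# FB_CROAK makes decode() die on the first malformed sequence.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $ok = eval {
        Encode::decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
        1;
    };
    return $ok ? 1 : 0;
}

# C3 A9 is the UTF-8 encoding of U+00E9; a bare E9 byte is malformed.
print is_valid_utf8("caf\xC3\xA9") ? "valid\n" : "invalid\n";
print is_valid_utf8("caf\xE9")     ? "valid\n" : "invalid\n";
```

The first call reports valid and the second invalid, matching the kind of stray byte described above.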
Notice also that you get differently wrong answers running with PERL_UNICODE
set to 0 vs to SD. The program should not be sensitive to whether
that variable is set, because it knows the encodings of its input and
output, and should set things accordingly.
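As an illustration of what "set things accordingly" could look like, here is a hedged sketch in which both handles name their encoding in the open mode, so the default layers that PERL_UNICODE would otherwise supply are never consulted. The function copy_utf8 and the in-memory handles are assumptions for the example, not anything taken from pod2html.

```perl
use strict;
use warnings;

# Hypothetical helper: read UTF-8 bytes, count the decoded characters, and
# write them back out as UTF-8. Both layers are explicit in the open mode.
sub copy_utf8 {
    my ($input_bytes) = @_;
    my $output_bytes = '';
    my $chars        = 0;

    open(my $in,  '<:encoding(UTF-8)', \$input_bytes)
        or die "cannot open input: $!";
    open(my $out, '>:encoding(UTF-8)', \$output_bytes)
        or die "cannot open output: $!";

    while (my $line = <$in>) {
        $chars += length $line;   # counts characters, not bytes
        print {$out} $line;
    }

    close($in)  or die "cannot close input: $!";
    close($out) or die "cannot close output: $!";
    return ($output_bytes, $chars);
}

my $source = "caf\xC3\xA9\n";                 # six bytes of UTF-8
my ($copied, $chars) = copy_utf8($source);

printf "bytes in: %d, characters: %d, round trip intact: %s\n",
    length($source), $chars, ($copied eq $source ? 'yes' : 'no');
```

Because the layers are spelled out on each open, the result should be the same whether PERL_UNICODE is 0, SD, or unset.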
--tom
Summary of my perl5 (revision 5 version 14 subversion 0) configuration:
Platform:
osname=openbsd, osvers=4.4, archname=OpenBSD.i386-openbsd
uname='openbsd chthon 4.4 generic#0 i386 '
config_args='-des'
hint=recommended, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=undef, use64bitall=undef, uselongdouble=undef
usemymalloc=y, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
optimize='-O2',
cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
ccversion='', gccversion='3.3.5 (propolice)', gccosandvers='openbsd4.4'
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='cc', ldflags ='-Wl,-E -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib
libs=-lgdbm -lm -lutil -lc
perllibs=-lm -lutil -lc
libc=/usr/lib/libc.so.48.0, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
cccdlflags='-DPIC -fPIC ', lddlflags='-shared -fPIC -L/usr/local/lib -fstack-protector'
Characteristics of this binary (from libperl):
Compile-time options: MYMALLOC PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP
PERL_PRESERVE_IVUV USE_LARGE_FILES USE_PERLIO
USE_PERL_ATOF
Built under openbsd
Compiled at Jun 11 2011 11:48:28
%ENV:
PERL_UNICODE="SA"
@INC:
/usr/local/lib/perl5/site_perl/5.14.0/OpenBSD.i386-openbsd
/usr/local/lib/perl5/site_perl/5.14.0
/usr/local/lib/perl5/5.14.0/OpenBSD.i386-openbsd
/usr/local/lib/perl5/5.14.0
/usr/local/lib/perl5/site_perl/5.12.3
/usr/local/lib/perl5/site_perl/5.11.3
/usr/local/lib/perl5/site_perl/5.10.1
/usr/local/lib/perl5/site_perl/5.10.0
/usr/local/lib/perl5/site_perl/5.8.7
/usr/local/lib/perl5/site_perl/5.8.0
/usr/local/lib/perl5/site_perl/5.6.0
/usr/local/lib/perl5/site_perl/5.005
/usr/local/lib/perl5/site_perl
.