
[perl #115262] PerlIO::encoding produces malformed utf8

From: Father Chrysostomos via RT
Date: October 14, 2012 14:51
Subject: [perl #115262] PerlIO::encoding produces malformed utf8
Message ID: rt-3.6.HEAD-29115-1350251455-16.115262-14-0@perl.org

On Sun Oct 14 14:49:40 2012, sprout wrote:
> PerlIO::encoding passes invalid strings to encoding implementations.

A local mail server seemed to think this was spam and refused to send
the message until I had deleted the body.  Here it is in full:


use Encode::Encoding;
package footf8 {
  @ISA = Encode::Encoding;
  __PACKAGE__->Define('foo-tf8');
  sub encode($$;$) {
    my ($self, $buf, $chk) = @_;
    use Devel::Peek;
    Dump $buf;
    undef $_[1] if $chk;
    utf8::encode $buf;
    $buf
  }
}
open $fh, ">encoding(foo-tf8)", \$s;
print $fh "a"x1023 . chr 256;
__END__

That script dumps two malformed scalars, because the output is split in
the middle of chr 256.
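
To make the split concrete, here is a minimal sketch that needs neither PerlIO::encoding nor Devel::Peek. It assumes the data is cut after 1024 bytes; the 1023 in the script above suggests that size, but $limit here is only an assumption for illustration. It shows that neither half of the split string is well-formed UTF-8 on its own, while the unsplit string is.

```perl
use strict;
use warnings;

my $limit = 1024;                       # assumed buffer size, illustration only

my $bytes = ("a" x 1023) . chr(256);
utf8::encode($bytes);                   # now 1023 + 2 bytes: chr 256 is "\xC4\x80"

my $head = substr($bytes, 0, $limit);   # ends in the lead byte "\xC4"
my $tail = substr($bytes, $limit);      # the lone continuation byte "\x80"

# utf8::decode returns false (and leaves the string alone) on malformed input.
my $whole = $bytes;
my $whole_ok = utf8::decode($whole) ? 1 : 0;
my $h = $head;
my $head_ok = utf8::decode($h) ? 1 : 0;
my $t = $tail;
my $tail_ok = utf8::decode($t) ? 1 : 0;

printf "whole=%d head=%d tail=%d\n", $whole_ok, $head_ok, $tail_ok;
```

Only the unsplit string should decode. An encode method that is handed $head or $tail with the UTF8 flag set is looking at a malformed scalar.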

Encode::CN::HZ actually expects this and uses some arcane Perl code
(which looks straightforward, but you have to know internals to
understand it) to work around it.

Other pure-Perl encoding implementations included with Encode.pm don’t work:

open $fh, ">encoding(utf-7)", \$s;
print $fh "a"x1023 . chr 256;
__END__

That produces “Malformed UTF-8” warning messages.

PerlIO::encoding should be caching the partial characters instead of
passing them to Perl code.
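
As a rough illustration of what caching the partial characters could look like, here is a pure-Perl sketch. It is not PerlIO::encoding's code (that layer is written in C) and it is not a proposed patch. The helper name split_complete_utf8 and the 1024-byte chunking are hypothetical, and it only handles standard UTF-8 sequences of up to four bytes. The idea is to hold back a trailing incomplete sequence and prepend it to the next chunk, so that whatever reaches the encoder is always well-formed.

```perl
use strict;
use warnings;

# Hypothetical helper: split a UTF-8 byte buffer into (complete, pending),
# where pending is a trailing incomplete sequence, or '' if there is none.
sub split_complete_utf8 {
    my ($bytes) = @_;
    my $len  = length $bytes;
    my $i    = $len - 1;
    my $back = 0;
    # Walk back over at most three continuation bytes to find a lead byte.
    while ($i >= 0 && $back < 4) {
        my $o = ord substr($bytes, $i, 1);
        if (($o & 0xC0) == 0x80) { $i--; $back++; next }   # continuation byte
        my $need = $o >= 0xF0 ? 4 : $o >= 0xE0 ? 3 : $o >= 0xC0 ? 2 : 1;
        my $have = $len - $i;
        return $have < $need
            ? (substr($bytes, 0, $i), substr($bytes, $i))   # hold back the tail
            : ($bytes, '');
    }
    return ($bytes, '');   # empty or hopelessly malformed: pass through as-is
}

# Simulate two writes, split in the middle of chr 256 (assumed 1024-byte chunks).
my $data = ("a" x 1023) . chr(256);
utf8::encode($data);

my $pending = '';
my @passed;                # what the encoding implementation would get to see
for my $chunk (substr($data, 0, 1024), substr($data, 1024)) {
    my ($complete, $rest) = split_complete_utf8($pending . $chunk);
    $pending = $rest;
    push @passed, $complete if length $complete;
}

printf "%d chunks passed on, %d bytes still pending\n",
    scalar @passed, length $pending;
```

With this approach the first chunk passed on is the 1023 "a" bytes, and the two bytes of chr 256 arrive together in the second one.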

---
Flags:
    category=core
    severity=low
---
Site configuration information for perl 5.17.5:

Configured by sprout at Sat Sep 22 18:51:23 PDT 2012.

Summary of my perl5 (revision 5 version 17 subversion 5) configuration:
  Snapshot of: 451f421fe4742646fa2efbed0f45a19f0713d00f
  Platform:
    osname=darwin, osvers=10.5.0, archname=darwin-2level
    uname='darwin pint.local 10.5.0 darwin kernel version 10.5.0: fri
nov 5 23:20:39 pdt 2010; root:xnu-1504.9.17~1release_i386 i386 '
    config_args='-de -Dusedevel -DDEBUGGING'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -DDEBUGGING
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-O3 -g',
    cppflags='-fno-common -DPERL_DARWIN -DDEBUGGING -fno-strict-aliasing
-pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.2.1 (Apple Inc. build 5664)',
gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags ='
-fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib
    libs=-ldbm -ldl -lm -lutil -lc
    perllibs=-ldl -lm -lutil -lc
    libc=, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup
-L/usr/local/lib -fstack-protector'

Locally applied patches:
    

---
@INC for perl 5.17.5:
    /usr/local/lib/perl5/site_perl/5.17.5/darwin-2level
    /usr/local/lib/perl5/site_perl/5.17.5
    /usr/local/lib/perl5/5.17.5/darwin-2level
    /usr/local/lib/perl5/5.17.5
    /usr/local/lib/perl5/site_perl
    .

---
Environment for perl 5.17.5:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/sprout
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/local/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash


-- 

Father Chrysostomos

