On Sun, Oct 14, 2012 at 11:50 PM, Father Chrysostomos via RT
<perlbug-comment@perl.org> wrote:
> use Encode::Encoding;
> package footf8 {
>     @ISA = Encode::Encoding;
>     __PACKAGE__->Define('foo-tf8');
>     sub encode($$;$) {
>         my ($self, $buf, $chk) = @_;
>         use Devel::Peek;
>         Dump $buf;
>         undef $_[1] if $chk;
>         utf8::encode $buf;
>         $buf
>     }
> }
> open $fh, ">encoding(foo-tf8)", \$s;
> print $fh "a"x1023 . chr 256;
> __END__
>
> That script dumps two malformed scalars, because the output is split in
> the middle of chr 256.
>
> Encode::CN::HZ actually expects this and uses some arcane Perl code
> (which looks straightforward, but you have to know internals to
> understand it) to work around it.
>
> Other pure-Perl encoding implementations included with Encode.pm don’t work:
>
> open $fh, ">encoding(utf-7)", \$s;
> print $fh "a"x1023 . chr 256;
> __END__
>
> That produces malformed UTF8 messages.
>
> PerlIO::encoding should be caching the partial characters instead of
> passing them to Perl code.

Yeah, this is the general design of the system. PerlIO doesn't do
characters, it does bytes. While you're right it could emulate character
semantics in Write(), it wouldn't be able to do the same in Read() in
variable-length encodings anyway, so the point is a bit moot.

Leon
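
For illustration only, here is a minimal sketch of what "caching the partial characters" could look like at the byte level. It is not how PerlIO::encoding is implemented. The helper name split_incomplete_tail is hypothetical, and the 1024-byte boundary is an assumption inferred from the 1023 + 1 layout of the test script above. The helper splits a UTF-8 byte string into the part that ends on a character boundary and a pending tail of at most three bytes that would be held back until the next write.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PerlIO or Encode): return (complete, pending),
# where pending is a trailing UTF-8 sequence that has not been finished yet.
sub split_incomplete_tail {
    my ($bytes) = @_;
    my $len = length $bytes;

    # A UTF-8 sequence is at most 4 bytes, so an unfinished one is at most 3.
    for my $back (1 .. 3) {
        last if $back > $len;
        my $byte = ord substr($bytes, $len - $back, 1);
        next if ($byte & 0xC0) == 0x80;    # continuation byte, keep looking back

        # Found a lead (or ASCII) byte: how many bytes does its sequence need?
        my $need = $byte >= 0xF0 ? 4
                 : $byte >= 0xE0 ? 3
                 : $byte >= 0xC0 ? 2
                 :                 1;
        if ($need > $back) {
            return (substr($bytes, 0, $len - $back),
                    substr($bytes, $len - $back));
        }
        last;                              # the last character is complete
    }
    return ($bytes, '');
}

# Same data as the test script: 1023 ASCII characters, then chr 256.
my $bytes = ("a" x 1023) . chr(256);
utf8::encode($bytes);                      # chr 256 becomes the bytes 0xC4 0x80

# Assumed buffer boundary at 1024 bytes, which cuts chr 256 in half.
my $chunk = substr($bytes, 0, 1024);
my ($complete, $pending) = split_incomplete_tail($chunk);
printf "complete=%d pending=%d\n", length($complete), length($pending);
```

This only covers the write side for data that is already UTF-8 internally. As noted above, the read side with a variable-length encoding cannot be handled the same way, because the layer cannot know where a character ends without decoding.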