Gisle,

On Tuesday, Oct 7, 2003, at 22:28 Asia/Tokyo, Gisle Aas wrote:
> I had a bug report on the MIME::Base64 module because it kind of
> depends on the strings passed to its encode() to be NUL-terminated.
> This is not always the case for the strings produced by the Encode
> module. This program demonstrates:
>
>   #!perl -w
>
>   use Encode qw(encode find_encoding);
>   use Devel::Peek qw(Dump);
>
>   Dump(encode("UTF-16BE", "abc"));
>   Dump(find_encoding("UTF-16BE")->encode("abc"));
>
> With perl-5.8.1 this prints:
>
>   SV = PV(0x819f878) at 0x811f434
>     REFCNT = 1
>     FLAGS = (TEMP,POK,pPOK)
>     PV = 0x8189060 "\0a\0b\0c"\0
>     CUR = 6
>     LEN = 7
>   SV = PV(0x819f878) at 0x811f458
>     REFCNT = 1
>     FLAGS = (TEMP,POK,pPOK)
>     PV = 0x8194fb0 "\0a\0b\0c"
>     CUR = 6
>     LEN = 6
>
> Note that the first form does the right thing while the second does
> not.

In this particular case I am not sure which side is to blame, because
a perl scalar in general does allow the second form (that's what
SvCUR() is for, IMHO).

The reason why encode() ends up with the trailing NUL is that perl
internally appends "\0" whenever it copies a string.

sub encode($$;$)
{
    my ($name, $string, $check) = @_;
    return undef unless defined $string;
    $check ||=0;
    my $enc = find_encoding($name);
    unless(defined $enc){
        require Carp;
        Carp::croak("Unknown encoding '$name'");
    }
    my $octets = $enc->encode($string,$check); # HERE!
    # return undef if ($check && length($string));
    return $octets;
}

Though it is easy to add an extra "\0" for UTF-16 (done by the XS of
Encode::Unicode), it is equally easy to fix MIME::Base64. So while I
promise to fix this "bug" in Encode::Unicode, I want to fix and tidy
other stuff before $Encode::VERSION++. So if you are impatient, I
would like you to have your MIME::Base64 take care of this. After
all, null-termination itself is moot w/ UTF-(16|32)(BE|LE)?.

> Regards,

Ditto.

Dan the Encode Maintainer