(CC'ing the maintainer of Encode: Dan Kogai)
Dan,
Could you please take a look at this bug report and help us determine
if this is intended behaviour, a bug in Encode or a bug in perl?
On Wed Jan 16 04:54:34 2008, perlbugs2008@j3e.de wrote:
>
> -----------------------------------------------------------------
> [Please enter your report here]
>
> if from_to() is called with the check parameter Encode::FB_QUIET it
> should return on errors. With Perl 5.10 this no longer works in
> this scenario:
>
> #!/usr/bin/perl
> use Encode 'from_to';
> my $string = "\366"; # this is "o umlaut" in iso-8859-1, invalid utf-8
> if (from_to($string,utf8,utf8,Encode::FB_QUIET) == undef) {
> print "from_to utf8..utf8 returns undef as is should!\n";
> } else {
> print "from_to utf8..utf8 of non-UTF-8 strings returns NO error!\n";
> print "foo: $string\n";
> }
>
> In this case \366 is being converted to \357 \277 \275 by Perl 5.10.
>
> With Perl <= 5.8.8 from_to returned undef which is more reasonable.
>
> -----------------------------------------------------------------
perl-5.8.8 contains Encode v2.12
$ perl-5.8.8 rt-49830.pl
from_to utf8..utf8 returns undef as is should!
perl-5.8.9 contains Encode v2.26
$ perl-5.8.9 rt-49830.pl
from_to utf8..utf8 of non-UTF-8 strings returns NO error!
perl-5.9.2 contains Encode v2.09
$ perl-5.9.2 rt-49830.pl
from_to utf8..utf8 returns undef as is should!
perl-5.9.3 contains Encode v2.14
$ perl-5.9.3 rt-49830.pl
from_to utf8..utf8 of non-UTF-8 strings returns NO error!
A binary search:
----EOF ($?='0')----
Will binsearch the lower half
Running the prog '/tmp/rt-49830.pl' for
installed-perls/perl/peZnlj8/perl-5.9.2@26861/bin/perl and
installed-perls/perl/pKNe6tf/perl-5.9.2@26863/bin/perl
----Program----
#!/usr/bin/perl
use Encode 'from_to';
my $string = "\366"; # this is "o umlaut" in iso-8859-1, invalid utf-8
if (not defined from_to($string,utf8,utf8,Encode::FB_QUIET)) {
print "from_to utf8..utf8 returns undef as is should!\n";
} else {
print "from_to utf8..utf8 of non-UTF-8 strings returns NO error!\n";
# print "foo: $string\n";
}
----Output of .../peZnlj8/perl-5.9.2@26861/bin/perl----
from_to utf8..utf8 returns undef as is should!
----EOF ($?='0')----
----Output of .../pKNe6tf/perl-5.9.2@26863/bin/perl----
from_to utf8..utf8 of non-UTF-8 strings returns NO error!
----EOF ($?='0')----
http://public.activestate.com/cgi-bin/perlbrowse/p/26863
Change 26863 by rgs@stencil on 2006/01/16 14:09:29
Upgrade to Encode 2.14
perl-5.9.2@26861 contains Encode v2.12
perl-5.9.2@26863 contains Encode v2.14
Running it with the latest version of Encode (v2.33) on perl-5.8.8:
$ perl /tmp/rt-49830.pl
from_to utf8..utf8 of non-UTF-8 strings returns NO error!
So this looks like a change in behaviour (maybe intended, maybe not) in
Encode and not in perl.
Best regards,
Bram