
[perl #49830] Encode from_to() does not return on invalid conversion

From: Bram via RT
Date: May 29, 2009 08:56
Subject: [perl #49830] Encode from_to() does not return on invalid conversion
Message ID: rt-3.6.HEAD-13988-1243531722-136.49830-15-0@perl.org

(CC'ing the maintainer of Encode: Dan Kogai)

Dan,

Could you please take a look at this bug report and help us determine 
if this is intended behaviour, a bug in Encode or a bug in perl?

On Wed Jan 16 04:54:34 2008, perlbugs2008@j3e.de wrote:
> 
> -----------------------------------------------------------------
> [Please enter your report here]
> 
> if from_to() is called with the check parameter Encode::FB_QUIET it
>    should return on errors. With Perl 5.10 this no longer works in
>    this scenario:
> 
> #!/usr/bin/perl
> use Encode 'from_to';
> my $string = "\366"; # this is "o umlaut" in iso-8859-1, invalid utf-8
> if (from_to($string,utf8,utf8,Encode::FB_QUIET) == undef) {
>         print "from_to utf8..utf8 returns undef as is should!\n";
> } else {
>         print "from_to utf8..utf8 of non-UTF-8 strings returns NO error!\n";
>         print "foo: $string\n";
> }
> 
> In this case \366 is being converted to \357 \277 \275 by Perl 5.10.
> 
> With Perl <= 5.8.8 from_to returned undef which is more reasonable.
> 
> -----------------------------------------------------------------


perl-5.8.8 contains Encode v2.12
$ perl-5.8.8 rt-49830.pl
from_to utf8..utf8 returns undef as is should!

perl-5.8.9 contains Encode v2.26
$ perl-5.8.9 rt-49830.pl
from_to utf8..utf8 of non-UTF-8 strings returns NO error!


perl-5.9.2 contains Encode v2.09
$ perl-5.9.2 rt-49830.pl
from_to utf8..utf8 returns undef as is should!

perl-5.9.3 contains Encode v2.14
$ perl-5.9.3 rt-49830.pl
from_to utf8..utf8 of non-UTF-8 strings returns NO error!


A binary search:
----EOF ($?='0')----
Will binsearch the lower half
Running the prog '/tmp/rt-49830.pl' for
installed-perls/perl/peZnlj8/perl-5.9.2@26861/bin/perl and
installed-perls/perl/pKNe6tf/perl-5.9.2@26863/bin/perl
----Program----
#!/usr/bin/perl

use Encode 'from_to';
my $string = "\366"; # this is "o umlaut" in iso-8859-1, invalid utf-8
if (not defined from_to($string,utf8,utf8,Encode::FB_QUIET)) {
        print "from_to utf8..utf8 returns undef as is should!\n";
} else {
        print "from_to utf8..utf8 of non-UTF-8 strings returns NO error!\n";
#        print "foo: $string\n";
}

----Output of .../peZnlj8/perl-5.9.2@26861/bin/perl----
from_to utf8..utf8 returns undef as is should!

----EOF ($?='0')----
----Output of .../pKNe6tf/perl-5.9.2@26863/bin/perl----
from_to utf8..utf8 of non-UTF-8 strings returns NO error!

----EOF ($?='0')----

http://public.activestate.com/cgi-bin/perlbrowse/p/26863
Change 26863 by rgs@stencil on 2006/01/16 14:09:29

	Upgrade to Encode 2.14

perl-5.9.2@26861 contains Encode v2.12
perl-5.9.2@26863 contains Encode v2.14


Running it with the latest version of Encode (v2.33) on perl-5.8.8:
$ perl /tmp/rt-49830.pl
from_to utf8..utf8 of non-UTF-8 strings returns NO error!


So this looks like a change in behaviour (maybe intended, maybe not) in 
Encode and not in perl.
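
Until it is clear whether the new behaviour is intended, a caller who only
needs to know whether the input is well-formed UTF-8 could check that
explicitly instead of relying on the return value of from_to(). The following
is only a sketch of that idea: is_valid_utf8() is a hypothetical helper, not
part of Encode, and it uses the strict 'UTF-8' encoding rather than the lax
'utf8' used in the report. It relies on decode() croaking on malformed input
when Encode::FB_CROAK is passed as the check argument.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Encode): returns 1 if $bytes is
# well-formed UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC asks
    # decode() not to modify the data it was given.
    my $ok = eval {
        Encode::decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
        1;
    };
    return $ok ? 1 : 0;
}

my $string = "\366";    # "o umlaut" in iso-8859-1, invalid as UTF-8
print is_valid_utf8($string) ? "valid UTF-8\n" : "invalid UTF-8\n";
```

This does not say anything about what from_to() itself should return; it only
avoids depending on that return value for validation.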


Best regards,

Bram

