* tomushkin@gmail.com (perlbug-followup@perl.org) [110904 21:52]:
> # New Ticket Created by tomushkin@gmail.com
> # Please include the string: [perl #98370]
> # in the subject line of all future correspondence about this issue.
> # <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=98370 >
>
> # perl -MEncode -e 'decode("IBM037", "\x{a2}\x{97}\x{81}\x{94}")'
> Unknown encoding 'IBM037' at -e line 1
With the attached simple script, you can find every encoding on the
official IANA list that Encode does not know. It prints the list with a
marker in front of each line: lines with a leading '*' are missing,
with '+'...
present from IANA: 125 charsets = 66 names, 59 aliases
missing from IANA: 704 charsets = 190 names, 514 aliases
So there is a real chance you will encounter character sets that Perl
does not understand, among them many IBM* sets.
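If you cannot be sure a charset is available, one option is to ask
find_encoding first instead of letting decode die. A minimal sketch:
try_decode is a hypothetical helper of mine, not part of Encode, and
whether 'IBM037' resolves depends on your Encode version.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use Encode qw/find_encoding/;

# Hypothetical helper (not part of Encode): decode $bytes when the
# charset is known, return undef instead of dying when it is not.
sub try_decode
{   my ($charset, $bytes) = @_;

    # find_encoding returns undef for a name Encode does not know.
    my $enc = find_encoding($charset);
    defined $enc or return undef;

    $enc->decode($bytes);
}

my $text = try_decode('IBM037', "\x{a2}\x{97}\x{81}\x{94}");
print defined $text
    ? "decoded: $text\n"
    : "IBM037 is not available here\n";
```

The caller can then choose between falling back to another name or
reporting the problem, rather than catching the exception.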
-----
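#!/usr/bin/perl
use warnings;
use strict;

use Encode qw/find_encoding/;

sub status($);

# Fetch the IANA character-set registry and read it line by line.
open GET, "wget http://www.iana.org/assignments/character-sets --output-document=- |"
    or die $!;

while(<GET>)
{   if( m/(Name|Alias)\:\s+(\S+)/ )
    {   # Prefix the line with the marker for this charset.
        my $status = status $2;
        s/^/$status/;
    }
    else
    {   # Pad other lines so the columns stay aligned.
        s/^/ /;
    }
    print;
}

# Returns '+' when Encode knows the charset, '*' when it does not,
# and ' ' for the placeholder "None".
sub status($)
{   my $charset = shift;
    $charset =~ m/^none$/i and return ' ';

    my $enc = find_encoding $charset;
    defined $enc ? '+' : '*';
}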
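
The totals quoted above ("present from IANA", "missing from IANA") can
be produced with a small tally over the same lines. This is only a
sketch of how I would count them: tally and its hash keys are
hypothetical names, the sample lines are made up for illustration, and
I assume the placeholder "None" is not counted.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use Encode qw/find_encoding/;

# Hypothetical helper: count present/missing names and aliases.
# $lines is an array ref of registry lines, $known a code ref which
# decides whether a charset name is available.
sub tally
{   my ($lines, $known) = @_;
    my %count = map { $_ => 0 }
        qw/present_name present_alias missing_name missing_alias/;

    foreach my $line (@$lines)
    {   $line =~ m/(Name|Alias)\:\s+(\S+)/ or next;
        my ($kind, $charset) = (lc $1, $2);

        # Assumption: the placeholder "None" is skipped, not counted.
        next if $charset =~ m/^none$/i;

        my $state = $known->($charset) ? 'present' : 'missing';
        $count{"${state}_$kind"}++;
    }

    \%count;
}

# Made-up sample lines; the real input would be the wget output.
my @sample = ("Name: UTF-8", "Alias: None", "Alias: no-such-charset-xyz");
my $count  = tally(\@sample, sub { defined find_encoding($_[0]) });
printf "%-14s %d\n", $_, $count->{$_} for sort keys %$count;
```

Passing the check as a code ref keeps the counting separate from
Encode, so the same routine can be tried against any other lookup.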