
Encode vs. JSON

From: David E. Wheeler
Date: July 16, 2014 22:03
Subject: Encode vs. JSON
Message ID: 16EFBF9C-D8DC-4D4A-B78A-46C153CFA49A@justatheory.com

Porters,

I have a script:

    use v5.10;
    use warnings;
    use JSON;
    use Encode qw(encode_utf8 decode_utf8);

    my $json = qq{{"FFONTS":"HOLIDAYBOLDI\xEF\xBF\xBFALIC"}};
    my $parser = JSON->new->utf8;

    my $data = $parser->decode($json);
    say encode_utf8 $data->{FFONTS};

On Perl 5.12 and earlier, this dies:

    malformed UTF-8 character in JSON string, at character offset 23 (before "\x{ffff}ALIC"}")

It does not die on 5.14, which I assume is due to the addition of Unicode 6 support. But oddly, while JSON complains on 5.12 and earlier, Encode does not:

    use v5.10;
    use warnings;
    use JSON;
    use Encode qw(encode_utf8 decode_utf8);

    my $json = qq{{"FFONTS":"HOLIDAYBOLDI\xEF\xBF\xBFALIC"}};
    $json = decode_utf8 $json, Encode::FB_CROAK;

    my $parser = JSON->new;

    my $data = $parser->decode($json);
    say encode_utf8 $data->{FFONTS};

This also dies with the same error, raised by JSON.pm, but note that the call to decode_utf8() with Encode::FB_CROAK succeeded. I’m left wondering why JSON and Encode seem to disagree on the validity of those bytes as UTF-8 in Perl 5.12. Ideas?
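
For reference, here is a minimal sketch (not part of the scripts above) that decodes the three bytes by hand, to confirm which code point is in dispute. The helper names decode_three_byte_utf8 and is_noncharacter are made up for illustration. It does not say which module is right; it only shows that the bytes encode U+FFFF, matching the "\x{ffff}" in the error message, and that U+FFFF is one of the Unicode noncharacters.

```perl
use v5.10;
use strict;
use warnings;

# Hypothetical helper: decode one three-byte UTF-8 sequence by hand.
# Layout: 1110xxxx 10xxxxxx 10xxxxxx
sub decode_three_byte_utf8 {
    my @b = unpack 'C*', shift;
    die "expected exactly three bytes\n" unless @b == 3;
    die "not a three-byte lead byte\n"   unless ($b[0] & 0xF0) == 0xE0;
    for my $cont (@b[1, 2]) {
        die "bad continuation byte\n" unless ($cont & 0xC0) == 0x80;
    }
    # Keep the payload bits of each byte and stitch them together.
    return (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
}

# Hypothetical helper: true for U+FDD0..U+FDEF and for any code point
# whose low 16 bits are FFFE or FFFF.
sub is_noncharacter {
    my $cp = shift;
    return 1 if $cp >= 0xFDD0 && $cp <= 0xFDEF;
    return ($cp & 0xFFFE) == 0xFFFE ? 1 : 0;
}

my $cp = decode_three_byte_utf8("\xEF\xBF\xBF");
printf "U+%04X noncharacter=%d\n", $cp, is_noncharacter($cp);
```

So the sequence is structurally well-formed UTF-8, and the disagreement appears to be about whether a noncharacter should be accepted, not about the byte layout.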

Thanks,

David

