Re: [Problem] with LWP, unicode/multibyte chars and Perl 5.6.1 and later

From: Gisle Aas
Date: September 6, 2001 13:17
Subject: Re: [Problem] with LWP, unicode/multibyte chars and Perl 5.6.1 and later
Message ID: lrlmjs3whz.fsf@caliper.ActiveState.com

Nick Ing-Simmons <nick@ing-simmons.net> writes:

> Gisle Aas <gisle@ActiveState.com> writes:
> >Paul Kulchenko <paulclinger@yahoo.com> writes:
> >
> >> [Problem]
> >> 1. LWP::Protocol::http and others use length() to calculate
> >> content-length and in Perl 5.6.1 and later length() calculates chars
> >> instead of bytes. It means that every request that has multibyte
> >> chars in it will have the wrong Content-Length and the other side
> >> will read fewer bytes than required.
> >
> >In my view it is a bug to put content containing chars with ord() > 255
> >in the content of a HTTP::Request.  If you want UTF8 encoded stuff
> >you should put UTF8 encoded stuff in the content.  Don't expect perl
> >to magically guess.  You should use Encode::encode_utf8($str) or
> >something like it.
> >
> >If there was an easy way I would like to add a
> >
> >  sv_utf8_downgrade($req->content, 0);
> 
>    utf8::downgrade($req->content, 0);
> 
> for perl 5.7.* for large-enough *

I guess I could do something like

    utf8::downgrade($req->content, 0) if defined &utf8::downgrade;

then.  Might want to add this to the HTTP::Message->content() method
so it croaks as soon as you try to put wide characters in.
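
To make the idea concrete, here is a minimal sketch of that kind of early check. The helper name bytes_or_die is hypothetical and is not part of LWP or HTTP::Message; it only assumes that utf8::downgrade is available, and falls back to doing nothing when it is not.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LWP or HTTP::Message.
# Returns a byte-string copy of $content, or dies if $content
# contains any character with ord() > 255.
sub bytes_or_die {
    my ($content) = @_;                 # copy; utf8::downgrade works in place
    if (defined &utf8::downgrade) {     # guard for perls that lack it
        utf8::downgrade($content, 0);   # FAIL_OK = 0: die on wide characters
    }
    return $content;
}

my $ok  = bytes_or_die("caf\x{e9}");                # chr(233) fits in one byte
my $bad = eval { bytes_or_die("\x{12c}\x{190}") };  # chr(300), chr(400) do not

print "byte-sized content accepted, length ", length($ok), "\n";
print "wide content rejected: $@" unless defined $bad;
```

Working on a copy keeps the caller's string untouched; whether a real content() method should downgrade in place instead is a separate design question.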

> >
> >to the LWP::Protocol code.  This would make requests with such chars
> >in them fail early.  I think the write call on the socket ought to do
> >the downgrade and croaking for me though.
> 
> Again I think 5.7.* branch should do that - it was certainly the intent
> (it may only warn)

I just verified that syswrite does indeed croak in bleedperl.

This program:

--------------------------------------------------------------------
use strict;
require LWP::UserAgent;
require HTTP::Request;
my $ua = LWP::UserAgent->new;

my $req = HTTP::Request->new(POST => 'http://localhost/test.cgi');
$req->content_type("text/plain");
# v-string with code points 200, 300, 400; the last two have ord() > 255
$req->content(v200.300.400);

my $res = $ua->request($req);
print $res->as_string;
--------------------------------------------------------------------

prints:

500 (Internal Server Error) Wide character in syswrite
Client-Date: Thu, 06 Sep 2001 20:11:41 GMT
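
For comparison, a sketch of the explicit-encoding approach suggested earlier in the thread: encode the characters to UTF-8 octets yourself, so that length() counts exactly what goes on the wire. It assumes the Encode module is installed; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Same code points as v200.300.400 in the test program above.
my $chars = "\x{c8}\x{12c}\x{190}";

# Explicitly encode to UTF-8 octets before using the string as content.
my $bytes = encode_utf8($chars);

# length() counts characters for $chars but octets for $bytes,
# which is why a Content-Length computed from $chars would be too small.
printf "characters: %d\n", length($chars);
printf "octets:     %d\n", length($bytes);
```

Passing $bytes rather than $chars to $req->content() means the content holds no character above 255, so nothing is left for perl to guess about.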
