Re: [perl #24541] substr and utf8 and use bytes

Matt Sergeant
November 30, 2003 12:03
On 23 Nov 2003, at 8:10, Gisle Aas wrote:

> William R Ward (via RT) <> writes:
>> We have a need to take a string containing utf8-encoded multibyte
>> characters, and then, treating the string as bytes, extract a
>> substring of N characters from it.
>> This is what "use bytes" was meant for, and it works great on Perl
>> 5.6.1.
> "use bytes" is evil.  It exposes internal implementation details that
> you are not supposed to know about and I'm not surprised the results
> differ between versions of perl.
> Just use Encode to clearly state your intents in a way that will work
> whatever internal representation of wide chars Perl might have.
> Something like this:
>   substr(encode_utf8($string), $m, $n);
> will do what you describe above.

Encode isn't available for 5.6.1 though.
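
For readers on Perl 5.8 or later, where Encode ships with the core, the suggestion quoted above can be sketched roughly as follows. The sample string and the values of $m and $n are assumptions chosen for illustration; they are not from the original bug report.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical sample: U+00E9 takes two bytes in UTF-8, so this
# 4-character string encodes to 5 bytes.
my $string = "caf\x{e9}";

# encode_utf8 returns a byte string, so substr and length on the
# result count bytes without relying on "use bytes".
my $bytes = encode_utf8($string);

# Illustrative offset and length, both measured in bytes.
my ($m, $n) = (0, 4);
my $chunk = substr($bytes, $m, $n);

printf "chars=%d bytes=%d chunk_bytes=%d\n",
    length($string), length($bytes), length($chunk);
```

Note that a byte-level slice like this can end in the middle of a multibyte character (here it keeps only the first byte of the two-byte sequence), so the result is not guaranteed to be valid UTF-8.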

Matt.