develooper Front page | perl.perl5.porters | Postings from November 2003

Re: [perl #24541] substr and utf8 and use bytes

Gisle Aas
November 23, 2003 00:11
William R Ward (via RT) writes:

> We have a need to take a string containing utf8-encoded multibyte
> characters, and then, treating the string as bytes, extract a
> substring of N characters from it.
> This is what "use bytes" was meant for, and it works great on Perl
> 5.6.1. 

"use bytes" is evil.  It exposes internal implementation details that
you are not supposed to know about and I'm not surprised the results
differ between versions of perl.

Just use the Encode module to state your intent explicitly, in a way
that will work regardless of how Perl represents wide chars internally.
Something like this:

  use Encode qw(encode_utf8);
  substr(encode_utf8($string), $m, $n);

will do what you describe above.
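
For illustration, here is a minimal, self-contained sketch of that approach. The helper name utf8_byte_substr and the sample string are made up for this example and do not come from the original report; only encode_utf8 and decode_utf8 are real Encode functions.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical helper (not from the original report): return $n bytes,
# starting at byte offset $m, of the UTF-8 encoding of $string.
sub utf8_byte_substr {
    my ($string, $m, $n) = @_;
    my $octets = encode_utf8($string);   # explicit octet string
    return substr($octets, $m, $n);      # offsets now count bytes
}

# Sample data (an assumption): "naive cafe" with i-diaeresis and e-acute,
# 10 characters, 12 bytes once encoded as UTF-8.
my $string = "na\x{EF}ve caf\x{E9}";

my $bytes = utf8_byte_substr($string, 0, 4);

# Show the extracted bytes in hex.
print join(' ', map { sprintf '%02X', ord } split //, $bytes), "\n";
# prints "6E 61 C3 AF"
```

Because the offsets count bytes, a slice can end in the middle of a multibyte character (for example, $n = 3 above would stop after the C3 byte), so the result is not guaranteed to be valid UTF-8 on its own.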

