
Re: “strict” strings?

From: Dave Mitchell
Date: January 8, 2020 11:03
Subject: Re: “strict” strings?
Message ID: 20200108110300.GN9181@iabyn.com

On Wed, Jan 08, 2020 at 11:00:52AM +0100, André Warnier (tomcat/perl) wrote:
> Is there any way in which a perl program, running as a stand-alone process
> on a Linux platform, calling some builtin or external function which is
> obviously meant to return a "text value", can /insure/ that this text value
> would come back utf8-encoded, with the utf8 flag set ?
> ("utf8-encoded" in this case meaning that a "è" would always be represented
> by 2 bytes in the text variable; and "insure" in this case meaning that I
> would not have to run a check each time I call this function (or another) in
> order to verify that it does not return a text value that does NOT have the
> utf8 flag set, but where my "è" IS represented by 2 bytes)
> After years of using perl5, I am still not clear about this..

Strings within perl are just sequences of codepoints, and how they are
stored internally is an implementation detail (e.g. the SVf_UTF8 flag).
What you are supposed to do is tell perl the encoding of any data that
is input or output, so that perl can correctly convert to and from its
own storage format. So we provide things like 'perl -CSA' and
open($fh, '< :encoding(UTF-8)',...). And when you don't know the encoding
in advance, we provide functions to encode and decode data manually, like
utf8::decode($s) and the more general stuff in Encode.pm.
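
As a rough illustration of the "decode on input, encode on output" idea
above, here is a minimal sketch. The variable names and the sample bytes
are made up for the example, and the in-memory filehandle merely stands in
for a real file; it is not meant as the one right way to do this.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from outside perl: "è" in UTF-8 is 0xC3 0xA8.
my $bytes = "\xC3\xA8";

# Decode at the input boundary: bytes -> codepoints. FB_CROAK dies on
# malformed input; LEAVE_SRC stops Encode from modifying $bytes in place.
my $text = decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
printf "characters: %d, codepoint: U+%04X\n", length($text), ord($text);

# Encode at the output boundary: codepoints -> bytes.
my $out = encode('UTF-8', $text);
printf "bytes: %d\n", length($out);

# utf8::decode() works in place and returns false if the scalar does not
# hold well-formed UTF-8.
my $s  = "\xC3\xA8";
my $ok = utf8::decode($s);

# An :encoding layer does the decoding while reading. An in-memory
# filehandle is used here only to keep the sketch self-contained.
open(my $fh, '<:encoding(UTF-8)', \$bytes) or die "open failed: $!";
my $line = <$fh>;
close($fh);
printf "read %d character(s) through the layer\n", length($line);
```

In each case the program only ever sees one character, U+00E8, and never
needs to ask whether the UTF8 flag happens to be set on the scalar.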

Is that what you meant?

-- 
Modern art:
    "That's easy, I could have done that!"
    "Ah, but you didn't!"
