develooper Front page | perl.perl5.porters | Postings from March 2021

Re: Perl 7: Fix string leaks?

From: Dan Book
Date: March 30, 2021 07:20
Subject: Re: Perl 7: Fix string leaks?
Message ID: CABMkAVWQN8fA266knBHoexKObmSFpzWD3FgZQAwDY1OU27fFbw@mail.gmail.com

On Tue, Mar 30, 2021 at 3:07 AM Yuki Kimoto <kimoto.yuki@gmail.com> wrote:

> I still don't understand this problem.
>
> Is 128-255(latin-1 and UTF-8 shared) range problem?
>

More specifically, the problem is that the values 128-255 are used as bytes
by encodings such as Latin-1 and UTF-8, and are also valid Unicode
codepoints (which are mostly the same characters that those bytes encode in
Latin-1). Combined with Perl's string design, this means the program cannot
know whether a given string is intended to hold decoded text or encoded bytes.
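
A minimal sketch of that ambiguity, assuming only the core Encode module
(the variable names are illustrative). The encoded bytes of one character
and a two-character text string in the 128-255 range compare equal, so
encoding the "wrong" one silently double-encodes it:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Two bytes: the UTF-8 encoding of U+00E9 (e with acute accent).
my $bytes = "\xC3\xA9";

# One character: U+00E9, obtained by decoding those bytes.
my $text = decode('UTF-8', $bytes);

# Two characters, U+00C3 and U+00A9, meant as text.
# Perl cannot tell this apart from $bytes.
my $lookalike = "\x{C3}\x{A9}";

printf "bytes length: %d\n", length $bytes;    # 2
printf "text length:  %d\n", length $text;     # 1
printf "same string:  %s\n", ($bytes eq $lookalike ? "yes" : "no");   # yes

# Encoding a string that already held bytes encodes it a second time.
my $double = encode('UTF-8', $bytes);
printf "double-encoded length: %d\n", length $double;   # 4
```

Nothing in the string itself records which of the two meanings was intended;
only the surrounding code knows.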


>
> Ideal world is what? There is only Unicode in Perl internal?
>

Possibly, but the upgraded string format is slower for text operations, so
for programs that need to be fast and only work with ASCII, the downgraded
string format is still important.
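
As an illustration of the two formats, here is a small sketch using the
built-in utf8::upgrade and utf8::is_utf8 functions (the variable names are
illustrative). The same logical string can be stored either way, and the
internal flag says nothing about whether it holds bytes or text:

```perl
use strict;
use warnings;

my $downgraded = "caf\xE9";    # 4 characters, stored one byte per character
my $upgraded   = $downgraded;
utf8::upgrade($upgraded);      # same characters, stored internally as UTF-8

print "equal:   ", ($downgraded eq $upgraded ? "yes" : "no"), "\n";   # yes
print "lengths: ", length($downgraded), " ", length($upgraded), "\n"; # 4 4
print "is_utf8: ", (utf8::is_utf8($downgraded) ? 1 : 0), " ",
                   (utf8::is_utf8($upgraded)   ? 1 : 0), "\n";        # 0 1
```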

Really, the ideal world is one where Perl has distinctly marked byte strings
and character strings, so that the user knows which kind they have and the
program can die if the user passes the wrong kind. But it is difficult to get
there; Python 3 was meant to solve a similar problem.
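
A rough sketch of what "marked" strings could look like, written as ordinary
wrapper classes. My::Bytes, My::Text, and write_raw are hypothetical names
made up for this illustration; nothing like them exists in core Perl, and
this is not a proposal for how it would actually be implemented:

```perl
use strict;
use warnings;

# Hypothetical marker class for encoded bytes.
package My::Bytes;
sub new {
    my ($class, $octets) = @_;
    # A byte string cannot contain anything above 255.
    die "byte string contains a codepoint above 255\n"
        if $octets =~ /[^\x00-\xFF]/;
    return bless \$octets, $class;
}

# Hypothetical marker class for decoded text.
package My::Text;
sub new {
    my ($class, $chars) = @_;
    return bless \$chars, $class;
}

package main;
use Scalar::Util qw(blessed);

# A function that only accepts bytes and dies on anything else.
sub write_raw {
    my ($value) = @_;
    die "expected My::Bytes\n"
        unless blessed($value) && $value->isa('My::Bytes');
    return length $$value;    # number of bytes that would be written
}

print write_raw(My::Bytes->new("\xC3\xA9")), "\n";    # 2

# Passing text where bytes are required is caught immediately.
my $ok = eval { write_raw(My::Text->new("\x{E9}")); 1 };
print "text rejected: ", ($ok ? "no" : "yes"), "\n";  # yes
```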

-Dan
