develooper Front page | perl.perl5.porters | Postings from February 2022

Re: RFC: Rename the “UTF8” flag

Arne Johannessen
February 1, 2022 15:10
Felipe Gasper wrote:
> Renaming this flag will achieve several benefits:
> 1. The mistaken belief that Perl uses UTF-8 internally will recede.

According to the Perl documentation (e.g. perlunifaq), Perl does use UTF-8 internally. If the documentation is wrong, let's fix that first.

I'm ignoring the difference between Perl's "lax" utf8 and "strict" UTF-8 here, as it's hardly relevant in practice. It is also well explained in the docs.

> 3. The term “UTF-8 string” will be less sensible to use in reference
> to Perl’s internals since Perl will provide an official replacement
> (“heavy string”). This will help to prevent confusion
> when discussing encoding and related matters; having the terms
> “heavy UTF-8 string”, “heavy Unicode string”, “non-heavy UTF-8 string”,
> and “non-heavy Unicode string” will clarify matters where
> the current ambiguity of “UTF-8 string” impedes communication.

I agree that the ambiguity of the term "UTF-8 string" is problematic.

However, if this proposal were to be implemented, I'd expect the term "UTF-8 string" to still be commonly used out of habit, and it would *still* be ambiguous. If people wished to express themselves clearly today, they could do so. But they usually don't. Switching one name for another doesn't address this problem.

> The following renames are proposed; in each case the old name
> should remain as an alias for the new (with appropriate indications
> in documentation):
> - `SVf_UTF8`        -> `SVf_HEAVY`
> [...]

XS code that wants to remain backwards-compatible with Perl v5.36 and earlier would have to keep using the old names anyway though, right?
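
To illustrate the compatibility concern, here is a minimal sketch of the kind of fallback an XS module might carry if it adopted the new name. It assumes the proposed `SVf_HEAVY` were accepted as written; that macro does not exist in any released Perl. The stand-in value for `SVf_UTF8` and the helper `flags_are_heavy` are hypothetical and only there so the fragment compiles outside a Perl build tree. Real XS code would take `SVf_UTF8` from `perl.h`.

```c
#include <stdint.h>

/* Stand-in so this sketch compiles on its own. In real XS code this
   definition comes from perl.h and must not be redefined. */
#ifndef SVf_UTF8
#  define SVf_UTF8 0x20000000
#endif

/* Hypothetical: SVf_HEAVY is only the name proposed in this thread.
   On a Perl that lacks it, fall back to the old name. */
#ifndef SVf_HEAVY
#  define SVf_HEAVY SVf_UTF8
#endif

/* Hypothetical helper: report whether the flag bit is set in a raw
   flags word, using the new spelling. */
static int flags_are_heavy(uint32_t flags)
{
    return (flags & SVf_HEAVY) != 0;
}
```

Either way, the old name still has to appear somewhere in the module's source for as long as older Perls are supported.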

Overall, I agree with Dave Mitchell that this proposal seems to just increase the confusion, rather than reduce it.

Arne Johannessen
