perl.perl5.porters | Postings from February 2022

Re: RFC: Rename the “UTF8” flag

From: Dan Book
Date: February 1, 2022 18:59
Subject: Re: RFC: Rename the “UTF8” flag
Message ID: CABMkAVWWShbzU94t7fhxP+SzqD4=n4JEM0TRX6+xXxaRh0BjeA@mail.gmail.com

On Tue, Feb 1, 2022 at 1:20 PM Joseph Brenner <doomvox@gmail.com> wrote:

>         Felipe Gasper <felipe@felipegasper.com> wrote:
>
> > Then there’s utf8::is_utf8(), which, for pure-Perl code, usually means
> > the *opposite* of what it looks like it means. THIS. IS. MADNESS. No one
> > groks it all without investing *significant* effort.
>
> I can see how a global rename throughout the internals could be a lot
> of work. I was going to make the point that in my experience the place
> where the confusion hits client programmers is "is_utf8".  It gets
> used wrong a lot, to the point where when I see it used I'm not sure
> what I should think-- is this a code smell, or is this one of the few
> who really gets what it means (and I'm still not sure I do).
>
> Modest proposal:  add an alias for is_utf8 under another name, e.g.
> "is_heavy"  (I think I'd prefer "is_modern" but that's not without
> issues.)  Then encourage the use of the new form, and possibly
> deprecate the old one.
>

This is a bit off topic but specifically on utf8::is_utf8:

I would prefer is_upgraded, since "upgraded" is the only terminology that
has been used consistently externally, apart from the misleading name of
the utf8 bit. I don't quite get the objection to upgraded/downgraded as
terms, and "heavy" doesn't seem distinct enough as terminology.
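
As a rough illustration of what such an alias could look like from user
code, here is a minimal sketch. The package name My::StringUtil and the
function is_upgraded are hypothetical; core Perl does not provide
is_upgraded, and this wrapper only forwards to the existing
utf8::is_utf8.

```perl
use strict;
use warnings;

# Hypothetical helper package; nothing in the utf8:: namespace is changed.
package My::StringUtil;

# Assumed name: reports whether the string's internal storage is upgraded.
sub is_upgraded { return utf8::is_utf8($_[0]) ? 1 : 0 }

package main;

my $s = "caf\xe9";    # all code points <= 0xFF, so the literal starts downgraded
print My::StringUtil::is_upgraded($s) ? "upgraded\n" : "downgraded\n";

utf8::upgrade($s);    # same characters, different internal storage
print My::StringUtil::is_upgraded($s) ? "upgraded\n" : "downgraded\n";
```

The point of the name is that it describes the storage format and makes
no claim about whether the contents are "UTF-8 text".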

Note that upgraded strings are not "the new form" - both forms of strings
are still used in modern code. Operations on downgraded strings are more
efficient where a string can be downgraded, and byte strings should always
be downgraded (but code should function correctly when they are upgraded
as well).
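
A small sketch of the distinction, using only the core utf8::upgrade,
utf8::downgrade and utf8::is_utf8 functions. The variable names and
sample strings are made up for illustration.

```perl
use strict;
use warnings;

my $down = "caf\xe9";    # 4 characters, every code point <= 0xFF
my $up   = $down;
utf8::upgrade($up);      # changes the internal storage only

# To Perl code the two are the same string.
print "equal\n"       if $up eq $down;
print "same length\n" if length($up) == length($down);

# Only the internal flag differs.
printf "down flag: %d, up flag: %d\n",
    (utf8::is_utf8($down) ? 1 : 0), (utf8::is_utf8($up) ? 1 : 0);

# A byte string that was upgraded along the way can be downgraded again.
my $bytes = "\x89PNG";
utf8::upgrade($bytes);
utf8::downgrade($bytes);    # would die if a code point > 0xFF were present
print "bytes downgraded\n" unless utf8::is_utf8($bytes);

# A string holding a code point above 0xFF cannot be downgraded.
my $wide = "\x{263A}";
my $ok   = utf8::downgrade($wide, 1);    # true second argument: return false, don't die
print "wide string stays upgraded\n" unless $ok;
```

Neither form is "newer" here: the flag records how the characters are
stored, and string comparison and length give the same answers either
way.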

-Dan
