
Re: [perl #131685] Rename utf8::is_utf8() (and other functions)

Karl Williamson
July 11, 2017 02:53
On 07/10/2017 02:13 PM, Zefram wrote:
> Sawyer X wrote:
>> Does anyone have any comments on this? Tony, Dave, Zefram? *Karl*? :)
> I didn't want to add to a mostly bikeshedding discussion, but OK.
> I concur that the existing names are poor, but I'm not much happier with
> the names that have been suggested on this thread.  I reckon the best
> terminology we have for this flag, at the user level, is "upgraded",
> and so the name "is_utf8" would be better as "is_upgraded".  The existing
> names "upgrade" and "downgrade" for the transforming operations are OK,
> and the only change I'd potentially like to make to them would be to add
> something that explicates their rather unusual in-place side-effecting
> nature.
> In fact you can see all my preferred names in my CPAN module
> Scalar::String.  This module essentially attempts to be the sane version
> of, attempting to impart the right mental model through its
> function names and documentation.  (The "sclstr_" prefix on all the
> function names may be omitted if desired; the important part of the name
> is that which distinguishes these functions from each other.)
> I think the names for these functions should be reasonably concise,
> and in particular we should have a single-word adjective for "having
> the SvUTF8 flag on" if possible.  We should also try to reuse existing
> terminology, rather than invent anything new.  We should also avoid any
> term that implies anything beyond the storage, such as any reference to
> characters or Unicode, because such implications are largely inaccurate,
> and anywhere they are accurate is a bug.  All of this leads me to prefer
> "upgraded" over "utf8", "unicode", "uses_wide_storage", and the like.
> I don't have any strong opinion about which package any new names for
> these functions should appear in.  I think on balance we should not
> remove the old names, because the trouble that arises from maintaining
> them is small compared to the hassle that would arise from requiring
> existing correct programs to change.  Not removing them implies that
> we wouldn't even be deprecating them, as currently defined, but we can
> fairly discourage the use of the old names in documentation.
> -zefram

My view is that the current names could be improved, and that there 
should be no technical or social problem in creating new names while 
retaining the old ones, but changing the docs to stress the new ones. 
I've done that a lot.

I don't know what namespace is best.  At first blush Internals seems 
good to me, for this and other things that people currently have hacks 
for, like

	$foo & ""

which tries to find out if $foo is a string or just a number.  I don't 
fully understand the objection to 'Internals'.
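
For readers unfamiliar with the hack mentioned above, here is a minimal sketch of how it is typically used. It relies on the documented behaviour that `&` does a string operation only when both operands are strings, and a numeric one otherwise. The helper name `is_numeric_scalar` is made up for illustration; it is not a core or CPAN function.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the scalar currently holds a number,
# 0 if it holds only a string. Not part of any real module.
sub is_numeric_scalar {
    my ($value) = @_;
    # In the numeric case "" is numified to 0, which would otherwise warn.
    no warnings 'numeric';
    # string & ""  -> "" (string AND, length 0)
    # number & ""  -> 0  (numeric AND, stringifies to length 1)
    return length($value & "") ? 1 : 0;
}

printf "%-6s => %d\n", '42',    is_numeric_scalar(42);
printf "%-6s => %d\n", '"42"',  is_numeric_scalar("42");
printf "%-6s => %d\n", '"abc"', is_numeric_scalar("abc");
```

The sketch assumes the `bitwise` feature is off. With `use feature 'bitwise'` (enabled by `use v5.28` and later bundles), `&` is always numeric and the trick no longer distinguishes the two cases. A string that has already been used in numeric context may also be reported as a number.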

I have never liked upgrade and downgrade.  When you upgrade something 
you are supposed to get something better, like more legroom.  I have 
never seen why a PV is better than a number, or a UTF-8 string better 
than a non-UTF-8 one (it's far slower, for example, which is a downgrade in my 
estimation).  The use of upgrade and downgrade is jargon based on the 
attitudes of the implementers, which should be avoided.  Maybe it's too 
baked in to change, but I regret that it's there.  UTF-8 itself is an 
implementation detail that should never have been exposed to the 
outside, but 'use utf8' pretty much does that.
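
As an illustration of the functions under discussion, the following minimal sketch shows that utf8::upgrade() and utf8::downgrade() change only the internal storage flag, in place, and leave the string's characters alone. The variable names are arbitrary; the three utf8:: functions are the existing ones this thread proposes to rename.

```perl
use strict;
use warnings;

my $s = "caf\xe9";                       # 4 characters, all below 0x100

my $flag_before = utf8::is_utf8($s) ? 1 : 0;   # flag is off initially

utf8::upgrade($s);                       # modifies $s in place
my $flag_after = utf8::is_utf8($s) ? 1 : 0;    # flag is now on

# The string still compares equal and has the same length in characters.
my $same_string = ($s eq "caf\xe9") ? 1 : 0;
my $len         = length $s;

utf8::downgrade($s);                     # in place again; flag goes back off
my $flag_final = utf8::is_utf8($s) ? 1 : 0;

print "before=$flag_before after=$flag_after final=$flag_final ",
      "same=$same_string len=$len\n";
```

Nothing here says whether the string holds characters or Unicode text, only how it is stored, which is the distinction the naming debate is about.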
