On 07/10/2017 02:13 PM, Zefram wrote:
> Sawyer X wrote:
>> Does anyone have any comments on this? Tony, Dave, Zefram? *Karl*? :)
>
> I didn't want to add to a mostly bikeshedding discussion, but OK.
> I concur that the existing names are poor, but I'm not much happier with
> the names that have been suggested on this thread. I reckon the best
> terminology we have for this flag, at the user level, is "upgraded",
> and so the name "is_utf8" would be better as "is_upgraded". The existing
> names "upgrade" and "downgrade" for the transforming operations are OK,
> and the only change I'd potentially like to make to them would be to add
> something that explicates their rather unusual in-place side-effecting
> nature.
>
> In fact you can see all my preferred names in my CPAN module
> Scalar::String. This module essentially attempts to be the sane version
> of utf8.pm, attempting to impart the right mental model through its
> function names and documentation. (The "sclstr_" prefix on all the
> function names may be omitted if desired; the important part of the name
> is that which distinguishes these functions from each other.)
>
> I think the names for these functions should be reasonably concise,
> and in particular we should have a single-word adjective for "having
> the SvUTF8 flag on" if possible. We should also try to reuse existing
> terminology, rather than invent anything new. We should also avoid any
> term that implies anything beyond the storage, such as any reference to
> characters or Unicode, because such implications are largely inaccurate,
> and anywhere they are accurate is a bug. All of this leads me to prefer
> "upgraded" over "utf8", "unicode", "uses_wide_storage", and the like.
>
> I don't have any strong opinion about which package any new names for
> these functions should appear in.
> I think on balance we should not
> remove the old names, because the trouble that arises from maintaining
> them is small compared to the hassle that would arise from requiring
> existing correct programs to change. Not removing them implies that
> we wouldn't even be deprecating them, as currently defined, but we can
> fairly discourage the use of the old names in documentation.
>
> -zefram
>

My view is that the current names could be improved, and that there
should be no technical or social problem in creating new names while
retaining the old ones and changing the docs to stress the new ones.
I've done that a lot.

I don't know what namespace is best. At first blush Internals seems
good to me, for this and for other things that people currently have
hacks for, like $foo & "" when trying to find out whether $foo is a
string or just a number. I don't fully understand the objection to
'Internals'.

I have never liked upgrade and downgrade. When you upgrade something
you are supposed to get something better, like more legroom. I have
never seen why a PV is better than a number, or a UTF-8 string better
than a non-UTF-8 one (it's far slower, for example, which is a
downgrade in my estimation). The use of upgrade and downgrade is jargon
based on the attitudes of the implementers, which should be avoided.
Maybe it's too baked in to change, but I regret that it's there.

UTF-8 itself is an implementation detail that should never have been
exposed to the outside, but 'use utf8' pretty much does that.