develooper Front page | perl.perl5.porters | Postings from February 2017

Re: Proposal: Rename utf8::is_utf8() to utf8::is_upgraded()

From: pali
Date: February 20, 2017 22:35
Subject: Re: Proposal: Rename utf8::is_utf8() to utf8::is_upgraded()
Message ID: 20170220223511.GA26228@pali
On Tuesday 21 February 2017 09:46:06 Kent Fredric wrote:
> On 21 February 2017 at 01:55, Leon Timmermans <fawaka@gmail.com> wrote:
> > which doesn't fit in a function description.
> 
> I'd start by saying that this function has no bearing on whether the
> *data* in the scalar is actually utf8 encoded or not.
> 
> That's what most people are thinking I think, that this is a query
> about the *content* of the string, when that state
> is independent of the state of this flag.
> 
> As an analogy, it's about as useful as poking in perl internals to see
> if a scalar is a PVIV or not and assuming, because it's the string "0"
> and the IV slot hasn't been filled yet, that it's "not a number" ...
> which is useful, but not to people who simply want to see if a
> value is safe for math or not.
> 
> As another analogy, utf8ness of strings is like signedness of ints in C.
> 
> If somebody unpacked 4 bytes of data into an unsigned-int when they
> should have unpacked it into a signed int, the language will treat the
> data wrong. *Asking* "is it a signed int" doesn't really tell us
> anything except about the container. However, if you know there's 4
> bytes of data sitting around in an unsigned int which is really a
> signed int, you can locally say "ok, use signed int logic here"
> 
> So with that said:
> 
> 
>  *   "$flag = utf8::is_upgraded($string)"
> 
>          (Since Perl 5.28) Test whether $string's internal bytes are marked
>          for interpretation via utf8 semantics or not. Note this has no
>          bearing on whether that data is actually utf8, only on how perl
>          functions such as "length" should treat its bytes.

What about this?

Test whether $string can internally store wide characters (Unicode code
points above U+0000FF). It says nothing about whether $string already
contains such wide characters. You should not use this function
unless you are dealing with broken XS modules.
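
To illustrate the distinction, here is a minimal sketch that uses only the
existing core functions utf8::is_utf8() and utf8::upgrade(). The variable
names are made up for the example, and it assumes a source file without
"use utf8". It shows the flag changing while the string value stays the
same, and it shows bytes that are valid UTF-8 data with the flag still off.

```perl
use strict;
use warnings;

# Illustrative names only; the utf8:: functions are core and need no import.
my $latin1 = "caf\xe9";          # 4 characters, none above U+00FF
my $flag_before = utf8::is_utf8($latin1) ? 1 : 0;

utf8::upgrade($latin1);          # changes internal storage, not the value
my $flag_after = utf8::is_utf8($latin1) ? 1 : 0;
my $same_value = ($latin1 eq "caf\xe9" && length($latin1) == 4) ? 1 : 0;

# Two bytes that happen to be the UTF-8 encoding of U+00E9. The flag is off
# even though the data "is UTF-8" in the everyday sense of the word.
my $encoded      = "\xc3\xa9";
my $flag_encoded = utf8::is_utf8($encoded) ? 1 : 0;
my $len_encoded  = length($encoded);

print "before=$flag_before after=$flag_after same=$same_value\n";
print "encoded flag=$flag_encoded length=$len_encoded\n";
```

The flag flips from 0 to 1 across utf8::upgrade() while the string still
compares equal and keeps length 4, and the UTF-8 encoded bytes report flag 0
and length 2.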

> 
>  *   "$flag = utf8::is_utf8($string)"
>          (Since Perl 5.8.1) Compatibility-supporting (but poorly named) alias of
>          utf8::is_upgraded
> 
> 
> It's a bit wordy, but probably progress.
> 
> Though that said, I think we can find a clearer name than "is_upgraded"

Yes, if you (or anybody else) find a better name, let us know. A clear name
for such a test function is really needed.

Anyway, what about having this function in Internals:: instead of utf8:: ?
