develooper Front page | perl.perl5.porters | Postings from July 2017

[perl #131685] Rename utf8::is_utf8() (and other functions)

July 1, 2017 16:03
Message ID:
# New Ticket Created by   
# Please include the string:  [perl #131685]
# in the subject line of all future correspondence about this issue. 
# <URL: >


This is a continuation of the original discussion about renaming 
utf8::is_utf8() to utf8::is_upgraded(), which can be found at:

The problem is that many Perl modules use this incorrect code:

  use utf8;

  my $value = func();
  if (utf8::is_utf8($value)) {
      ...   # body elided; the value is then treated as needing encoding
  }

In most cases, module developers assume that utf8::is_utf8() returns 
true when the argument needs to be manually encoded into UTF-8 bytes, 
which is of course wrong.
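
A minimal sketch of why the check is misleading, using only the existing 
utf8:: functions (the variable names are illustrative). Two strings with 
the same characters can differ in the flag that utf8::is_utf8() reports, 
while encoding them to UTF-8 bytes gives the same result:

```perl
use strict;
use warnings;

# Two strings with identical content: the single character U+00E9.
my $narrow = "\xE9";       # stored in the 8-bit (downgraded) form
my $wide   = "\xE9";
utf8::upgrade($wide);      # same character, now stored in the wide form

# The flag differs even though the strings compare equal.
my $flag_narrow = utf8::is_utf8($narrow) ? 1 : 0;
my $flag_wide   = utf8::is_utf8($wide)   ? 1 : 0;
my $same        = ($narrow eq $wide)     ? 1 : 0;

# Encoding to UTF-8 bytes depends on the characters, not on the flag.
my $bytes_narrow = $narrow;
utf8::encode($bytes_narrow);
my $bytes_wide = $wide;
utf8::encode($bytes_wide);

printf "is_utf8: narrow=%d wide=%d equal=%d\n",
    $flag_narrow, $flag_wide, $same;
printf "encoded: %vd / %vd\n", $bytes_narrow, $bytes_wide;
```

Code that encodes only when utf8::is_utf8() is true would skip $narrow 
here, although it needs encoding just as much as $wide does.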

The reason for this is the poor name of utf8::is_utf8() and the poor 
documentation of this function.

The functions utf8::is_utf8(), utf8::upgrade() and utf8::downgrade() 
inspect or change the internal string representation, which is fully 
invisible to pure-Perl code, and therefore I think all of these 
functions should be in the Internals:: namespace.
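
As a rough illustration of "invisible to pure-Perl code", the sketch 
below (variable names are illustrative) upgrades and downgrades a copy 
of a string and checks that length, equality and code points do not 
change. It also shows the one observable limit: a string containing a 
code point above 0xFF cannot be downgraded.

```perl
use strict;
use warnings;

my $original = "caf\xE9";
my $copy     = $original;

utf8::upgrade($copy);    # switch the copy to the wide storage

# None of the usual string operations can tell the difference.
my $len_same = (length($copy) == length($original)) ? 1 : 0;
my $eq_same  = ($copy eq $original)                 ? 1 : 0;
my $ord_same = (ord(substr($copy, 3)) == 0xE9)      ? 1 : 0;

# Going back to the 8-bit storage succeeds and clears the flag.
my $downgraded = utf8::downgrade($copy) ? 1 : 0;
my $flag_after = utf8::is_utf8($copy)   ? 1 : 0;

# With a true second argument, utf8::downgrade() returns false
# instead of dying when the string cannot be downgraded.
my $snowman       = "\x{2603}";
my $can_downgrade = utf8::downgrade($snowman, 1) ? 1 : 0;

print "len=$len_same eq=$eq_same ord=$ord_same ",
      "downgraded=$downgraded flag_after=$flag_after ",
      "snowman_downgradable=$can_downgrade\n";
```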

I propose the following renames:

utf8::is_utf8() --> Internals::uses_string_wide_storage()
utf8::upgrade() --> Internals::upgrade_string_to_wide_storage()
utf8::downgrade() --> Internals::downgrade_string_from_wide_storage()

Plus backward-compatible aliases so that existing code keeps working 
as before.
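
For illustration only, aliasing of this kind can be sketched in pure 
Perl with glob assignment. The attached patches are not reproduced 
here, and Sketch::Internals is a hypothetical stand-in package. Since 
the proposed Internals:: names do not exist yet, the sketch points the 
new-style names at the existing utf8:: functions; in the proposal the 
direction would be the other way round.

```perl
use strict;
use warnings;

# Sketch::Internals is a hypothetical namespace used only for this demo.
# Each glob assignment makes the new name refer to the very same sub.
*Sketch::Internals::uses_string_wide_storage           = \&utf8::is_utf8;
*Sketch::Internals::upgrade_string_to_wide_storage     = \&utf8::upgrade;
*Sketch::Internals::downgrade_string_from_wide_storage = \&utf8::downgrade;

my $text = "abc";
Sketch::Internals::upgrade_string_to_wide_storage($text);

# Old and new names agree because they are one and the same function.
my $via_new = Sketch::Internals::uses_string_wide_storage($text) ? 1 : 0;
my $via_old = utf8::is_utf8($text) ? 1 : 0;
my $same_sub =
    (\&Sketch::Internals::uses_string_wide_storage == \&utf8::is_utf8)
    ? 1 : 0;

Sketch::Internals::downgrade_string_from_wide_storage($text);
my $after_downgrade = utf8::is_utf8($text) ? 1 : 0;

print "new=$via_new old=$via_old same_sub=$same_sub ",
      "after_downgrade=$after_downgrade\n";
```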

Since all of these functions should be used only for debugging purposes 
(e.g. test cases for XS code) or when dealing with a buggy XS module, I 
propose that they start emitting a warning (e.g. from v5.28.0) when 
called. Those who are dealing with internals can turn the warning off 
with no warnings 'experimental::internal';
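
The opt-out mechanism can be approximated in pure Perl today. The sketch 
below is an assumption-laden stand-in, not the proposed patch: it uses 
warnings::register, which creates a category named after the package 
(here the hypothetical Sketch::Internal) rather than the proposed 
'experimental::internal' category.

```perl
use strict;
use warnings;

# Hypothetical stand-in package. 'use warnings::register' creates a
# warning category named 'Sketch::Internal'.
package Sketch::Internal {
    use warnings::register;

    sub uses_string_wide_storage {
        # Warn only if the caller has this category enabled.
        warnings::warnif(
            "uses_string_wide_storage() exposes an internal detail");
        return utf8::is_utf8($_[0]);
    }
}

package main;

# Collect warnings instead of printing them, so the effect is visible.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $text = "abc";

# Category enabled by 'use warnings': this call warns.
Sketch::Internal::uses_string_wide_storage($text);

{
    # Category disabled in this scope: this call is silent.
    no warnings 'Sketch::Internal';
    Sketch::Internal::uses_string_wide_storage($text);
}

print scalar(@seen), " warning(s) captured\n";
```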

I'm attaching patches which:

* Add new warning category 'experimental::internal'
* Rename utf8 functions
* Update perldoc utf8 documentation
