[perl #132782] Missing SvPV* utf8/byte nomg macro variants

From: perlbug-followup
Date: January 29, 2018 15:30
Subject: [perl #132782] Missing SvPV* utf8/byte nomg macro variants
Message ID: rt-4.0.24-16571-1517239836-897.132782-75-0@perl.org
# New Ticket Created by   
# Please include the string:  [perl #132782]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/Ticket/Display.html?id=132782 >


Hi! Perl currently lacks SvPVutf8_nomg and SvPVbyte_nomg macros, the
equivalents of SvPVutf8 and SvPVbyte that do not process get magic. To
write an XS module correctly without being affected by Perl's "Unicode
Bug", it is easier to use the SvPVutf8 or SvPVbyte macros than to
combine SvPV + SvUTF8 and convert Latin-1 to UTF-8 by hand. But if a
function implemented in XS needs to distinguish between undef and a
string, SvPVutf8 cannot be used directly, because it throws a warning
when the scalar is undef, and checking SvOK first means get magic has
already been processed once. I think that an API accepting either undef
or a string is a common requirement, so SvPVutf8_nomg would be really
useful.
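
For readers unfamiliar with what "converting Latin-1 to UTF-8 by hand"
involves, here is a minimal plain-C sketch of that step. It is only an
illustration: latin1_to_utf8 is a hypothetical helper name, not part of
the perl API, and real XS code would normally let perl do this work
(for example through SvPVutf8) rather than reimplement it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the perl API: encode a buffer of
 * Latin-1 bytes as UTF-8. Returns a malloc'ed, NUL-terminated buffer
 * (caller frees it) and stores the encoded length in *out_len.
 * Returns NULL if allocation fails. */
static char *latin1_to_utf8(const unsigned char *src, size_t len,
                            size_t *out_len)
{
    /* Worst case: every input byte is >= 0x80 and becomes two bytes. */
    char *dst = malloc(len * 2 + 1);
    size_t i, j = 0;

    if (dst == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII range is identical in Latin-1 and UTF-8. */
            dst[j++] = (char)c;
        } else {
            /* U+0080..U+00FF: two-byte sequence 110xxxxx 10xxxxxx. */
            dst[j++] = (char)(0xC0 | (c >> 6));
            dst[j++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[j] = '\0';
    *out_len = j;
    return dst;
}
```

Skipping this step for a scalar without the SvUTF8 flag is what makes
the same Perl string produce different bytes depending on its internal
representation.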

Currently, an API that accepts either undef or a string requires
something like this:

void
function(arg)
  SV *arg
PREINIT:
  SV *tmp;
  char *str;
  STRLEN len;
INIT:
  SvGETMAGIC(arg);
CODE:
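  if (SvOK(arg)) {
    str = SvPV_nomg(arg, len);
    if (!SvUTF8(arg)) {
      /* Upgrade a copy if arg has get magic, so that SvPVutf8 does
         not trigger that magic a second time */
      if (SvGMAGICAL(arg))
        tmp = sv_2mortal(newSVpvn(str, len));
      else
        tmp = arg;
      str = SvPVutf8(tmp, len);
    }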
  } else {
    str = NULL;
    len = 0;
  }
... now str/len contain either NULL or the UTF-8 representation of arg ...

This is really non-intuitive and hard for a novice to write from
scratch, as no example of this (very common) case appears anywhere in
the perl documentation.

With SvPVutf8_nomg the code would reduce to just:

void
function(arg)
  SV *arg
PREINIT:
  char *str;
  STRLEN len;
INIT:
  SvGETMAGIC(arg);
CODE:
  if (SvOK(arg)) {
    str = SvPVutf8_nomg(arg, len);
  } else {
    str = NULL;
    len = 0;
  }
... now str/len contain either NULL or the UTF-8 representation of arg ...

An SvPV* macro that returns NULL without a warning for an undefined
value might be useful too, to simplify that code even more.

Also, the perlapi documentation should suggest using SvPVutf8 (resp.
SvPVbyte) instead of SvPV, because code that uses SvPV without checking
SvUTF8() is affected by Perl's Unicode Bug.

Also, to avoid processing get magic more than once, an XS function
should call get magic only once, ideally with SvGETMAGIC(), and then
use only the *_nomg functions/macros.

