develooper Front page | perl.perl5.porters | Postings from October 2016

Re: Encode.xs and sv_force_normal

From: pali
Date: October 23, 2016 19:43
Subject: Re: Encode.xs and sv_force_normal
Message ID: 201610232142.41856@pali
On Sunday 23 October 2016 20:36:31 Father Chrysostomos wrote:
> On Oct 22, 2016, at 4:16 AM, pali@cpan.org wrote:
> > On Monday 17 October 2016 23:31:31 pali@cpan.org wrote:
> > 
> > Encode::_utf8_on(sv)
> > 
> >> @@ -972,12 +982,11 @@ SV *	sv
> >> CODE:
> >> {
> >>     if (SvPOK(sv)) {
> >> -    SV *rsv = newSViv(SvUTF8(sv));
> >> -    RETVAL = rsv;
> >> -    if (SvIsCOW(sv)) sv_force_normal(sv);
> >> -    SvUTF8_on(sv);
> >> +        if (SvTHINKFIRST(sv)) sv_force_normal(sv);
> >> +        RETVAL = newSViv(SvUTF8(sv));
> >> +        SvUTF8_on(sv);
> >>     } else {
> >> -    RETVAL = &PL_sv_undef;
> >> +        RETVAL = &PL_sv_undef;
> >>     }
> >> }
> >> OUTPUT:
> > 
> > Encode::_utf8_off(sv)
> > 
> >> @@ -989,12 +998,11 @@ SV *	sv
> >> CODE:
> >> {
> >>     if (SvPOK(sv)) {
> >> -    SV *rsv = newSViv(SvUTF8(sv));
> >> -    RETVAL = rsv;
> >> -    if (SvIsCOW(sv)) sv_force_normal(sv);
> >> -    SvUTF8_off(sv);
> >> +        if (SvTHINKFIRST(sv)) sv_force_normal(sv);
> >> +        RETVAL = newSViv(SvUTF8(sv));
> >> +        SvUTF8_off(sv);
> >>     } else {
> >> -    RETVAL = &PL_sv_undef;
> >> +        RETVAL = &PL_sv_undef;
> >>     }
> >> }
> >> OUTPUT:
> > 
> > I'm not sure if these two functions shouldn't call also
> > SvGETMAGIC(sv) before SvPOK(sv) and also SvSETMAGIC(sv) after
> > SvUTF8_off/on(sv) calls.
> 
> It depends on how they are intended to be used.  These are low-level
> functions for tinkering with the internal encoding of a string
> (certainly not for general use; in fact, I would argue that their
> main purpose is to work around bugs in XS code that does not handle
> utf8 properly).

Personally, I use these functions in unit tests (prepare a scalar with
some value, run the function under test, then check whether SvUTF8 is
set or unset) and with buggy XS modules which read a char* via
SvPV(sv), process it, and then return a new scalar built from the
processed char* (which means they forgot about the SvUTF8 flag).
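
To illustrate the kind of bug meant here, a minimal XS sketch follows.
The function name `process_buggy` is hypothetical and the "processing"
is just a copy; it is not taken from any real module. It only shows how
the SvUTF8 flag gets lost when a new scalar is built from the raw char*:

```c
SV *
process_buggy(sv)
SV *	sv
CODE:
{
    STRLEN len;
    /* Read the raw bytes; the UTF8 flag of sv is not part of them. */
    const char *p = SvPV(sv, len);

    /* A new scalar built from bytes alone starts without SvUTF8,
     * even if the input scalar had it set. */
    RETVAL = newSVpvn(p, len);

    /* One possible fix: carry the flag over explicitly.
     * if (SvUTF8(sv)) SvUTF8_on(RETVAL);
     */
}
OUTPUT:
    RETVAL
```

A caller who knows the module behaves like this can apply
Encode::_utf8_on to the returned scalar as a workaround.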

> I am not sure it makes sense to use these on tied variables; after
> all, the user has to know what it is in the SV to begin with for any
> use of Encode::_utf8_off to make sense, but a FETCH call
> (SvGETMAGIC) might actually change whether the string is flagged as
> utf8.

Both functions are prefixed with an underscore (_utf8_off and _utf8_on),
which can be understood as marking them as private or special functions
not meant for general use. And currently they do not work correctly on
magical scalars.

> That said, if someone does it on $/ then I would expect it just to
> work, so SvGETMAGIC/SvSETMAGIC seem appropriate.
> 
> It may be worth adding a caveat to the documentation that they do not
> make much sense on tied variables, unless you think that the
> existing caveats are sufficient.

So the question is: what should these two functions do? Just change the
SvUTF8 flag without calling any magic? Or should they also work on
magical variables like $/ ?

For me it makes sense to call SvGETMAGIC/SvSETMAGIC so that $/ is
supported too.
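
A minimal, untested sketch of that idea for _utf8_on, written as XS to
match the patch quoted above. Putting SvSETMAGIC inside the SvPOK
branch, and relying on SvPOK right after get-magic, are assumptions
here and not something settled in this thread:

```c
SV *
_utf8_on(sv)
SV *	sv
CODE:
{
    /* Run get-magic first so the check below sees the current value. */
    SvGETMAGIC(sv);
    if (SvPOK(sv)) {
        /* Un-share COW buffers (and croak on read-only scalars). */
        if (SvTHINKFIRST(sv)) sv_force_normal(sv);
        /* Return the previous state of the flag. */
        RETVAL = newSViv(SvUTF8(sv));
        SvUTF8_on(sv);
        /* Run set-magic so the change reaches magical variables. */
        SvSETMAGIC(sv);
    } else {
        RETVAL = &PL_sv_undef;
    }
}
OUTPUT:
    RETVAL
```

_utf8_off would be the same with SvUTF8_off(sv) in place of
SvUTF8_on(sv).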
