perl.perl5.porters | Postings from April 2008

Re: on the almost impossibility to write correct XS modules

Rafael Garcia-Suarez
April 26, 2008 08:33
2008/4/26 Marc Lehmann:
>  Some modules started to use different typemap entries to work around this
>  issue, for example:
>    void LOG (utf8_string msg)
>            $var = SvPVbyte_nolen ($arg)
>    T_UTF8 // == utf8_string
>            $var = SvPVutf8_nolen ($arg)
>  Unfortunately, unlike other, similar, functions (like SvIV, SvPV etc.), this
>  easily destroys the scalar value:
>    LOG ("see this object:");
>    LOG ($obj);
>    # $obj no longer an object here, it became a string

You mean, like "Class=HASH(0xDEADBEEF)", I suppose (I haven't checked).

>  So unlike other accessor functions such as SvPV, SvPVutf8 changes the
>  contents of the SV in a very visible way (while SvIV doesn't destroy the
>  string, for example).
>  I can understand why it does so, but the problem is, there is simply no good
>  way to deal with utf-8 in XS as the API is extremely hostile at the moment.
>  To get it right, I think one has to do something like this (this can be
>  optimised of course, but that makes it even more complicated):
>    T_UTF8
>            $var = SvPVutf8_nolen (sv_mortalcopy ($arg))
>  I think the situation with unicode and cpan perl modules cannot improve
>  as long as it is so difficult to do something as simple as get at the string
>  data in a non-random/godgiven encoding.

That's right, and that's probably also why people find it difficult to
handle utf-8 in perl as soon as they begin using XS modules.

I think that this screams for a new macro which would be more or less
the one you suggested here, maybe implemented in a more efficient way if
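
A minimal sketch of what such a helper could look like, assuming it simply wraps the copy-then-convert idiom quoted above. The name my_pv_utf8_copy is hypothetical and not part of the Perl API; the fragment is meant for the C part of an .xs file, after perl.h and XSUB.h have been included. It is written as a function rather than a macro because the SvPV-family macros may evaluate their argument more than once, which would create more than one copy:

```c
/* Hypothetical helper, NOT part of the Perl API.
 * Returns the UTF-8 encoded string value of sv without modifying sv:
 * the in-place conversion done by SvPVutf8_nolen happens on a mortal copy.
 * The returned pointer is only valid until that mortal copy is freed,
 * so use it within the current XSUB call and do not store it. */
static char *
my_pv_utf8_copy(pTHX_ SV *sv)
{
    SV *copy = sv_mortalcopy(sv);   /* new temporary SV, sv itself is untouched */
    return SvPVutf8_nolen(copy);    /* converts the copy, not the caller's scalar */
}
```

Inside an XSUB it would be called as my_pv_utf8_copy(aTHX_ ST(0)). The copy costs an allocation on every call, which is the kind of overhead a core implementation might be able to avoid.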

>  Also, even though it is 5.10 now, it should be *seriously* considered to
>  replace the almost completely useless char * typemap entry by something
>  that gives you octets (preferably non-destructively). Or somebody explain
>  to me when "char *" does something useful in current perl versions without
>  tinkering with retesting ST(x) manually...

You mean this one?
	    $var = ($type)SvPV_nolen($arg)
Here, the result would be dependent on the internal representation of
the string in perl, so I suppose you would like to change this to
something that uses SvPVbyte. I wonder, however, how much code would
break with that change. Actually I suspect that more code would be fixed
than broken, but that's a wild intuition...
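
Until the core typemap changes, a module can get this behaviour with its own typemap file. The fragment below is only a sketch: the names octet_string and T_OCTETS are assumptions chosen for illustration, and the XS file would also need a matching "typedef char *octet_string;" in its C section.

```
TYPEMAP
octet_string	T_OCTETS

INPUT
T_OCTETS
	$var = ($type)SvPVbyte_nolen(sv_mortalcopy($arg))
```

The sv_mortalcopy call keeps the conversion away from the caller's scalar, at the price of a copy per argument. SvPVbyte_nolen is expected to croak when the string contains characters above 255, so such input is rejected rather than handed to the C code in whatever encoding perl happens to use internally.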