perl.perl5.porters

Re: on the almost impossibility to write correct XS modules

From: Juerd Waalboer
Date: May 21, 2008 09:07
Subject: Re: on the almost impossibility to write correct XS modules
Message ID: 20080521160648.GC2929@c4.convolution.nl

Glenn Linderman wrote 2008-05-21  8:50 (-0700):
> On approximately 5/21/2008 1:29 AM, came the following characters from 
> the keyboard of Rafael Garcia-Suarez:
> >Some way to mark PVs as "binary" and not upgradeable to SvUTF8 would be
> >handy, though.
> What's the goal?

Dual:

1. To provide a means of indicating that something is binary rather than
text. This can be useful in encoding-capable DBI drivers or wrappers,
for example, to indicate that the value for a "?" placeholder is already
binary and should not be text-encoded. (You'd want to do this based on
column introspection, but that's very slow and very hard to write
portably.) Another use case involves data serialization for exchange
with languages that have native binary strings, like Java.

2. To prevent programming errors; you should see this as a matter of
strictures. Most silly mistakes made in Unicode programming come from
people who fail to understand the difference between binary and text
strings and, as a result, sometimes add text strings to binary
strings. While conceptually that's always a mistake, it happens so
often and it's such an easy mistake to make (apparently) that it would
be nice to have language support that changes "upgrade the entire
string to SvUTF8" to "add only the new portion as UTF-8 (encoded, not
SvUTF8-marked), and keep the original as it is".
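
To make the second point concrete, here is a minimal sketch of what
happens today when a text string is added to a binary string, next to
an approximation of the proposed behaviour. No "binary" marker exists
in Perl, so the second half simply encodes the text portion by hand
with Encode; the variable names and sample bytes are made up for
illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A "binary" string: four bytes, SvUTF8 flag off.
my $binary = "\xDE\xAD\xBE\xEF";

# A text string holding one character above 255 (EURO SIGN).
my $text = "\x{20AC}";

# Current behaviour: the concatenation result is upgraded as a whole,
# so the four original bytes are now stored as UTF-8 encoded characters.
my $mixed = $binary . $text;
printf "mixed:    flagged=%d length=%d\n",
    (utf8::is_utf8($mixed) ? 1 : 0), length $mixed;

# Approximation of the proposal: only the new portion is UTF-8 encoded,
# and the original bytes are left exactly as they were.
my $appended = $binary . encode('UTF-8', $text);
printf "appended: flagged=%d length=%d\n",
    (utf8::is_utf8($appended) ? 1 : 0), length $appended;
```

The first result is a flagged five-character string; the second stays
an unflagged seven-byte string (the four original bytes plus the three
bytes that encode the euro sign).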

> If the goal is to prevent the cost of upgrading and downgrading, well, 
> just fix the bug that attached the upgraded data... and the cost of 
> doing so also vanishes.

Detecting upgrades is hard. There's a module (encoding::warnings) that
enables warnings for it globally, but you often want it on a single
string instead. Indeed the bug where characters >255 are added to the
binary string should be fixed, but finding out where/when that happens
can be a lot of work and currently requires knowledge of internals.
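
As a rough illustration of a per-string check, the sketch below uses
utf8::downgrade with its "fail quietly" argument on a copy of the
string. The helper name still_bytes is hypothetical, not an existing
API, and it only reports that a character >255 has crept in; it does
not say where or when that happened, which is the hard part described
above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string can still be treated as bytes.
sub still_bytes {
    my ($copy) = @_;    # copy first: utf8::downgrade modifies its argument
    # Second argument true: return false instead of dying on failure.
    return utf8::downgrade($copy, 1) ? 1 : 0;
}

my $payload = "\x00\xFF\x10";
my $before  = still_bytes($payload);

# The bug: a character above 255 is appended to the binary string.
$payload .= "\x{263A}";
my $after = still_bytes($payload);

print "before append: $before, after append: $after\n";
```

A string that was upgraded but contains only characters up to 255
still passes this check, so it catches exactly the >255 case and
nothing more.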
-- 
Met vriendelijke groet,  Kind regards,  Korajn salutojn,

  Juerd Waalboer:  Perl hacker  <#####@juerd.nl>  <http://juerd.nl/sig>
  Convolution:     ICT solutions and consultancy <sales@convolution.nl>
1;
