Proposal: initial type annotation in SV

From: Sébastien Aperghis-Tramoni
Date: May 5, 2013 21:45
Subject: Proposal: initial type annotation in SV
Message ID: 4370841D-031B-4758-B283-5212CC73F189@free.fr

Hello fellow porters,


I would like to present an idea I had for solving (at least partly) the problem of serializing scalar values from Perl. But first, allow me to describe the problem in more detail. I apologize in advance if I use imprecise or incorrect terminology; my understanding of Perl guts is far from solid.

The root of the problem is that Perl uses the same object, the SV, for three basic types: integer, float and string. It's a major feature of Perl, and it is perfectly transparent as long as we stay in the Perl world. But when we need to communicate with the outside world through serialized formats, trouble begins to appear, because the inner type can change in order to accommodate internal needs:

$ perl -MDevel::Peek -e 'Dump $v=42; $s="$v"; Dump $v'
SV = IV(0x7ff699827590) at 0x7ff6998275a0
  REFCNT = 1
  FLAGS = (IOK,pIOK)
  IV = 42
SV = PVIV(0x7ff699809208) at 0x7ff6998275a0
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 42
  PV = 0x109509fd0 "42"\0
  CUR = 2
  LEN = 16

$v, which was first an IV, becomes a PVIV because it was interpolated into a string.
Okay, just a Perl internal thing. Not a problem.
Till we want to serialize the value:

$ perl -MJSON::XS -E '$v=42; say encode_json {v=>$v}; $s="$v"; say encode_json {v=>$v}'
{"v":42}
{"v":"42"}

The integer value has been converted to a string. Perl may not care, but JavaScript does, as do many other languages. The same thing happens with other serializers, even the most recent one:

$ perl -MDevel::Peek -MSereal -e '$v=42; $s="$v"; Dump $v; $x = decode_sereal(encode_sereal(\$v)); Dump $$x'
SV = PVIV(0x7ff261009208) at 0x7ff261027648
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 42
  PV = 0x10d002370 "42"\0
  CUR = 2
  LEN = 16
SV = PV(0x7ff2610011b0) at 0x7ff261026d60
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x10d029c30 "42"\0
  CUR = 2
  LEN = 16

Storable has exactly the same behavior:

$ perl -MDevel::Peek -MStorable=freeze,thaw -e '$v=42; $s="$v"; Dump $v; $x = thaw(freeze(\$v)); Dump $$x'
SV = PVIV(0x7ff599009208) at 0x7ff599027648
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 42
  PV = 0x107502380 "42"\0
  CUR = 2
  LEN = 16
SV = PV(0x7ff5990010b0) at 0x7ff599026e20
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x1075017a0 "42"\0
  CUR = 2
  LEN = 16

Because of this, some serializers resort to sniffing the value with regexps (for example, XML::RPC, XMLRPC::Lite), which resolves part of the problem, but introduces some side effects.
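One such side effect can be illustrated with a small sketch. This is not the code of XML::RPC or XMLRPC::Lite; the function name sniff_is_integer and the pattern it mimics (roughly /^-?[0-9]+\z/) are assumptions chosen for illustration. The point is that a content-based test cannot tell a number from a string that merely looks like one.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical sniffing test: true when the text looks like an integer.
 * Hand-rolled equivalent of a pattern such as /^-?[0-9]+\z/. */
bool sniff_is_integer(const char *s)
{
    if (*s == '-')
        s++;                      /* optional leading minus sign */
    if (*s == '\0')
        return false;             /* need at least one digit */
    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char)*s))
            return false;
    }
    return true;
}
```

Under this test, a value assigned as the string "01234" (a postal code, say) is classified as an integer just like 42, so a serializer relying on it would emit a number and could lose the leading zero.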


So, here is a proposal to try to make this situation a bit better: annotate the SV to record the initial (or canonical) type of the value, that is, the type of the value that was last assigned to the SV.

This solves nothing by itself, but it gives the modules that need this information (typically, the serializers) a way to know the "true" type of the value. So, instead of testing

    SvFLAGS(sv) & SVf_IOK  # Storable
    SvIOKp(sv)             # JSON::XS, Sereal::Encoder

they could use a macro like:

    SvINITIALTYPE(sv) == SVit_NUM
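To make the idea concrete, here is a toy model in plain C. It is not Perl core code: the toy_sv struct, the TOY_* constants and every function name are made up for this illustration, and only the general shape (flags that accumulate on reads, plus an annotation written on assignment) follows the description above. It assumes the annotation is updated only by assignment, never by a read-time conversion.

```c
#include <stdio.h>

/* Toy stand-ins for the real flags (SVf_IOK, SVf_POK) defined in sv.h. */
enum { TOY_IOK = 1 << 0, TOY_POK = 1 << 1 };

/* Hypothetical annotation values, modelled on the proposed SVit_NUM. */
typedef enum { TOY_IT_NONE, TOY_IT_NUM, TOY_IT_STR } toy_initial_type;

typedef struct {
    unsigned flags;                /* which slots currently hold a valid value */
    toy_initial_type initial_type; /* type of the last assigned value */
    long iv;                       /* integer slot */
    char pv[32];                   /* string slot (truncates longer strings) */
} toy_sv;

/* Assigning an integer resets the flags and records the annotation. */
void toy_set_iv(toy_sv *sv, long value)
{
    sv->flags = TOY_IOK;
    sv->initial_type = TOY_IT_NUM;
    sv->iv = value;
    sv->pv[0] = '\0';
}

/* Assigning a string resets the flags and records the annotation. */
void toy_set_pv(toy_sv *sv, const char *value)
{
    sv->flags = TOY_POK;
    sv->initial_type = TOY_IT_STR;
    sv->iv = 0;
    snprintf(sv->pv, sizeof sv->pv, "%s", value);
}

/* Reading the value as a string caches the text and sets the string flag,
 * like "$v" in the dumps above. The annotation is left untouched. */
const char *toy_stringify(toy_sv *sv)
{
    if (!(sv->flags & TOY_POK)) {
        snprintf(sv->pv, sizeof sv->pv, "%ld", sv->iv);
        sv->flags |= TOY_POK;
    }
    return sv->pv;
}

/* Flag-based encoder that prefers the string slot when it is valid. */
void toy_encode_by_flags(const toy_sv *sv, char *out, size_t n)
{
    if (sv->flags & TOY_POK)
        snprintf(out, n, "\"%s\"", sv->pv);
    else
        snprintf(out, n, "%ld", sv->iv);
}

/* Annotation-based encoder: decides from the last assigned type only. */
void toy_encode_by_initial_type(const toy_sv *sv, char *out, size_t n)
{
    if (sv->initial_type == TOY_IT_NUM)
        snprintf(out, n, "%ld", sv->iv);
    else
        snprintf(out, n, "\"%s\"", sv->pv);
}
```

With a toy_sv set through toy_set_iv(&v, 42) and then read through toy_stringify(&v), toy_encode_by_flags yields the quoted "42" while toy_encode_by_initial_type still yields the bare 42, which is the distinction the annotation is meant to preserve.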


Now, did I miss something so blindingly obvious that I can't see why this cannot work? Is it silly?

Thanks in advance for your comments and criticism.

In the meantime, I'll try to write a proof of concept.


-- 
Sébastien Aperghis-Tramoni

Close the world, txEn eht nepO.

