develooper Front page | perl.perl5.porters | Postings from February 2015

[perl #123715] Re: [ #101876] losing string value of semi-numeric string

yves orton
February 2, 2015 10:52
# New Ticket Created by  yves orton 
# Please include the string:  [perl #123715]
# in the subject line of all future correspondence about this issue. 
# <URL: >

On 2 February 2015 at 11:33, Zefram via RT
<> wrote:
> Mon Feb 02 05:33:20 2015: Request 101876 was acted upon.
> Transaction: Ticket created by
>        Queue: Sereal-Encoder
>      Subject: losing string value of semi-numeric string
>    Broken in: (no value)
>     Severity: (no value)
>        Owner: Nobody
>   Requestors:
>       Status: new
>  Ticket <URL: >
> $ perl -MSereal::Encoder=encode_sereal -MSereal::Decoder=decode_sereal -lwe 'print $]; print $Sereal::Encoder::VERSION; my $a="0 but true"; print decode_sereal(encode_sereal($a)); my $b = $a+0; print $a; print decode_sereal(encode_sereal($a));'
> 5.018002
> 3.005
> 0 but true
> 0 but true
> 0
> I believe the first encoding is representing $a as a string but the
> second encoding is representing it as a pure integer, based on the IOK
> flag.  In the case of this string, along with infinitely many others
> such as "00", "01", and "1 ", the integer representation is lossy.
> It's particularly significant for strings such as "0 but true" and "00"
> which qualify as true but come out as false when mangled by the lossy
> encoding.  But even when the truth value doesn't change, it is not at
> all acceptable to lose the string value.
> The underlying mistake is that you've treated the IOK flag as implying
> that the scalar is fully characterised by its IV.  In general that is
> not the case.  For scalars that are both IOK and POK, to see whether
> integer representation suffices you need to perform the IV->PV coercion
> yourself, and see whether the PV generated from the IV matches the
> scalar's actual PV.  Similar remarks apply to NOK and NV.  For extra fun,
> the exact meaning of the [PIN]OK flags varies between Perl versions.

No. I disagree. This is a bug in perl itself.

$ perl -MDevel::Peek -le'my $x="0 but true"; my $y=0+$x; Dump($x)'
SV = PVIV(0x7cdd88) at 0x7d9a48
  REFCNT = 1
  IV = 0
  PV = 0x7d2b90 "0 but true"\0
  CUR = 10
  LEN = 16

The IOK flag should NOT be set here, it should be pIOK only.

IOK means that the integer representation is either a) canonical, or
b) a faithful representation of the PV.

pIOK is supposed to mean that the cached value of the string can be
used, but that it is not a faithful representation of the string it
was derived from.

(If IOK and pIOK do not mean these things then it is a total waste to
have both sets of flags, which seems an unreasonable interpretation.)
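
For anyone who wants to look at just the public and private flags
without reading a full Dump, here is a minimal sketch using the core B
module. The helper name ok_flags is made up for illustration, and which
flags actually get printed depends on the Perl version, so no expected
output is shown.

```perl
use strict;
use warnings;
use B ();

# ok_flags() is a hypothetical helper, not part of any module.
# It returns the names of the public (xOK) and private (pxOK)
# flags currently set on the caller's scalar.
sub ok_flags {
    # \$_[0] refers to the caller's scalar itself, not a copy.
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @table = (
        [ IOK  => B::SVf_IOK() ],
        [ NOK  => B::SVf_NOK() ],
        [ POK  => B::SVf_POK() ],
        [ pIOK => B::SVp_IOK() ],
        [ pNOK => B::SVp_NOK() ],
        [ pPOK => B::SVp_POK() ],
    );
    return map { $_->[0] } grep { $flags & $_->[1] } @table;
}

my $x = "0 but true";
my $y = 0 + $x;    # numeric use caches an integer value on $x
print join(",", ok_flags($x)), "\n";
```

If IOK shows up in the list for $x, the scalar is in the state this
report is about.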

Compare to this:

$ perl -MDevel::Peek -le'my $x="0blahblah"; my $y=0+$x; Dump($x)'
SV = PVNV(0x1bcaf10) at 0x1beaa58
  REFCNT = 1
  IV = 0
  NV = 0
  PV = 0x1be3ba0 "0blahblah"\0
  CUR = 9

IMO this is clearly a bug in the special-case logic for "0 but true".
It should NOT set the IOK flag; it should set only the pIOK flag.

I will naturally try to fix this in Sereal, but I consider this a bug
in Perl and I am sending this to perlbug because of it.
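
Independent of how the flags ought to behave, the defensive check
Zefram describes (stringify the IV and compare it with the actual PV)
can be sketched in pure Perl with the core B module. This is only an
illustration under stated assumptions: the name integer_is_faithful is
made up here, it is not Sereal's code, and it does not cover the NOK/NV
case.

```perl
use strict;
use warnings;
use B ();

# integer_is_faithful() is a hypothetical helper, not Sereal's API.
# It returns 1 only when the integer slot alone reproduces the scalar.
sub integer_is_faithful {
    my $sv    = B::svref_2object(\$_[0]);    # inspect the caller's scalar
    my $flags = $sv->FLAGS;
    return 0 unless $flags & B::SVf_IOK();   # no public integer value
    return 1 unless $flags & B::SVf_POK();   # pure integer, no string to lose
    # Both IOK and POK: the integer is only safe to use on its own if
    # stringifying it gives back exactly the stored string.
    return $sv->int_value . "" eq $sv->PV ? 1 : 0;
}

{
    no warnings 'numeric';    # "1 " and friends may warn in numeric context
    for my $str ("0 but true", "00", "1 ", "42") {
        my $copy = $str;
        my $num  = $copy + 0;    # force the numeric slot to be filled in
        printf "%-14s %s\n", "'$copy'",
            integer_is_faithful($copy) ? "integer is enough" : "keep the string";
    }
}
```

An encoder using a check like this would fall back to the string
representation whenever it returns 0, whatever flags a given Perl
version happens to set.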


perl -Mre=debug -e "/just|another|perl|hacker/"
