Scalar "type" flags, public and private (POK/IOK/NOK)
From: demerphq
Date: November 30, 2015 16:12
Subject: Scalar "type" flags, public and private (POK/IOK/NOK)
Message ID: CANgJU+V5sL3upT_qyoXGj406Dfdh5MuXXu7x6fXjOxbGDWh45g@mail.gmail.com
Perl as a general rule shies away from the question of "what type does
a scalar have".
This means that most core code should not try to determine the type of
an SV argument. Instead it should simply try to coerce it to the type
required, and let the coercion logic handle emitting any warnings or
errors should the coercion be "controversial". (For example, coercing
"foo" to an IV returns 0 and produces a not-numeric warning.)
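To make the "coerce, don't inspect" idea concrete, here is a minimal C
sketch. It is a toy model, not perl's actual coercion code: the name
toy_coerce_iv and the "warned" out-parameter are hypothetical, and
special cases such as "0 but true" are not modeled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Toy model only: coerce a string to an integer the way the text
 * describes, reporting through *warned whether a "not numeric"
 * warning would be appropriate. Not perl's real implementation. */
static long toy_coerce_iv(const char *s, int *warned)
{
    char *end;
    long v;

    assert(s != NULL && warned != NULL);

    /* strtol skips leading whitespace and stops at the first
     * character that cannot be part of the number. */
    v = strtol(s, &end, 10);

    /* No digits consumed at all: the value is 0 and we warn. */
    *warned = (end == s);

    /* Trailing whitespace is tolerated; anything else warns. */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        *warned = 1;

    return v;
}
```

The caller never asks what type the input "really" is. It just asks
for an integer and lets the coercion decide whether to complain.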
Overall this strategy works well, and we probably should
not mess with it too much.
On the other hand, the way this strategy is implemented is somewhat
maddening in those cases where one really does care. There basically
is no way to tell if a SCALAR that is IOK and POK was originally a
string or originally a number. In many cases this is not such a big
deal, but in some specialized cases, serialization for instance, this
can be a very big deal, possibly resulting in data loss.
The current state of play is that there are 6 bit flags controlling
type (I gloss over some flags which I consider less problematic).
These flags are the "public" SVf_POK, SVf_IOK and SVf_NOK, along with
their "private" equivalents SVp_POK, SVp_IOK, SVp_NOK, which represent
"contains a string value", "contains an integer value" and "contains a
float value" respectively.
Currently the meaning of these flags is poorly documented and IMO
ambiguous. It is not clear exactly what the difference between these
flags is. What is clear is that the "f" variants are never set unless
the "p" variants are also set, and that the macros which check if they
are set typically check both, and return true if *either* is true.
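As an illustration of the six flags and the "f implies p" invariant
described above, here is a small self-contained C sketch. The TOY_
names and bit positions are made up for this example; the real values
live in perl's sv.h and are not reproduced here. The name order in
toy_flag_names follows the FLAGS lines in the Devel::Peek dumps in
this message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy bit positions chosen for this sketch only (an assumption,
 * not the values used by perl's sv.h). */
#define TOY_SVf_IOK (1u << 0)
#define TOY_SVf_NOK (1u << 1)
#define TOY_SVf_POK (1u << 2)
#define TOY_SVp_IOK (1u << 3)
#define TOY_SVp_NOK (1u << 4)
#define TOY_SVp_POK (1u << 5)

#define TOY_PUBLIC_MASK (TOY_SVf_IOK | TOY_SVf_NOK | TOY_SVf_POK)

/* Returns 1 when every public ("f") bit that is set also has its
 * private ("p") partner set, 0 otherwise. */
static int toy_flags_consistent(uint32_t flags)
{
    uint32_t pub  = flags & TOY_PUBLIC_MASK;
    uint32_t priv = (flags >> 3) & TOY_PUBLIC_MASK;
    return (pub & ~priv) == 0;
}

/* Writes a comma-separated list such as "IOK,POK,pIOK,pPOK" into
 * buf and returns buf. Names that do not fit are dropped. */
static char *toy_flag_names(uint32_t flags, char *buf, size_t len)
{
    static const char *const names[6] = {
        "IOK", "NOK", "POK", "pIOK", "pNOK", "pPOK"
    };
    size_t used = 0;
    int bit;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (bit = 0; bit < 6; bit++) {
        int n;
        if (!(flags & (1u << bit)))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "," : "", names[bit]);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* undo the partial write and stop */
            break;
        }
        used += (size_t)n;
    }
    return buf;
}
```

A flag word with only TOY_SVp_IOK set is consistent under this model,
while one with only TOY_SVf_IOK set is not.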
A superficial examination of the code and the behavior of these flags
would easily lead someone to the conclusion that 'SVp_IOK is set
when a string is converted to a number, and SVf_IOK is set if that
conversion was "canonical"'. Ditto for NOK.
Unfortunately this is just wrong, and a closer examination reveals
many cases where it is not true.
A simple case to show it being wrong is the following:
$ perl -MDevel::Peek -e'$x= " 1 "; $y= $x+1; Dump($x)'
SV = PVIV(0xec3970) at 0xec9340
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 1
PV = 0xeb8740 " 1 "\0
CUR = 10
LEN = 16
Perl ignores leading and trailing whitespace when "numifying" a
string. As a result, such a string is left IOK when it is converted to
an integer. Other examples are "0e0", "00", and similar.
$ perl -MDevel::Peek -e'$x= "00"; $y= $x + 0.0; Dump($x)'
SV = PVIV(0x1570970) at 0x1576340
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 0
PV = 0x1565740 "00"\0
CUR = 2
LEN = 16
$ perl -MDevel::Peek -e'$x= "0e0"; $y= $x + 0.0; Dump($x)'
SV = PVNV(0xb77230) at 0xba1340
REFCNT = 1
FLAGS = (IOK,NOK,POK,pIOK,pNOK,pPOK)
IV = 0
NV = 0
PV = 0xb90740 "0e0"\0
CUR = 3
LEN = 16
$ perl -MDevel::Peek -e'$x= "0 but true"; $y= $x + 0.0; Dump($x)'
SV = PVIV(0x17bb980) at 0x17c1350
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 0
PV = 0x17b0750 "0 but true"\0
CUR = 10
LEN = 16
What I think is both possible and desirable is to change how this
works so that SVf_IOK is *only* set when the scalar got its number
from an SvIV_set(), or when it is the result of a canonical conversion
from a string. Leading whitespace would leave the scalar as SVp_IOK,
but NOT SVf_IOK. Similarly SVp_IOK would be set for "0 but true", but
not SVf_IOK. (Many conversions do work like this, just not all of
them.)
The rule for whether a scalar has a "usable IV" field would be to
check if the SV was SVp_IOK. The rule to check whether a scalar was a
"proper IV" would be to check if it is SVf_IOK.
This logic would also apply to NVs.
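The proposed rule can be sketched in C as follows. This is a toy model
under stated assumptions, not a patch against sv.c: the TOY_ bit
values and all function names are hypothetical, and "canonical" is
assumed here to mean "the string is exactly what printing the integer
would produce", which the proposal itself does not spell out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy bit values for this sketch only, not perl's real ones. */
#define TOY_SVf_IOK (1u << 0)
#define TOY_SVp_IOK (1u << 3)

/* Assumption: a string is a "canonical" integer when formatting the
 * parsed value gives back exactly the same string. " 1 ", "00",
 * "0e0" and "0 but true" all fail this test. */
static int toy_is_canonical_iv(const char *s)
{
    char buf[32];
    long v;

    assert(s != NULL);
    v = strtol(s, NULL, 10);
    snprintf(buf, sizeof buf, "%ld", v);
    return strcmp(buf, s) == 0;
}

/* Flags after numifying a string under the proposed rule: the
 * private bit is always set, the public bit only when the
 * conversion was canonical. */
static unsigned toy_flags_after_numify(const char *s)
{
    unsigned flags = TOY_SVp_IOK;
    if (toy_is_canonical_iv(s))
        flags |= TOY_SVf_IOK;
    return flags;
}

/* Flags after a direct integer assignment (the SvIV_set() case in
 * the proposal): both bits set. */
static unsigned toy_flags_after_iv_set(void)
{
    return TOY_SVf_IOK | TOY_SVp_IOK;
}

/* "Usable IV": the private bit is set. */
static int toy_has_usable_iv(unsigned flags)
{
    return (flags & TOY_SVp_IOK) != 0;
}

/* "Proper IV": the public bit is set. */
static int toy_is_proper_iv(unsigned flags)
{
    return (flags & TOY_SVf_IOK) != 0;
}
```

Under this model a serializer could call toy_is_proper_iv() to decide
whether to emit a number, while arithmetic code would only need
toy_has_usable_iv().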
So, am I missing some deep reason why we can't sanitize these flags?
The meaning of the "private" flags is essentially undocumented, and
they are almost always set according to the rules above anyway, so do
we think it would break anything to sanitize them?
Thanks for your help,
cheers,
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"