oct() and hex()
From:
Nicholas Clark
Date:
August 31, 2001 15:14
Subject:
oct() and hex()
Message ID:
20010831231413.J4950@plum.flirble.org
With pp_divide and pp_modulo now preserving UVs when UVs are larger than the
NV mantissa, as far as I know the only 3 operators left that lose bits via
NVs are oct, hex and unpack "%64". For example:
    perl -le 'printf "%x\n", $_ foreach (0x123456789abcdef, hex "123456789abcdef")'
    123456789abcdef
    123456789abcdf0
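The same loss can be reproduced outside perl. What follows is a minimal C sketch, assuming an IEEE 754 double (53-bit mantissa) stands in for the NV and uint64_t for the UV; the helper name is made up for illustration and is not part of the perl source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: push a 64-bit unsigned value through a double and
 * back, as happens when a scanned number is returned as an NV and then
 * used as a UV.  Assumes an IEEE 754 double with a 53-bit mantissa. */
static uint64_t round_trip_via_double(uint64_t uv)
{
    /* volatile forces the value to be stored as a real double rather than
     * kept in a wider floating-point register. */
    volatile double nv = (double)uv;   /* rounds to 53 significant bits */

    /* A value that rounds up to 2**64 would make the cast back undefined. */
    assert(nv < 18446744073709551616.0);
    return (uint64_t)nv;
}
```

On such a platform, 0x123456789abcdef needs 57 bits, so the low four bits are rounded away and the call returns 0x123456789abcdf0, matching the second line of output above.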
Underlying hex and oct are 3 functions, scan_bin, scan_oct, and scan_hex that
currently return their result in an NV. They are prototyped like this:
    NV
    Perl_scan_bin(pTHX_ char *start, STRLEN len, STRLEN *retlen)

    NV
    Perl_scan_oct(pTHX_ char *start, STRLEN len, STRLEN *retlen)

    NV
    Perl_scan_hex(pTHX_ char *start, STRLEN len, STRLEN *retlen)
and are currently virtually undocumented, apart from this cross reference
in perlclib.pod:
Notice also the C<scan_bin>, C<scan_hex>, and C<scan_oct> functions in
F<util.c> for converting strings representing numbers in the respective
bases into C<NV>s.
[patch at end, as they are now in numeric.c]
The calling convention is fairly self-explanatory, except for retlen, which
does double duty: on entry *retlen is expected to be 1 to allow underscores in
the number or 0 to disallow them, and on return it holds the length scanned.
A typical call (e.g. in regcomp.c) is:
    numlen = 1;    /* allow underscores */
    ender = (UV)scan_hex(p + 1, e - p - 1, &numlen);
and actually all calls apart from pp_oct and pp_hex currently immediately cast
back to UV. Worth noting is that the (UV) cast is undefined behaviour for any
NV >= (UV_MAX + 1) or <= -1, which is relevant below.
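To make the undefined range concrete, here is a sketch of the check a caller would need before the cast, assuming a 64-bit UV (uint64_t) and an IEEE 754 double as the NV. The function name and the clamp-to-the-nearest-limit policy are illustrative assumptions, not anything in the perl source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range-checked NV-to-UV conversion.
 * In C, converting a double to an unsigned integer type is defined only
 * when the truncated value fits, i.e. for -1 < nv < 2**64 with a 64-bit
 * type.  Outside that range this sets *out_of_range and returns a clamped
 * value instead of performing the cast. */
static uint64_t nv_to_uv_checked(double nv, int *out_of_range)
{
    const double two_to_64 = 18446744073709551616.0;  /* UV_MAX + 1, exact */

    assert(out_of_range != NULL);
    *out_of_range = 0;

    if (nv != nv) {               /* NaN compares unequal to itself */
        *out_of_range = 1;
        return 0;
    }
    if (nv >= two_to_64) {        /* too large for a 64-bit UV */
        *out_of_range = 1;
        return UINT64_MAX;
    }
    if (nv <= -1.0) {             /* truncation would give a negative value */
        *out_of_range = 1;
        return 0;
    }
    return (uint64_t)nv;          /* defined: truncates toward zero */
}
```

Note that the upper bound is written as 2**64 rather than (double)UINT64_MAX, because UINT64_MAX itself is not representable in a double and rounds up to 2**64.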
The requirement seems to be for scanning functions that take a (pointer, length)
pair to define a region of memory to scan, flags (currently only underscores
(dis)?allowed) and return a number. Currently the number is returned as an NV.
I'm proposing to provide three new scanning functions (names?) to return the
result either as an NV or a UV. This will actually avoid the undefined
behaviour for pathologically long hex values on platforms where
sizeof(UV) >= sizeof(NV). scan_hex etc. will remain for binary
compatibility, and will be implemented with calls to the three new functions.
The current internal implementation actually starts off using UVs, and flips
over to NVs if the UVs overflow, casting the UV to NV on return if it has
not overflowed.
I'm proposing an API like this:
    UV
    Perl_grok_hex(pTHX_ char *start, STRLEN *len, I32 *flags, NV *result)

entry
    start   is the address to scan (as before)
    *len    is the length to scan (as before, but now passed as a pointer)
    *flags  are flags to affect the scan (currently only underscores)
    result  is a pointer to NV or NULL.
return
    *len    is the length of scanned string (currently retlen)
    *flags  are result flags (initially only result_overflowed_uv)
    *result is the value, only if non_null and result_overflowed_uv is true
and the function returns the scanned number (if it did not overflow) or
UV_MAX if result_overflowed_uv is true.
[Returning UV_MAX matches one possible outcome of the undefined (UV) cast in
regcomp.c. (SIGFPE is another...)]
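To check that the interface described above hangs together, here is a standalone sketch in plain C, with uint64_t standing in for UV, double for NV, size_t for STRLEN and int for I32. The SKETCH_ flag names and values, the helper names, and the rule that an underscore is skipped only when a hex digit follows it are all assumptions made for illustration; this is not the perl implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the proposed flags; names and values are assumptions. */
#define SKETCH_ALLOW_UNDERSCORES    0x01   /* input flag  */
#define SKETCH_GREATER_THAN_UV_MAX  0x02   /* result flag */

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Entry:  *len is the number of bytes to scan, *flags the input flags.
 * Return: *len is the number of bytes consumed, *flags the result flags.
 * On overflow the function returns UINT64_MAX and, if result is non-NULL,
 * stores the (possibly rounded) value in *result. */
static uint64_t sketch_grok_hex(const char *start, size_t *len, int *flags,
                                double *result)
{
    uint64_t uv = 0;
    double nv = 0.0;
    int overflowed = 0;
    int allow_underscores;
    size_t i;

    assert(start != NULL && len != NULL && flags != NULL);
    allow_underscores = (*flags & SKETCH_ALLOW_UNDERSCORES) != 0;

    for (i = 0; i < *len; i++) {
        int digit = hex_digit_value(start[i]);

        if (digit < 0) {
            /* Skip an underscore only when a hex digit follows it. */
            if (allow_underscores && start[i] == '_' && i + 1 < *len
                && hex_digit_value(start[i + 1]) >= 0)
                continue;
            break;                    /* first byte that is not part of the number */
        }
        if (!overflowed) {
            if (uv <= (UINT64_MAX >> 4)) {
                uv = (uv << 4) | (uint64_t)digit;   /* still exact */
                continue;
            }
            /* The next shift would drop bits: switch to floating point. */
            overflowed = 1;
            nv = (double)uv;
        }
        nv = nv * 16.0 + digit;
    }

    *len = i;
    *flags = overflowed ? SKETCH_GREATER_THAN_UV_MAX : 0;
    if (!overflowed)
        return uv;
    if (result != NULL)
        *result = nv;
    return UINT64_MAX;
}
```

The point of the shape is that a caller who only wants a UV can pass NULL for result and ignore the flag, while a caller such as pp_hex can test the flag and pick whichever of the two values is exact.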
The regcomp.c code becomes
    numlen = e - p - 1;
    flags = PERL_SCAN_ALLOW_UNDERSCORES;
    ender = grok_hex(p + 1, &numlen, &flags, NULL);
which preserves the current semantics of not actually caring if the hex value
overflows a UV.
pp_hex becomes
    PP(pp_hex)
    {
        dSP; dTARGET;
        char *tmps;
        I32 flags = PERL_SCAN_ALLOW_UNDERSCORES;
        STRLEN len;
        NV resultd;
        UV resultu;

        tmps = (SvPVx(POPs, len));
        resultu = grok_hex (tmps, &len, &flags, &resultd);
        if (flags & PERL_SCAN_GREATER_THAN_UV_MAX) {
            /* overflowed a UV: only the NV holds the value */
            XPUSHn(resultd);
        }
        else {
            /* fits in a UV: push it as an unsigned integer, no bits lost */
            XPUSHu(resultu);
        }
        RETURN;
    }
and backwards compatibility is via
    NV
    Perl_scan_hex(pTHX_ char *start, STRLEN len, STRLEN *retlen)
    {
        NV rnv;
        I32 flags = *retlen ? PERL_SCAN_ALLOW_UNDERSCORES : 0;
        UV ruv = grok_hex (start, &len, &flags, &rnv);

        *retlen = len;
        return (flags & PERL_SCAN_GREATER_THAN_UV_MAX) ? rnv : (NV)ruv;
    }
[in worst Usenet tradition none of the above has been run through a compiler.]
Comments? Suggestions?
Nicholas Clark
--- pod/perlclib.pod.orig Tue Feb 13 02:30:12 2001
+++ pod/perlclib.pod Fri Aug 31 22:39:23 2001
@@ -166,7 +166,7 @@
strtoul(s, *p, n) Strtoul(s, *p, n)
Notice also the C<scan_bin>, C<scan_hex>, and C<scan_oct> functions in
-F<util.c> for converting strings representing numbers in the respective
+F<numeric.c> for converting strings representing numbers in the respective
bases into C<NV>s.
In theory C<Strtol> and C<Strtoul> may not be defined if the machine perl is