develooper Front page | perl.perl5.porters | Postings from December 2000

Re: pp_add -> pp_i_add efficiency hack?

Thread Previous | Thread Next
From:
Nicholas Clark
Date:
December 3, 2000 13:43
Subject:
Re: pp_add -> pp_i_add efficiency hack?
Message ID:
20001203214346.A79486@plum.flirble.org
On Sun, Dec 03, 2000 at 03:23:24PM -0600, Jarkko Hietaniemi wrote:
> Could you perchance produce a combined patch?

Hopefully this is clean against 7953, although I was working on edited
7853.

I didn't include the pp.c diffs for integer pp_multiply as I didn't have
them enabled for testing at any point. They may not even compile for all
I know :-)

Also, it's your pp_add overflow detection algorithm, not that one I suggested
(partially implemented, tested and then wiped with a mis-timed rsync)
But I'll try mine at some point soon, and try benchmarking.

Nicholas Clark

--- sv.c.orig	Wed Nov 22 23:58:27 2000
+++ sv.c	Sun Dec  3 19:39:54 2000
@@ -1456,16 +1456,193 @@
 		    "Argument \"%s\" isn't numeric", tmpbuf);
 }
 
-/* the number can be converted to integer with atol() or atoll() */
-#define IS_NUMBER_TO_INT_BY_ATOL 0x01
-#define IS_NUMBER_TO_INT_BY_ATOF 0x02 /* atol() may be != atof() */
-#define IS_NUMBER_NOT_IV	 0x04 /* (IV)atof() may be != atof() */
-#define IS_NUMBER_NEG		 0x08 /* not good to cache UV */
-#define IS_NUMBER_INFINITY	 0x10 /* this is big */
+/* the number can be converted to integer with atol() or atoll() although */
+#define IS_NUMBER_TO_INT_BY_ATOL     0x01 /* it may exceed IV_MAX */
+#define IS_NUMBER_TO_INT_BY_STRTOL   0x02 /* seen something like 123e4 */
+#define IS_NUMBER_TO_INT_BY_ATOF     0x04 /* seen something like 123e4 */
+#define IS_NUMBER_LONGER_THAN_IV_MAX 0x08 /* more decimal digits than IV_MAX */
+#define IS_NUMBER_AS_LONG_AS_IV_MAX  0x10 /* may(be not) larger than IV_MAX */
+#define IS_NUMBER_NOT_INT	     0x20 /* seen a decimal point */
+#define IS_NUMBER_NEG		     0x40 /* seen a leading - */
+#define IS_NUMBER_INFINITY	     0x80 /* /^\s*-?Infinity\s*$/i */
 
 /* Actually, ISO C leaves conversion of UV to IV undefined, but
    until proven guilty, assume that things are not that bad... */
 
+/* As 64 bit platforms often have an NV that doesn't preserve all bits of
+   an IV (an assumption perl has been based on to date) it becomes necessary
+   to remove the assumption that the NV always carries enough precision to
+   recreate the IV whenever needed, and that the NV is the canonical form.
+   Instead, IV/UV and NV need to be given equal rights. So as to not lose
+   precision as an side effect of conversion (which would lead to insanity
+   and the dragon(s) in t/op/numconvert.t getting very angry) the intent is
+   1) to distinguish between IV/UV/NV slots that have cached a valid
+      conversion where precision was lost and IV/UV/NV slots that have a
+      valid conversion which has lost no precision
+   2) to ensure that if a numeric conversion to one form is request that
+      would lose precision, the precise conversion (or differently
+      imprecise conversion) is also performed and cached, to prevent
+      requests for different numeric formats on the same SV causing
+      lossy conversion chains. (lossless conversion chains are perfectly
+      acceptable (still))
+
+
+   flags are used:
+   SvIOKp is true if the IV slot contains a valid value
+   SvIOK  is true only if the IV value is accurate (UV if SvIOK_UV true)
+   SvNOKp is true if the NV slot contains a valid value
+   SvNOK  is true only if the NV value is accurate
+
+   so
+   while converting from PV to NV check to see if converting that NV to an
+   IV(or UV) would lose accuracy over a direct conversion from PV to
+   IV(or UV). If it would, cache both conversions, return NV, but mark
+   SV as IOK NOKp (ie not NOK).
+
+   while converting from PV to IV check to see if converting that IV to an
+   NV would lose accuracy over a direct conversion from PV to NV. If it
+   would, cache both conversions, flag similarly.
+
+   Before, the SV value "3.2" could become NV=3.2 IV=3 NOK, IOK quite
+   correctly because if IV & NV were set NV *always* overruled.
+   Now, "3.2" will become NV=3.2 IV=3 NOK, IOKp, because the flags meaning
+   changes - now IV and NV together means that the two are interchangeable
+   SvIVX == (IV) SvNVX && SvNVX == (NV) SvIVX;
+   
+   The benefit of this is operations such as pp_add know that if SvIOK is
+   true for both left and right operands, then integer addition can be
+   used instead of floating point. (for cases where the result won't
+   overflow) Before, floating point was always used, which could lead to
+   loss of precision compared with integer addition.
+
+   * making IV and NV equal status should make maths accurate on 64 bit
+     platforms
+   * may speed up maths somewhat if pp_add and friends start to use
+     integers when possible instead of fp. (hopefully the overhead in
+     looking for SvIOK and checking for overflow will not outweigh the
+     fp to integer speedup)
+   * will slow down integer operations (callers of SvIV) on "inaccurate"
+     values, as the change from SvIOK to SvIOKp will cause a call into
+     sv_2iv each time rather than a macro access direct to the IV slot
+   * should speed up number->string conversion on integers as IV is
+     favoured when IV and NV equally accurate
+
+   ####################################################################
+   You had better be using SvIOK_notUV if you want an IV for arithmetic
+   SvIOK is true if (IV or UV), so you might be getting (IV)SvUV
+   SvUOK is true iff UV.
+   ####################################################################
+
+   Your mileage will vary depending your CPUs relative fp to integer
+   performance ratio.
+*/
+
+#ifndef NV_PRESERVES_UV
+#define IS_NUMBER_UNDERFLOW_IV 1
+#define IS_NUMBER_UNDERFLOW_UV 2
+#define IS_NUMBER_IV_AND_UV 2
+#define IS_NUMBER_OVERFLOW_IV 4
+#define IS_NUMBER_OVERFLOW_UV 5
+/* Hopefully your optimiser will consider inlining these two functions.  */
+STATIC int
+S_sv_2inuv_non_preserve (pTHX_ register SV *sv, I32 numtype) {
+    DEBUG_c(PerlIO_printf(Perl_debug_log,"sv_2inuv_non '%s', IV=0x%"UVxf" NV=%g inttype=%X\n", SvPVX(sv), SvIVX(sv), SvNVX(sv), numtype));
+    if ((UV)SvNVX(sv) <= (UV)IV_MAX) {
+	(void)SvIOKp_on(sv);
+	(void)SvNOKp_on(sv);
+	/* Within suitable range to fit in an IV,  atol won't overflow */
+	/* XXX quite sure? Is that your final answer?  */
+	SvIVX(sv) = (IV)Atol(SvPVX(sv));
+	if (numtype & IS_NUMBER_NOT_INT) {
+	    /* I believe that even if the original PV had decimals, they
+	       are lost beyond the limit of the FP precision.
+	       However, neither is canonical, so both only get p flags.
+	       NWC, 2000/11/25 */
+	    /* Both already have p flags, so do nothing */
+	} else if (SvIVX(sv) == I_V(SvNVX(sv))) {
+	    SvNOK_on(sv);
+	    SvIOK_on(sv);
+	} else {
+	    SvIOK_on(sv);
+	    /* It had no "." so it must be integer.  assert (get in here from
+	       sv_2iv and sv_2uv only for ndef HAS_STRTOL and
+	       IS_NUMBER_AS_LONG_AS_IV_MAX) or my logic is faulty and all
+	       conversion routines need audit.  */
+	}
+	return SvNVX(sv) < 0 ? IS_NUMBER_UNDERFLOW_UV : IS_NUMBER_IV_AND_UV;
+    }
+    /* between IV_MAX and NV(UV_MAX). Could be slightly> UV_MAX */
+    (void)SvIOKp_on(sv);
+    (void)SvNOKp_on(sv);
+#ifdef HAS_STRTOUL
+    SvUVX(sv) = Strtoul(SvPVX(sv), Null(char**), 10);
+    if (numtype & IS_NUMBER_NOT_INT) {
+	SvIsUV_on(sv);
+    } else if (SvUVX(sv) == U_V(SvNVX(sv)) && SvUVX(sv) != UV_MAX) {
+	/* UV_MAX can cast up to NV (UV_MAX+1), that NV casts down to UV_MAX */
+	SvNOK_on(sv);
+	SvIOK_on(sv);
+	SvIsUV_on(sv);
+    } else {
+	/* As above, I believe UV at least as good as NV */
+	SvIOK_on(sv);
+	SvIsUV_on(sv);
+    }
+#else
+    /* We've just lost integer precision, nothing we could do.
+       UV and NV slots equally valid. */
+    SvNOK_on(sv); /* This can't have decimals */
+    SvIOK_on(sv); /* Should this be only p if IS_NUMBER_NOT_INT?? */
+    SvIsUV_on(sv);
+    SvUVX(sv) = U_V(SvNVX(sv));
+#endif /* HAS_STRTOUL */
+    return IS_NUMBER_OVERFLOW_IV;
+}
+
+/* For sv_2nv these three cases are "SvNOK and don't bother casting"  */
+S_sv_2iuv_non_preserve (pTHX_ register SV *sv, I32 numtype)
+{
+    DEBUG_c(PerlIO_printf(Perl_debug_log,"sv_2iuv_non '%s', IV=0x%"UVxf" NV=%g inttype=%X\n", SvPVX(sv), SvIVX(sv), SvNVX(sv), numtype));
+    if (SvNVX(sv) < (NV)IV_MIN) {
+	(void)SvIOKp_on(sv);
+	(void)SvNOK_on(sv);
+	SvIVX(sv) = IV_MIN;
+	return IS_NUMBER_UNDERFLOW_IV;
+    }
+    if (SvNVX(sv) > (NV)UV_MAX) {
+	(void)SvIOKp_on(sv);
+	(void)SvNOK_on(sv);
+	SvIsUV_on(sv);
+	SvUVX(sv) = UV_MAX;
+	return IS_NUMBER_OVERFLOW_UV;
+    }
+    if (!(numtype & (IS_NUMBER_TO_INT_BY_ATOL | IS_NUMBER_TO_INT_BY_STRTOL))) {
+	(void)SvIOKp_on(sv);
+	(void)SvNOK_on(sv);
+	/* Can't use strtol etc to convert this string */
+	if (SvNVX(sv) <= (UV)IV_MAX) {
+	    SvIVX(sv) = I_V(SvNVX(sv));
+	    if ((NV)(SvIVX(sv)) == SvNVX(sv)) {
+		SvIOK_on(sv); /* Integer is precise. NOK, IOK */
+	    } else {
+		/* Integer is imprecise. NOK, IOKp */
+	    }
+	    return SvNVX(sv) < 0 ? IS_NUMBER_UNDERFLOW_UV : IS_NUMBER_IV_AND_UV;
+	}
+	SvIsUV_on(sv);
+	SvUVX(sv) = U_V(SvNVX(sv));
+	if ((NV)(SvUVX(sv)) == SvNVX(sv)) {
+	    SvIOK_on(sv); /* Integer is precise. NOK, UOK */
+	} else {
+	    /* Integer is imprecise. NOK, IOKp */
+	}
+	return IS_NUMBER_OVERFLOW_IV;
+    }
+    return S_sv_2inuv_non_preserve (sv, numtype);
+}
+#endif /* NV_PRESERVES_UV*/
+
+
 IV
 Perl_sv_2iv(pTHX_ register SV *sv)
 {
@@ -1516,19 +1693,54 @@
 	}
     }
     if (SvNOKp(sv)) {
-	/* We can cache the IV/UV value even if it not good enough
-	 * to reconstruct NV, since the conversion to PV will prefer
-	 * NV over IV/UV.
-	 */
+	/* erm. not sure. *should* never get NOKp (without NOK) from sv_2nv
+	 * without also getting a cached IV/UV from it at the same time
+	 * (ie PV->NV conversion should detect loss of accuracy and cache
+	 * IV or UV at same time to avoid this.  NWC */
 
 	if (SvTYPE(sv) == SVt_NV)
 	    sv_upgrade(sv, SVt_PVNV);
 
-	(void)SvIOK_on(sv);
-	if (SvNVX(sv) < (NV)IV_MAX + 0.5)
+	(void)SvIOKp_on(sv);	/* Must do this first, to clear any SvOOK */
+	if (SvNVX(sv) < (NV)IV_MAX + 0.5) {
 	    SvIVX(sv) = I_V(SvNVX(sv));
+	    if (SvNVX(sv) == (NV) SvIVX(sv)) {
+		SvIOK_on(sv);  /* Can this go wrong with rounding? NWC */
+		DEBUG_c(PerlIO_printf(Perl_debug_log,
+				      "0x%"UVxf" iv(%g => %"IVdf") (precise)\n",
+				      PTR2UV(sv),
+				      SvNVX(sv),
+				      SvIVX(sv)));
+
+	    } else {
+		/* IV not precise.  No need to convert from PV, as NV
+		   conversion would already have cached IV if it detected
+		   that PV->IV would be better than PV->NV->IV
+		   flags already correct - don't set public IOK.  */
+		DEBUG_c(PerlIO_printf(Perl_debug_log,
+				      "0x%"UVxf" iv(%g => %"IVdf") (imprecise)\n",
+				      PTR2UV(sv),
+				      SvNVX(sv),
+				      SvIVX(sv)));
+	    }
+	    /* Can the above go wrong if SvIVX == IV_MIN and SvNVX < IV_MIN,
+	       but the cast (NV)IV_MIN rounds to a the value less (more
+	       negative) than IV_MIN which happens to be equal to SvNVX ??
+	       Analogous to 0xFFFFFFFFFFFFFFFF rounding up to NV (2**64) and
+	       NV rounding back to 0xFFFFFFFFFFFFFFFF, so UVX == UV(NVX) and
+	       (NV)UVX == NVX are both true, but the values differ. :-(
+	       Hopefully for 2s complement IV_MIN is something like
+	       0x8000000000000000 which will be exact. NWC */
+	} 
 	else {
 	    SvUVX(sv) = U_V(SvNVX(sv));
+	    if (
+#ifndef  NV_PRESERVES_UV
+		/* Make sure it's not 0xFFFFFFFFFFFFFFFF */
+		(SvUVX(sv) != UV_MAX) &&
+#endif
+		(SvNVX(sv) == (NV) SvUVX(sv)))
+		SvIOK_on(sv);
 	    SvIsUV_on(sv);
 	  ret_iv_max:
 	    DEBUG_c(PerlIO_printf(Perl_debug_log,
@@ -1548,46 +1760,109 @@
 	
 	   This means that if we cache such an IV, we need to cache the
 	   NV as well.  Moreover, we trade speed for space, and do not
-	   cache the NV if not needed.
+	   cache the NV if we are sure it's not needed.
 	 */
-	if (numtype & IS_NUMBER_NOT_IV) {
-	    /* May be not an integer.  Need to cache NV if we cache IV
-	     * - otherwise future conversion to NV will be wrong.  */
-	    NV d;
-
-	    d = Atof(SvPVX(sv));
-
-	    if (SvTYPE(sv) < SVt_PVNV)
-		sv_upgrade(sv, SVt_PVNV);
-	    SvNVX(sv) = d;
-	    (void)SvNOK_on(sv);
+
+	if ((numtype & ~IS_NUMBER_NEG) == IS_NUMBER_TO_INT_BY_ATOL) {
+	    /* The NV may be reconstructed from IV - safe to cache IV,
+		which may be calculated by atol(). */
+	    if (SvTYPE(sv) < SVt_PVIV)
+		sv_upgrade(sv, SVt_PVIV);
 	    (void)SvIOK_on(sv);
+	    SvIVX(sv) = Atol(SvPVX(sv));
+	} else {
+#ifdef HAS_STRTOL
+	    IV i;
+	    int save_errno = errno;
+	    /* Is it an integer that we could convert with strtol?
+	       So try it, and if it doesn't set errno then it's pukka.
+	       This should be faster than going atof and then thinking.  */
+	    if ((numtype & (IS_NUMBER_TO_INT_BY_STRTOL | IS_NUMBER_NOT_INT)
+		 == IS_NUMBER_TO_INT_BY_STRTOL)
+		&& (errno = 0, i = Strtol(SvPVX(sv), Null(char**), 10),
+		    errno == 0)) {
+		if (SvTYPE(sv) < SVt_PVIV)
+		    sv_upgrade(sv, SVt_PVIV);
+		(void)SvIOK_on(sv);
+		SvIVX(sv) = i;
+		errno = save_errno;
+	    } else {
+		NV d;
+		/* Hopefully trace flow will optimise this away where possible
+		 */
+		errno = save_errno;
+#else
+		NV d;
+#endif
+		/* It wasn't an integer, or it overflowed, or we don't have
+		   strtol. Do things the slow way - check if it's a UV etc. */
+		d = Atof(SvPVX(sv));
+
+		if (SvTYPE(sv) < SVt_PVNV)
+		    sv_upgrade(sv, SVt_PVNV);
+		SvNVX(sv) = d;
+
+		if (! numtype && ckWARN(WARN_NUMERIC))
+		    not_a_number(sv);
+
 #if defined(USE_LONG_DOUBLE)
-	    DEBUG_c(PerlIO_printf(Perl_debug_log, "0x%"UVxf" 2nv(%" PERL_PRIgldbl ")\n",
-				  PTR2UV(sv), SvNVX(sv)));
+		DEBUG_c(PerlIO_printf(Perl_debug_log, "0x%"UVxf" 2iv(%" PERL_PRIgldbl ")\n",
+				      PTR2UV(sv), SvNVX(sv)));
 #else
-	    DEBUG_c(PerlIO_printf(Perl_debug_log, "0x%"UVxf" 2nv(%g)\n",
-				  PTR2UV(sv), SvNVX(sv)));
+		DEBUG_c(PerlIO_printf(Perl_debug_log, "0x%"UVxf" 2iv(%g)\n",
+				      PTR2UV(sv), SvNVX(sv)));
 #endif
-	    if (SvNVX(sv) < (NV)IV_MAX + 0.5)
-		SvIVX(sv) = I_V(SvNVX(sv));
-	    else {
-		SvUVX(sv) = U_V(SvNVX(sv));
-		SvIsUV_on(sv);
-		goto ret_iv_max;
+
+
+#ifdef NV_PRESERVES_UV
+		(void)SvIOKp_on(sv);
+		(void)SvNOK_on(sv);
+		if (SvNVX(sv) < (NV)IV_MAX + 0.5) {
+		    SvIVX(sv) = I_V(SvNVX(sv));
+		    if ((NV)(SvIVX(sv)) == SvNVX(sv)) {
+			SvIOK_on(sv);
+		    } else {
+			/* Integer is imprecise. NOK, IOKp */
+		    }
+		    /* UV will not work better than IV */
+		} else {
+		    if (SvNVX(sv) > (NV)UV_MAX) {
+			SvIsUV_on(sv);
+			/* Integer is inaccurate. NOK, IOKp, is UV */
+			SvUVX(sv) = UV_MAX;
+			SvIsUV_on(sv);
+		    } else {
+			SvUVX(sv) = U_V(SvNVX(sv));
+			/* 0xFFFFFFFFFFFFFFFF not an issue in here */
+			if ((NV)(SvUVX(sv)) == SvNVX(sv)) {
+			    SvIOK_on(sv);
+			    SvIsUV_on(sv);
+			} else {
+			    /* Integer is imprecise. NOK, IOKp, is UV */
+			    SvIsUV_on(sv);
+			}
+		    }
+		    goto ret_iv_max;
+		}
+#else /* NV_PRESERVES_UV */
+		if (((UV)1 << NV_PRESERVES_UV_BITS) >
+		    (UV) (SvNVX(sv) > 0 ? SvNVX(sv) : -SvNVX(sv))) {
+		    /* Small enough to preserve all bits. */
+		    (void)SvIOKp_on(sv);
+		    SvNOK_on(sv);
+		    SvIVX(sv) = I_V(SvNVX(sv));
+		    if ((NV)(SvIVX(sv)) == SvNVX(sv))
+			SvIOK_on(sv);
+		    /* Assumption: first non-preserved integer is < IV_MAX,
+		       this NV is in the preserved range, therefore: */
+		    assert (U_V(SvNVX(sv)) < (UV)IV_MAX);
+		} else if (sv_2iuv_non_preserve (sv, numtype)
+			   >= IS_NUMBER_OVERFLOW_IV)
+		    goto ret_iv_max;
+#endif /* NV_PRESERVES_UV */
 	    }
 	}
-	else {	/* The NV may be reconstructed from IV - safe to cache IV,
-		   which may be calculated by atol(). */
-	    if (SvTYPE(sv) < SVt_PVIV)
-		sv_upgrade(sv, SVt_PVIV);
-	    (void)SvIOK_on(sv);
-	    SvIVX(sv) = Atol(SvPVX(sv));
-	    if (! numtype && ckWARN(WARN_NUMERIC))
-		not_a_number(sv);
-	}
-    }
-    else  {
+    } else  {
 	dTHR;
 	if (ckWARN(WARN_UNINITIALIZED) && !PL_localizing && !(SvFLAGS(sv) & SVs_PADTMP))
 	    report_uninit();
@@ -1647,19 +1922,34 @@
 	}
     }
     if (SvNOKp(sv)) {
-	/* We can cache the IV/UV value even if it not good enough
-	 * to reconstruct NV, since the conversion to PV will prefer
-	 * NV over IV/UV.
-	 */
+	/* erm. not sure. *should* never get NOKp (without NOK) from sv_2nv
+	 * without also getting a cached IV/UV from it at the same time
+	 * (ie PV->NV conversion should detect loss of accuracy and cache
+	 * IV or UV at same time to avoid this.  NWC */
 	if (SvTYPE(sv) == SVt_NV)
 	    sv_upgrade(sv, SVt_PVNV);
-	(void)SvIOK_on(sv);
+	(void)SvIOKp_on(sv);
 	if (SvNVX(sv) >= -0.5) {
 	    SvIsUV_on(sv);
 	    SvUVX(sv) = U_V(SvNVX(sv));
+	    if (
+#ifndef  NV_PRESERVES_UV
+		/* Make sure it's not 0xFFFFFFFFFFFFFFFF */
+		(SvUVX(sv) != UV_MAX) &&
+#endif
+		SvNVX(sv) == (NV) SvUVX(sv))
+		SvIOK_on(sv);
+	    else {
+		/* UV not precise.  No need to convert from PV, as NV
+		   conversion would already have cached UV if it detected
+		   that PV->UV would be better than PV->NV->UV
+		   flags already correct - don't set public IOK.  */
+	    }
 	}
 	else {
 	    SvIVX(sv) = I_V(SvNVX(sv));
+	    if (SvNVX(sv) == (NV) SvIVX(sv))
+		SvIOK_on(sv);
 	  ret_zero:
 	    DEBUG_c(PerlIO_printf(Perl_debug_log,
 				  "0x%"UVxf" 2uv(%"IVdf" => %"IVdf") (as signed)\n",
@@ -1680,68 +1970,102 @@
 	   NV as well.  Moreover, we trade speed for space, and do not
 	   cache the NV if not needed.
 	 */
-	if (numtype & IS_NUMBER_NOT_IV) {
-	    /* May be not an integer.  Need to cache NV if we cache IV
-	     * - otherwise future conversion to NV will be wrong.  */
-	    NV d;
-
-	    d = Atof(SvPVX(sv));
-
-	    if (SvTYPE(sv) < SVt_PVNV)
-		sv_upgrade(sv, SVt_PVNV);
-	    SvNVX(sv) = d;
-	    (void)SvNOK_on(sv);
-	    (void)SvIOK_on(sv);
-#if defined(USE_LONG_DOUBLE)
-	    DEBUG_c(PerlIO_printf(Perl_debug_log,
-				  "0x%"UVxf" 2nv(%" PERL_PRIgldbl ")\n",
-				  PTR2UV(sv), SvNVX(sv)));
-#else
-	    DEBUG_c(PerlIO_printf(Perl_debug_log,
-				  "0x%"UVxf" 2nv(%g)\n",
-				  PTR2UV(sv), SvNVX(sv)));
-#endif
-	    if (SvNVX(sv) < -0.5) {
-		SvIVX(sv) = I_V(SvNVX(sv));
-		goto ret_zero;
-	    } else {
-		SvUVX(sv) = U_V(SvNVX(sv));
-		SvIsUV_on(sv);
-	    }
-	}
-	else if (numtype & IS_NUMBER_NEG) {
+
+	if ((numtype & ~IS_NUMBER_NEG) == IS_NUMBER_TO_INT_BY_ATOL) {
 	    /* The NV may be reconstructed from IV - safe to cache IV,
-	       which may be calculated by atol(). */
-	    if (SvTYPE(sv) == SVt_PV)
-		sv_upgrade(sv, SVt_PVIV);
-	    (void)SvIOK_on(sv);
-	    SvIVX(sv) = (IV)Atol(SvPVX(sv));
-	}
-	else if (numtype) {		/* Non-negative */
-	    /* The NV may be reconstructed from UV - safe to cache UV,
-	       which may be calculated by strtoul()/atol. */
-	    if (SvTYPE(sv) == SVt_PV)
+		which may be calculated by atol(). */
+	    if (SvTYPE(sv) < SVt_PVIV)
 		sv_upgrade(sv, SVt_PVIV);
 	    (void)SvIOK_on(sv);
-	    (void)SvIsUV_on(sv);
+	    SvIVX(sv) = Atol(SvPVX(sv));
+	} else {
 #ifdef HAS_STRTOUL
-	    SvUVX(sv) = Strtoul(SvPVX(sv), Null(char**), 10);
-#else			/* no atou(), but we know the number fits into IV... */
-	    		/* The only problem may be if it is negative... */
-	    SvUVX(sv) = (UV)Atol(SvPVX(sv));
+	    UV u;
+	    int save_errno = errno;
+	    /* Is it an integer that we could convert with strtoul?
+	       So try it, and if it doesn't set errno then it's pukka.
+	       This should be faster than going atof and then thinking.  */
+	    if ((numtype & (IS_NUMBER_TO_INT_BY_STRTOL | IS_NUMBER_NOT_INT)
+		 == IS_NUMBER_TO_INT_BY_STRTOL)
+		&& (errno = 0, u = Strtoul(SvPVX(sv), Null(char**), 10),
+		    errno == 0)) {
+		if (SvTYPE(sv) < SVt_PVIV)
+		    sv_upgrade(sv, SVt_PVIV);
+		(void)SvIOK_on(sv);
+		SvIsUV_on(sv);
+		SvIVX(sv) = u;
+		errno = save_errno;
+	    } else {
+		NV d;
+		/* Hopefully trace flow will optimise this away where possible
+		 */
+		errno = save_errno;
+#else
+		NV d;
 #endif
-	}
-	else {				/* Not a number.  Cache 0. */
-	    dTHR;
+		/* It wasn't an integer, or it overflowed, or we don't have
+		   strtol. Do things the slow way - check if it's a IV etc. */
+		d = Atof(SvPVX(sv));
+
+		if (SvTYPE(sv) < SVt_PVNV)
+		    sv_upgrade(sv, SVt_PVNV);
+		SvNVX(sv) = d;
 
-	    if (SvTYPE(sv) < SVt_PVIV)
-		sv_upgrade(sv, SVt_PVIV);
-	    (void)SvIOK_on(sv);
-	    (void)SvIsUV_on(sv);
-	    SvUVX(sv) = 0;		/* We assume that 0s have the
-					   same bitmap in IV and UV. */
-	    if (ckWARN(WARN_NUMERIC))
-		not_a_number(sv);
+		if (! numtype && ckWARN(WARN_NUMERIC))
+		    not_a_number(sv);
+
+#if defined(USE_LONG_DOUBLE)
+		DEBUG_c(PerlIO_printf(Perl_debug_log, "0x%"UVxf" 2uv(%" PERL_PRIgldbl ")\n",
+				      PTR2UV(sv), SvNVX(sv)));
+#else
+		DEBUG_c(PerlIO_printf(Perl_debug_log, "0x%"UVxf" 2uv(%g)\n",
+				      PTR2UV(sv), SvNVX(sv)));
+#endif
+
+
+#ifdef NV_PRESERVES_UV
+		(void)SvIOKp_on(sv);
+		(void)SvNOK_on(sv);
+		if (SvNVX(sv) > -0.5) {
+		    SvUVX(sv) = U_V(SvNVX(sv));
+		    if ((NV)(SvUVX(sv)) == SvNVX(sv)) {
+			SvIsUV_on(sv);
+			SvIOK_on(sv);
+		    } else {
+			SvIsUV_on(sv);
+			/* Integer is imprecise. NOK, IOKp, is UV */
+		    }
+		    /* IV will not work better than UV */
+		} else {
+		    if (SvNVX(sv) < (NV)IV_MIN - 0.5) {
+			/* Integer is inaccurate. NOK, IOKp */
+			SvIVX(sv) = IV_MIN;
+		    } else {
+			SvIVX(sv) = I_V(SvNVX(sv));
+			if ((NV)(SvIVX(sv)) == SvNVX(sv)) {
+			    SvIOK_on(sv);
+			} else {
+			    /* Integer is imprecise. NOK, IOKp */
+			}
+		    }
+		    goto ret_zero;
+		}
+#else /* NV_PRESERVES_UV */
+		if (((UV)1 << NV_PRESERVES_UV_BITS) >
+		    (UV) (SvNVX(sv) > 0 ? SvNVX(sv) : -SvNVX(sv))) {
+		    /* Small enough to preserve all bits. */
+		    (void)SvIOKp_on(sv);
+		    SvNOK_on(sv);
+		    SvIsUV_on(sv);
+		    SvUVX(sv) = U_V(SvNVX(sv));
+		    /* 0xFFFFFFFFFFFFFFFF can't be an issue here */
+		    if ((NV)(SvUVX(sv)) == SvNVX(sv))
+			SvIOK_on(sv);
+		} else if (sv_2iuv_non_preserve (sv, numtype)
+			   <= IS_NUMBER_UNDERFLOW_UV)
+		    goto ret_zero;
+#endif /* NV_PRESERVES_UV */
+	    }
 	}
     }
     else  {
@@ -1834,12 +2158,53 @@
 	    (!SvPOKp(sv) || !strchr(SvPVX(sv),'.') || !looks_like_number(sv)))
     {
 	SvNVX(sv) = SvIsUV(sv) ? (NV)SvUVX(sv) : (NV)SvIVX(sv);
+#ifdef NV_PRESERVES_UV
+	SvNOK_on(sv);
+#else
+	/* Only set the public NV OK flag if this NV preserves the IV  */
+	/* Check it's not 0xFFFFFFFFFFFFFFFF */
+	if (SvIsUV(sv) ? ((SvUVX(sv) != UV_MAX)&&(SvUVX(sv) == U_V(SvNVX(sv))))
+		       : (SvIVX(sv) == I_V(SvNVX(sv))))
+	    SvNOK_on(sv);
+	else
+	    SvNOKp_on(sv);
+#endif
     }
     else if (SvPOKp(sv) && SvLEN(sv)) {
 	dTHR;
 	if (ckWARN(WARN_NUMERIC) && !SvIOKp(sv) && !looks_like_number(sv))
 	    not_a_number(sv);
 	SvNVX(sv) = Atof(SvPVX(sv));
+#ifdef NV_PRESERVES_UV
+	SvNOK_on(sv);
+#else
+	/* Only set the public NV OK flag if this NV preserves the value in
+	   the PV at least as well as an IV/UV would.
+	   Not sure how to do this 100% reliably. */
+	/* if that shift count is out of range then Configure's test is
+	   wonky. We shouldn't be in here with NV_PRESERVES_UV_BITS ==
+	   UV_BITS */
+	if (((UV)1 << NV_PRESERVES_UV_BITS) >
+	    (UV) (SvNVX(sv) > 0 ? SvNVX(sv) : -SvNVX(sv)))
+	    SvNOK_on(sv); /* Definitely small enough to preserve all bits */
+	else if (SvNVX(sv) < (NV)IV_MIN || SvNVX(sv) > (NV)UV_MAX) {
+		/* Definitely too large/small to fit in an integer, so no loss
+		   of precision going to integer in the future via NV */
+	    SvNOK_on(sv);
+	} else {
+	    /* Is it something we can run through strtol etc (ie no
+	       trailing exponent part)? */
+	    int numtype = looks_like_number(sv);
+	    /* XXX probably should cache this if called above */
+
+	    if (!(numtype &
+		  (IS_NUMBER_TO_INT_BY_ATOL | IS_NUMBER_TO_INT_BY_STRTOL))) {
+		/* Can't use strtol etc to convert this string, so don't try */
+		SvNOK_on(sv);
+	    } else
+		sv_2inuv_non_preserve (sv, numtype);
+	}
+#endif /* NV_PRESERVES_UV */
     }
     else  {
 	dTHR;
@@ -1847,10 +2212,11 @@
 	    report_uninit();
 	if (SvTYPE(sv) < SVt_NV)
 	    /* Typically the caller expects that sv_any is not NULL now.  */
+	    /* XXX Ilya implies that this is a bug in callers that assume this
+	       and ideally should be fixed.  */
 	    sv_upgrade(sv, SVt_NV);
 	return 0.0;
     }
-    SvNOK_on(sv);
 #if defined(USE_LONG_DOUBLE)
     DEBUG_c({
 	STORE_NUMERIC_LOCAL_SET_STANDARD();
@@ -1905,23 +2271,32 @@
 
 /*
  * Returns a combination of (advisory only - can get false negatives)
- * 	IS_NUMBER_TO_INT_BY_ATOL, IS_NUMBER_TO_INT_BY_ATOF, IS_NUMBER_NOT_IV,
- *	IS_NUMBER_NEG
+ * IS_NUMBER_TO_INT_BY_ATOL, IS_NUMBER_TO_INT_BY_ATOF
+ * IS_NUMBER_LONGER_THAN_IV_MAX, IS_NUMBER_AS_LONG_AS_IV_MAX
+ * IS_NUMBER_NOT_INT, IS_NUMBER_NEG, IS_NUMBER_INFINITY
  * 0 if does not look like number.
  *
- * In fact possible values are 0 and
- * IS_NUMBER_TO_INT_BY_ATOL				123
- * IS_NUMBER_TO_INT_BY_ATOL | IS_NUMBER_NOT_IV		123.1
- * IS_NUMBER_TO_INT_BY_ATOF | IS_NUMBER_NOT_IV		123e0
+ * (atol and strtol stop when they hit a decimal point. strtol will return
+ * LONG_MAX and LONG_MIN when given out of range values. ANSI says they should
+ * do this, and vendors have had 11 years to get it right.
+ * However, will try to make it still work with only atol
+ *  
+ * IS_NUMBER_TO_INT_BY_ATOL	123456789 or 123456789.3  definitely < IV_MAX
+ * IS_NUMBER_TO_INT_BY_STRTOL	123456789 or 123456789.3  if digits = IV_MAX
+ * IS_NUMBER_TO_INT_BY_ATOF	123456789e0               or >> IV_MAX
+ * IS_NUMBER_LONGER_THAN_IV_MAX	  lots of digits, don't bother with atol
+ * IS_NUMBER_AS_LONG_AS_IV_MAX	  atol might hit LONG_MAX, might not.
+ * IS_NUMBER_NOT_INT		saw "." or "e"
+ * IS_NUMBER_NEG
  * IS_NUMBER_INFINITY
- * with a possible addition of IS_NUMBER_NEG.
  */
 
 /*
 =for apidoc looks_like_number
 
 Test if an the content of an SV looks like a number (or is a
-number).
+number). C<Inf> and C<Infinity> are treated as numbers (so will not
+issue a non-numeric warning), even if your atof() doesn't grok them.
 
 =cut
 */
@@ -1959,9 +2334,10 @@
 
     nbegin = s;
     /*
-     * we return IS_NUMBER_TO_INT_BY_ATOL if the number can be converted
-     * to _integer_ with atol() and IS_NUMBER_TO_INT_BY_ATOF if you need
-     * (int)atof().
+     * we return IS_NUMBER_TO_INT_BY_ATOL if the number can converted to 
+     * integer with atol() without overflow, IS_NUMBER_TO_INT_BY_STRTOL if
+     * possibly slightly larger than max int, IS_NUMBER_TO_INT_BY_ATOF if you
+     * will need (int)atof().
      */
 
     /* next must be digit or the radix separator or beginning of infinity */
@@ -1970,8 +2346,10 @@
 	    s++;
         } while (isDIGIT(*s));
 
-	if (s - nbegin >= TYPE_DIGITS(IV))	/* Cannot cache ato[ul]() */
-	    numtype |= IS_NUMBER_TO_INT_BY_ATOF | IS_NUMBER_NOT_IV;
+	if (s - nbegin > TYPE_DIGITS(IV))	/* Cannot cache ato[ul]() */
+	    numtype |= IS_NUMBER_TO_INT_BY_ATOF | IS_NUMBER_LONGER_THAN_IV_MAX;
+	if (s - nbegin == TYPE_DIGITS(IV))	/* Can't be sure either way */
+	    numtype |= IS_NUMBER_TO_INT_BY_STRTOL | IS_NUMBER_AS_LONG_AS_IV_MAX;
 	else
 	    numtype |= IS_NUMBER_TO_INT_BY_ATOL;
 
@@ -1981,7 +2359,7 @@
 #endif
 	    ) {
 	    s++;
-	    numtype |= IS_NUMBER_NOT_IV;
+	    numtype |= IS_NUMBER_NOT_INT;
             while (isDIGIT(*s))  /* optional digits after the radix */
                 s++;
         }
@@ -1992,7 +2370,7 @@
 #endif
 	    ) {
         s++;
-	numtype |= IS_NUMBER_TO_INT_BY_ATOL | IS_NUMBER_NOT_IV;
+	numtype |= IS_NUMBER_TO_INT_BY_ATOL | IS_NUMBER_NOT_INT;
         /* no digits before the radix means we need digits after it */
         if (isDIGIT(*s)) {
 	    do {
@@ -2010,6 +2388,7 @@
 	    s++; if (*s != 'I' && *s != 'i') return 0;
 	    s++; if (*s != 'T' && *s != 't') return 0;
 	    s++; if (*s != 'Y' && *s != 'y') return 0;
+	    s++;
 	}
 	sawinf = 1;
     }
@@ -2017,12 +2396,13 @@
         return 0;
 
     if (sawinf)
-	numtype = IS_NUMBER_INFINITY;
+	numtype = (numtype & IS_NUMBER_NEG) /* Keep track of sign  */
+	  | IS_NUMBER_INFINITY | IS_NUMBER_NOT_INT;
     else {
 	/* we can have an optional exponent part */
 	if (*s == 'e' || *s == 'E') {
-	    numtype &= ~IS_NUMBER_NEG;
-	    numtype |= IS_NUMBER_TO_INT_BY_ATOF | IS_NUMBER_NOT_IV;
+	    numtype &= IS_NUMBER_NEG;
+	    numtype |= IS_NUMBER_TO_INT_BY_ATOF | IS_NUMBER_NOT_INT;
 	    s++;
 	    if (*s == '+' || *s == '-')
 		s++;
@@ -2216,11 +2596,33 @@
 	    return "";
 	}
     }
-    if (SvNOKp(sv)) {			/* See note in sv_2uv() */
-	/* XXXX 64-bit?  IV may have better precision... */
-	/* I tried changing this to be 64-bit-aware and
-	 * the t/op/numconvert.t became very, very, angry.
-	 * --jhi Sep 1999 */
+    if (SvIOK(sv) || (SvIOKp(sv) && !SvNOKp(sv))) {
+	/* I'm assuming that if both IV and NV are equally valid then
+	   converting the IV is going to be more efficient */
+	U32 isIOK = SvIOK(sv);
+	U32 isUIOK = SvIsUV(sv);
+	char buf[TYPE_CHARS(UV)];
+	char *ebuf, *ptr;
+
+	if (SvTYPE(sv) < SVt_PVIV)
+	    sv_upgrade(sv, SVt_PVIV);
+	if (isUIOK)
+	    ptr = uiv_2buf(buf, 0, SvUVX(sv), 1, &ebuf);
+	else
+	    ptr = uiv_2buf(buf, SvIVX(sv), 0, 0, &ebuf);
+	SvGROW(sv, ebuf - ptr + 1);	/* inlined from sv_setpvn */
+	Move(ptr,SvPVX(sv),ebuf - ptr,char);
+	SvCUR_set(sv, ebuf - ptr);
+	s = SvEND(sv);
+	*s = '\0';
+	if (isIOK)
+	    SvIOK_on(sv);
+	else
+	    SvIOKp_on(sv);
+	if (isUIOK)
+	    SvIsUV_on(sv);
+    }
+    else if (SvNOKp(sv)) {
 	if (SvTYPE(sv) < SVt_PVNV)
 	    sv_upgrade(sv, SVt_PVNV);
 	/* The +20 is pure guesswork.  Configure test needed. --jhi */
@@ -2245,31 +2647,6 @@
 	if (s[-1] == '.')
 	    *--s = '\0';
 #endif
-    }
-    else if (SvIOKp(sv)) {
-	U32 isIOK = SvIOK(sv);
-	U32 isUIOK = SvIsUV(sv);
-	char buf[TYPE_CHARS(UV)];
-	char *ebuf, *ptr;
-
-	if (SvTYPE(sv) < SVt_PVIV)
-	    sv_upgrade(sv, SVt_PVIV);
-	if (isUIOK)
-	    ptr = uiv_2buf(buf, 0, SvUVX(sv), 1, &ebuf);
-	else
-	    ptr = uiv_2buf(buf, SvIVX(sv), 0, 0, &ebuf);
-	SvGROW(sv, ebuf - ptr + 1);	/* inlined from sv_setpvn */
-	Move(ptr,SvPVX(sv),ebuf - ptr,char);
-	SvCUR_set(sv, ebuf - ptr);
-	s = SvEND(sv);
-	*s = '\0';
-	if (isIOK)
-	    SvIOK_on(sv);
-	else
-	    SvIOKp_on(sv);
-	if (isUIOK)
-	    SvIsUV_on(sv);
-	SvPOK_on(sv);
     }
     else {
 	dTHR;
--- ../bleadperl/pp.h	Sat Sep  2 17:52:58 2000
+++ pp.h	Sat Nov 25 10:32:33 2000
@@ -126,6 +126,7 @@
 #endif
 
 #define TOPs		(*sp)
+#define TOPm1s		(*(sp-1))
 #define TOPp		(SvPV(TOPs, PL_na))		/* deprecated */
 #define TOPpx		(SvPV(TOPs, n_a))
 #define TOPn		(SvNV(TOPs))
@@ -137,6 +138,9 @@
 #define TOPq		((Quad_t)SvIV(TOPs))
 #define TOPuq		((Uquad_t)SvUV(TOPs))
 #endif
+
+/* Try to preserv IVness/UVness in basic arith ops. */
+#define PRESERVE_IVUV
 
 /* Go to some pains in the rare event that we must extend the stack. */
 
--- embed.pl.orig	Thu Nov 23 17:20:22 2000
+++ embed.pl	Sun Dec  3 15:22:54 2000
@@ -2430,6 +2430,10 @@
 #if defined(PERL_IN_SV_C) || defined(PERL_DECL_PROT)
 s	|IV	|asIV		|SV* sv
 s	|UV	|asUV		|SV* sv
+#  if !defined(NV_PRESERVES_UV)
+s	|int	|sv_2iuv_non_preserve		|SV* sv|I32 numtype
+s	|int	|sv_2inuv_non_preserve		|SV* sv|I32 numtype
+#  endif
 s	|SV*	|more_sv
 s	|void	|more_xiv
 s	|void	|more_xnv
--- sv.h.orig	Fri Oct 13 15:50:58 2000
+++ sv.h	Sat Nov 25 10:30:23 2000
@@ -448,6 +448,9 @@
 =for apidoc Am|void|SvIOK_UV|SV* sv
 Returns a boolean indicating whether the SV contains an unsigned integer.
 
+=for apidoc Am|void|SvUOK|SV* sv
+Returns a boolean indicating whether the SV contains an unsigned integer.
+
 =for apidoc Am|void|SvIOK_notUV|SV* sv
 Returns a boolean indicating whether the SV contains an signed integer.
 
@@ -562,6 +565,7 @@
 
 #define SvIOK_UV(sv)		((SvFLAGS(sv) & (SVf_IOK|SVf_IVisUV))	\
 				 == (SVf_IOK|SVf_IVisUV))
+#define SvUOK(sv)		SvIOK_UV(sv)
 #define SvIOK_notUV(sv)		((SvFLAGS(sv) & (SVf_IOK|SVf_IVisUV))	\
 				 == SVf_IOK)
 
--- ../bleadperl/t/lib/peek.t	Wed Aug 30 01:30:17 2000
+++ t/lib/peek.t	Sun Dec  3 19:28:32 2000
@@ -88,10 +88,10 @@
 
 do_test( 6,
         $c + $d,
-'SV = NV\\($ADDR\\) at $ADDR
+'SV = IV\\($ADDR\\) at $ADDR
   REFCNT = 1
-  FLAGS = \\(PADTMP,NOK,pNOK\\)
-  NV = 456');
+  FLAGS = \\(PADTMP,IOK,pIOK\\)
+  IV = 456');
 
 ($d = "789") += 0.1;
 
@@ -154,12 +154,10 @@
       FLAGS = \\(IOK,pIOK\\)
       IV = 123
     Elt No. 1
-    SV = PVNV\\($ADDR\\) at $ADDR
+    SV = IV\\($ADDR\\) at $ADDR
       REFCNT = 1
-      FLAGS = \\(IOK,NOK,pIOK,pNOK\\)
-      IV = 456
-      NV = 456
-      PV = 0');
+      FLAGS = \\(IOK,pIOK\\)
+      IV = 456');
 
 do_test(12,
        {$b=>$c},
@@ -180,12 +178,10 @@
     RITER = -1
     EITER = 0x0
     Elt "123" HASH = $ADDR
-    SV = PVNV\\($ADDR\\) at $ADDR
+    SV = IV\\($ADDR\\) at $ADDR
       REFCNT = 1
-      FLAGS = \\(IOK,NOK,pIOK,pNOK\\)
-      IV = 456
-      NV = 456
-      PV = 0');
+      FLAGS = \\(IOK,pIOK\\)
+      IV = 456');
 
 do_test(13,
         sub(){@_},

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About