perl.perl5.porters | Postings from February 2001

Re: IV preservation (was Re: [PATCH 5.7.0] compiling on OS/2)

From: Nicholas Clark
Date: February 15, 2001 14:56
Subject: Re: IV preservation (was Re: [PATCH 5.7.0] compiling on OS/2)
Message ID: 20010215225456.A601@Bagpuss.unfortu.net
On Wed, Feb 14, 2001 at 03:07:29PM -0500, Ilya Zakharevich wrote:
> On Fri, Jan 05, 2001 at 03:50:07PM +0000, Nicholas Clark wrote:
> > On Sun, Dec 31, 2000 at 12:19:24AM -0500, Ilya Zakharevich wrote:
> > > This patch does not fix new bugs introduced by IV-preservation
> > > madness.  Also, the second chunk may cause problems for other
> > 
> > Why is it mad? I don't remember reading anything here from you saying why
> > you thought it was a bad idea or a bad implementation
> 
> Sorry that it took me so long.  Here are chunks of my discussion with
> Jarkko (with minor corrections):
>
> ==================================================================
> I checked one line of the code only - one where the segfault is
> happening on OS/2 (this segfault might be attributed to video drivers,
> so might be not Perl fault).  This one line indicates that when NV is
> used in the IV/UV context, the converted value is not cached.
> 
> This line alone makes the approach deserving rejection.

I don't agree that it makes the *approach* deserving of rejection.
The implementation, yes (in that the implementation needs to be perfect).
It was not intended that the NV value should never be cached when the NV
value was calculated from the string. The intent is the same approach as
is used in 5.6.0. In sv_2iv in 5.6.0:

	else if (numtype) {
	    /* The NV may be reconstructed from IV - safe to cache IV,
	       which may be calculated by atol(). */
	    if (SvTYPE(sv) == SVt_PV)
		sv_upgrade(sv, SVt_PVIV);
	    (void)SvIOK_on(sv);
	    SvIVX(sv) = Atol(SvPVX(sv));
	}


If looks_like_number determines that the string is a valid integer, then
atol is called, and no NV is cached here.

It's not obvious to me from your description where the wrong line is.
I'm quite likely to agree with you that it's stupid when I find it -
could you mail the offending section of code to the list please?

> ==================================================================
> The *only* case when this adds some new functionality is on the
> platforms with 64-bit floats and 64-bit integers.  These platforms are a
> tiny minority, and have the possibility to use long floats instead.  So
> the patch is not *required*.

True, it is not required, even on most 64 bit platforms.
But without it perl is assuming that floating point can fully preserve
integers. So to maintain numeric sanity while using 64 bit integers one
has to use long doubles.
How many platforms have a full set of trigonometric functions at long
double precision? ANSI C89 doesn't supply them. It doesn't even supply
square root at anything other than double.

Hence, is it a sound idea to have most floating point arithmetic work at
one precision, but drop back to a lesser precision for certain functions?

I'm also unconvinced that long doubles are that well debugged.
Abigail appears to be having problems with a platform that can't round a
long double correctly to the nearest integer, and there is another platform
producing printf output inconsistent with any other platform's.

64 bit integers are becoming more desirable, as file sizes increase - 
I can't see anyone trusting the reliability of positions or offsets in
files over 2Gb when stored in a floating point value.

Long doubles would increase the memory taken by scalars even more. To use
64 bit integers with the 5.6 semantics one would need to switch from
doubles to long doubles to preserve numeric sanity, so one is looking at
8 or 12 bytes more per scalar, not just the 4 more for the 64 bit integer.
All this extra data to shuffle would make perl slower.

> ==================================================================
> The only non-functionality advantage this approach adds is the memory
> footprint one.  Again, when compensated by unavoidable slowdowns due
> to switching off caching, this should be negligible.

I believe that the no-caching you refer to is a mistake of the implementation.

> [And do not try to push "make test" like benchmarking at me.  On

I agree it isn't. I keep moaning about what should be used instead
(and then I quote make test times myself, which is hypocritical).

> contemporary hardware the most (significant part of the?) time spent

From what I remember, op/numconvert.t went faster with the patch.
op/numconvert has no sleep and no fork, unlike most of the test suite.

> in "make test" is inside the sleep() calls!  Remember this slogan:
> "Buy Soviet watches, the quickest-running watches in the world!"?]

:-)

> ==================================================================
> Many benchmarks failed to discover any measurable speed advantage of
> `use integer'.  Thus this approach will also not have a speed advantage
> even in the cases when effects of no-caching do not enter the picture.

I believe that the no-caching you refer to is a mistake of the implementation.

> ==================================================================
> The handling of floating point operations is not to be put into hands
> of unwashed masses.  I would not like anyone with less trust than
> what Kagan has to touch such topics (definitely not me!).  Witness all
> the goofs in FP operation of processors - even when designed by people
> who understand all the hairy issues of FP operations.
> 
> This patch wants to replace some operations done by the FPU by
> operations in software.  First of all, moving operations from hardware
> to software is at least a questionable practice.  Second, the

I believe that you misrepresent what this patch wants to do.
Or I misunderstand what you say.

It wants to move *integer* operations from the FPU to "software" (read:
integer hardware). All other operations remain in hardware.
I believe that this means that the only floating point operation involved
is determining if a floating point value is actually exactly an integer.
Why is this a problem?

[if this seems like a stupid question, I've not seen anyone else on the
list say anything that would give away *why* it's a stupid question.
Which would make me think that no-one else active is aware of any
reasons why it would be a problem]

> possibilities of bugs introduced by uneducated patching are limitless
> (I do not know, maybe Nicholas has the FPU-design qualification?
> Even in this case, expect very hard-to-find bugs).

No, I do not. My background is in Geophysics. This does not make me
qualified for such a task. This may scare you more.

> ==================================================================
> This patch greatly blows up the size of the C code to support
> arithmetic operations in Perl.  My estimate would be 1000x harder
> maintenance of these opcodes.  Though only a small part of the
> total Perl core, such a significant change cannot go unnoticed.

I agree. The estimate of maintenance is tricky. I would expect that most
opcodes (except pp_modulo and pp_divide) have been untouched since they
were written.
pp_modulo is the only necessarily complicated operation (and hence may
harbour bugs).
pp_divide appears to have been the only one with any serious "maintenance",
in as much as it has a workaround for sloppy division on one platform.

> ==================================================================
> These were my feelings when I saw this announcement on the digest
> (plus what I get by trying to find a bug in 10min).  I had no doubt
> that given so many reasonable people on p5p (including those who work
> in the processor design!) this patch would be rejected.  Had I known, I
> would try to find time to voice my opinion earlier.

perl5-porters is quieter than it used to be. People like yourself, Tim Bunce,
Graham Barr and Tom Christiansen are much less active than I remember
you all being.

I may get this wrong (it is from memory), but I believe that the majority
of those still active now who were involved in the discussion of your
Sane numconvert'ing patch to 5.005_56 expressed, at that time, a desire to
have sane numbers using 64 bit IVs. Whilst this could be achieved within
the structure of the 5.6 code, I am under the impression that they wanted
64 bit IVs without moving NVs to long doubles.
[And I believe that at that time being able to move NVs to long doubles
was work in progress]

I think you sometimes underestimate the capabilities/knowledge of most
other people on the list (however "reasonable" they may be).
I believe that most people here just don't have the specific knowledge that
you have, and hence aren't able to foresee the problems which are obvious
to you.

I see your patches for the regular expression engine ending with something
similar to "XXX is left as an exercise to the reader" and I think
"help, where would I start?". I have no idea how the innards of the regular
expression engine (or the tokeniser or lexer, for that matter) work, despite
having looked at the perl source code for 5 or so years.
[probably because they are the parts most isolated from the operating system,
and hence least likely to cause problems in porting, and because they are
by necessity the least modular, the most intertwined]

So if perl hackers of several years can't follow the nuances of your
reasoning, what hope for the newly started?

[There is a danger that the above may be misinterpreted. I don't mean this
as criticism. I believe that your explanations *are* pitched at the correct
level. I just wanted to explain why I suspect that most people on the list
haven't got to that level yet.]

> ==================================================================
> On the scale of disasters in the Perl design, IMO this is less than
> the qu// horror, but on par with forcing h2xs to produce a

From the documentation of qu// in perlop, I fail to see why it's anywhere
near as horrible as changes to the numeric operators. I would have thought
that qu// was less bad, not the other way round.

> Hope this helps,

Yes. It does. It does answer my question "why is it mad?"
Thanks.
However, it causes me to ask more questions, and it doesn't (yet) convince
me that the approach (not the implementation) has more disadvantages than
advantages.

Nicholas Clark


