Front page | perl.perl5.porters |
Postings from September 2013
Re: unstable mergesort?
From: John P. Linderman
September 12, 2013 17:39
Re: unstable mergesort?
Message ID: CAC0cEp-80KTu=V2UEz0kz+SagpBUuxdZpeCzfOhoPbbm0CFr9w@mail.gmail.com
On Thu, Sep 12, 2013 at 10:01 AM, Nicholas Clark <email@example.com> wrote:
> On Wed, Sep 11, 2013 at 12:01:39PM -0400, John P. Linderman wrote:
> > This "does the right thing" for all a and b, whether they are equal or
> > not. But it occurs to me that we could "ignore stability" by doing
> > if (stable) sensestab = -1;
> > else sensestab = 0;
> > at the start of the sort, and then replace the "else sense = -1;" above
> > else sense = sensestab;
> > Our comparison logic would then treat a as low to equal b, regardless of
> > which came first. It looks like things will be a little less trivial in
> > the identification of initial runs phase, but probably not too bad.
> > which
> If I understand your description correctly, it would be possible to do this
> within the body of S_mergesortsv() and support functions, so that the same
> source code is used for both the stable and unstable versions?
That is, indeed, the hope, based on far too little time giving the code the
consideration it deserves.
> > no stable;
> I think you meant to type no sort "stable";
Right again. -1 on letting that jpl guy anywhere near the internals :-)
> Even if the consensus is in favour of switching the default to
> > (faster) instability, I think we should document what the default is.
> > Currently we have the situation that everyone gets a stable sort
> > (unless they ask for an unstable one) but the default might change out
> > from under you, so if you need stability, you have to ask for it
> > anyway. It would be annoying to replicate that problem in the other
> > direction.
> In the short term, if the incremental cost of stability is small, I don't
> think that it's worth giving up.
> For the bigger picture, we've been defaulting to stability for about a
> decade. In that decade, I'm not aware of anyone coming up with a radically
> faster algorithm that can't guarantee stability, so I don't think it likely
> that anyone will in the next ten years. Plus we can always turn an unstable
> sort stable by using an array of pointers, like we currently do for qsort,
> for a cost that is small.
> So I think we should document sort as being stable by default, and that
> we're not going to change this
has always been a bit wishy-washy about stability
The default algorithm is mergesort, which
will be stable even if you do not explicitly demand it.
But the stability of the default sort is a side-effect
that could change in later versions. If stability is
important, be sure to say so with a
use sort 'stable';
We could claim "told you so!" if stability stopped being the default, but,
realistically, that would break lots of code, and that's not our style. So
I agree that stability should remain the default. As noted in my
stuttering countdown sequence, permission to ignore stability can radically
improve performance. The question is, "at what cost?". I think it's going
to ugly up the phase that sets up the initial runs. That will (probably)
make that phase even harder to understand and somewhat slower, although it
shouldn't increase the number of element comparisons. Fortunately, that's
just a "one time" expense, on an initial pass through the input. I don't
foresee any slowdown in the merge phase, which typically involves multiple
passes over the data. But it's past time for me to quit speculating and