Front page | perl.perl5.porters |
Postings from July 2017
Re: destabilizing mergesort
From: Father Chrysostomos
July 23, 2017 21:20
Re: destabilizing mergesort
Message ID: 6D510413-5782-45A1-9DFF-C654E5DBE708@cpan.org
John P. Linderman wrote:
> I agree with the comments (back in 2013!) that said that a stable sort
> *must* be the default, as it has been since mergesort was integrated into
> pp_sort.c. This is complicated by the quirky communication between the sort
> pragma, via op.c, to the code in pp_sort.c. Simply put, the SORTf_STABLE
> bit that qsort and mergesort see has *not* been interrogated by the merge
> sort. And it is, by default (in sort.pm and op.c) *not* set: stability has
> been *implicit* when mergesort is invoked. So if I allow mergesort to be
> unstable, how do we communicate that? I see three possibilities, none of
> them very good.
> - Change sort.pm and op.c so that SORTf_STABLE *is* set by default. The
> mergesort code would start interrogating that bit to determine if stability
> was demanded. The downside is that it would make qsort stable by default.
> If qsort finds SORTf_STABLE *on*, it will be much less efficient on the
> kind of input where qsort might have a performance advantage over
> mergesort. And turning stability *off* for qsort would be tricky. The
> _qsort subpragma cannot turn the $sort::stable_bit hint off, because that
> would make "use sort qw(stable _qsort);" or "use sort qw(stable); use sort
> qw(_qsort);" ineffective. Mind you, a stabilized quicksort will still
> return legitimate results, so it cannot break existing code. But it could
> slow things down.
> - Add a new subpragma and sort bit for "unstable". mergesort would
> continue to ignore SORTf_STABLE, and instead test SORTf_UNSTABLE. qsort
> would continue to test (only) SORTf_STABLE, which would continue to be off
> by default. The downside is primarily the confusion of having two
> subpragmas for what seems like the same thing.
> - Maybe it's time to pull the plug on quicksort. It's a lot of code, and
> I suspect it is seldom invoked. "perldoc sort" has advised that subpragmas
> beginning with "_" may not persist beyond Perl 5.8, and we're long past
> that. We could "phase it out gently" using the first suggestion, with some
> suitable warning.
It has already been suggested in ticket #119635, but nobody could give a definitive answer. (No pumpking spoke up at the time.)
> This shouldn't be my choice alone. Maybe other porters have better
> suggestions. Meanwhile, I'll plod on with trying to make stability optional
> in mergesort, and worry about the details of expressing the option later.
I think it is as simply as leaving each algorithm to use its own default (stable for mergesort; unstable for qsort), but let ‘use/no sort 'stable'’ set it specifically for a given scope. So:
use sort 'stable'; # makes no difference, unless we are using qsort
no sort 'stable'; # make mergesort unstable
Internally we can have two flags, stable and unstable, but that implementation detail does not need to be exposed to the user.
The attached patch (untested) is one way this might be accomplished. Of course, it is up to the mergesort code to make use of the SORTf_WOBBLY flag.