develooper Front page | perl.perl5.porters | Postings from July 2021

Re: Elevator pitch, deprecating $a/$b sort globals by using sub{}

Thread Previous | Thread Next
From:
Smylers
Date:
July 6, 2021 20:32
Subject:
Re: Elevator pitch, deprecating $a/$b sort globals by using sub{}
Message ID:
20210706203209.GA61371@stripey.com
Salvador Fandiño writes:

> Reducing key comparison to just numeric or string comparison is not
> flexible enough (or at least not user-friendly enough).

True, though other objects can override <=> or cmp, which provides the
opportunity for any custom sorting you want.

> For a start you want to support having multiple keys, every one with
> its own comparison operation and direction.

Good point. But it still seems a shame to have to specify each key
twice, once for $a and once for $b.

> BTW, regarding direction, sorting and reversing is not equivalent to
> sorting in the opposite direction.

Oh, I thought that Perl special-cased C<reverse sort> so that it *is*
equivalent to sorting in the opposite direction. Have you an example
where it isn't?

> So if you provide "numsort" you also need "descnumsort".

I explicitly suggesting using reverse to avoid a proliferation of
functions.

> Finally, check Sort::Key family of modules. They support almost
> everything related to key based sorting but at the expense of
> providing a myriad of functions in order to cover all the different
> key type and direction combinations.

Yeah, that's too many functions. But, looking specifically at the main
Sort::Key module: https://metacpan.org/pod/Sort::Key

• The various r*sort functions should be handle-able by reverse.
• In-place sorting is a different matter to that being discussed in this
  thread, so the *sort_inplace functions aren't relevant here.
• Numerically sorting specifically as signed or unsigned integers is a
  niche requirement that doesn't fit well with core Perl, which mostly
  just deals with numbers but nothing more specific, so the *i*sort and
  *u*sort functions are probably not relevant to this thread either.

That leaves:

• keysort, which is equivalent to what I called asort above
• nkeysort, equivalent to what I called nsort above
• nsort, which is just nkeysort without a block; given that Perl's
  existing sort function manages to have an optional block, a core
  nsort presumably could as well, meaning that nkeysort and nsort could
  be just one function

So that's two functions. Which isn't one, but nor is it the myriad that
Sort::Key provides, while still covering many of its scenarios. It may
be that that's *enough* common cases that it's still worth doing.

As you point out above, it doesn't cover the Sort::Key::Maker
functionality of sorting by, say, descending age and then alphabetical
names where age is equal. With core sort that currently involves
something like:

    sort { $b->age <=> $a->age || $a->name cmp $b->name } @list

Or with that module:

    use Sort::Key::Maker
        sort_year_groups => sub { $_->age, $_->name }, qw<int string>;
    sort_year_groups @list;

Providing that with a built-in sort function and not repeating $a and $b
would require something *equivalent to*:

    ksort [-n => sub { $_->age }, a => sub { $_->name }], @list;

or:

    ksort sub :numeric:desc { $_->age }, sub { $_->name }, \@list;

That is, a list of pairs of type/direction indicators and
keys, followed by the list of things to sort.

But there isn't really a syntax for that which fits in with any other
core function:

• It needs two separate lists: the spec list, then the list of items. I
  don't think there's any other core function that handles multiple
  lists.

• It needs (potentially) multiple blocks. I've put explicit sub-s in the
  example above. Core functions taking a block don't require the sub
  prefix, but only take one block, in a known place (and directly as an
  arg, not nested inside an array).

• It needs a mini-language or options syntax for specifying the
  comparison type and direction. There are core functions with something
  similar, but none that directly seem to apply here:

  » pack/unpack and (s)printf take a (non-optional) initial spec sting
    that's specific to the function; but a single string does it,
    without need for blocks.

  » open takes an optional mode string. That can itself have optional
    layers applied to it. open's API should probably not be copied by
    anything, ever.

  » seek's final argument is the WHENCE indicator, which to be specified
    symbolically needs importing from a separate module. Having to
    import constants from a module isn't a big advantage over using a
    function or function factory from a module.

  » //, s///, and tr/// have suffix option letters, which only work
    because of the specific quote-like syntax of those operators. split
    also uses some of these, but again only because the // syntax
    enables it. There isn't a place to put suffix options on most
    functions.

  Perl doesn't have a general ‘here are some options for this function’
  syntax which can be applied here.

So any new sort routine which avoids the need for $a and $b by only
specifying one of them (with $_) and still has full flexibility for
multiple keys would still need some new syntax which isn't (currently)
used by any other function.

It would be nice to have that. And, if it were sufficiently flexible, it
would avoid the need for separate asort and nsort functions; they'd just
be minimal cases of flexible-key-sort.

But it does make Nicholas's comment from elsewhere in this thread
relevant:

> ... to me it feels like we're now creating a *different* set of special
> case syntax for sort, to eliminate the current special-case syntax ($a, $b).
> What else might re-use this syntax?

Smylers



Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About