develooper Front page | perl.perl5.porters | Postings from August 2021

Re: Robin Hood Hashing for the perl core

Thread Previous | Thread Next
From:
demerphq
Date:
August 6, 2021 09:35
Subject:
Re: Robin Hood Hashing for the perl core
Message ID:
CANgJU+W37K-p0zwGKsmSzjuwC_hqSBSVxg_9Tr68YgGfcoJz9A@mail.gmail.com
On Fri, 6 Aug 2021 at 10:07, Nicholas Clark <nick@ccl4.org> wrote:

> On Fri, Aug 06, 2021 at 12:41:56AM -0700, Darren Duncan wrote:
> > On 2021-08-06 12:20 a.m., Martijn Lievaart wrote:
> > >
> > > Op 06-08-2021 om 03:13 schreef Darren Duncan:
> > > > On 2021-08-04 1:01 a.m., Smylers wrote:
> > > > > Let's debate a specific proposal on its own merits, and not
> discard it,
> > > > > or derail the discussion, in the hope of something bigger and
> vaguer
> > > > > that possibly could happen in the future.
> > > >
> > > > I fully agree.
> > > >
> > > > As the saying goes, the perfect is the enemy of the good.
>
> As I wrote in my orginal message:
>
>     We can in turn replace this with something even better once code for
> that
>     exists. But until then, don't let the perfect be the enemy of the good.
>
> +1


> > > > The current discussion should just focus on whether the current
> > > > proposed new solution is significantly better than what Perl has now
> > > > and thus whether Perl switching to it would be an improvement.
> > > >
> > > > Consideration of other algorithms can be their own separate
> > > > discussions if or when someone comes up with a specific example and
> > > > makes a case it is better than the winner of the current discussion,
> > > > including that someone has already committed to do the work etc.
>
> Yes, exactly that.
>
> > > Yes, but a question that may be asked is, can this be generalized to
> > > support different algorithms easily, so programmers can choose the
> > > hashing algorithm at runtime. And if it can, is it worth to take such a
> > > broadening of the scope on at the same time as handling the current
> > > proposal. If it is easy to implement, it should be looked at.
> > >
> > > I suspect the answer to this is a firm no, but it never hurts to look
> at
> > > this. Anyone who knows the internals who can say something about that?
>
> 1) I am not confident that it's easy to implement
> 2) *I* am not in a position to attempt it
> 3) I'm not in a position to mentor someone else who might attempt it
> 4) I doubt I'd even be in a position to review the code if someone did
>    attempt it.
>    (I estimate that it even if someone else started now, it would take them
>     a couple of months at least, and I by then I won't have as much free
> time
>     as I currently have)
>
> I'm not stopping anyone doing this, but based on previous experience of
> how often large good quality surprise contributions arrive, I doubt that
> it is going to happen.
>
>
I dont see that we need it either. I can understand why someone might want
to swap out the hash function, but not the hash table.


> > I would say that at this stage the best argument for having that feature
> of
> > hashing algorithm choice is:
> >
> > 1. If it turns out from measuring that some common use cases perform way
> > better with the current algorithm and others perform way better with the
> new
> > one, and users would then benefit from being able to choose to tune their
> > scenarios.
> >
> > 2. Also implementing the ability to choose isn't too much effort or has
> too
> > poor trade-offs of is own.
> >
> > 3. If going forward the added maintenance cost of two algorithms rather
> than
> > one is worth it.
>
> That's a good summary.
>
> We have finite programmer time.
>
> (eg - I think all but two messages in this thread so far have been from
> people who can consider designs and trade offs, but can't review the C
> code)
>
> If we want to maintain two (or three) effectively we'd be stealing from
> one to feed the other(s).
>
>
As the sereal patch shows it also cascades into CPAN.


> Much like the previous PSC opinion on sort, I'm going to extrapolate and
> suggest that it's only better to have 2+, if it's clear that a significant
> amount of the user base are actually going to actively *need* to use
> different hashing algorithms.
>
> But if all we really end up with is an "optional feature" that rarely
> anyone uses (because it costs them more to learn about it and evaluate it
> than they gain by choosing it), then we've wasted effort that could have
> been spent better elsewhere.
>
>
> So if someone else steps up an implements another hash, great.
>

From the POV of Sereal and CPAN I would not say "great" unless it truly
could be swapped out without Sereal breaking.

cheers,
Yves

Thread Previous | Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About