Re: Robin Hood Hashing for the perl core
From: Ed Avis
August 20, 2021 05:15
Message ID: DM4PR11MB52471B094E2AFC05C7DCB6079DFF9@DM4PR11MB5247.namprd11.prod.outlook.com
Could I mention one more topic related to hashing, which might also bear on the proposed new implementation? When a reference is used as a hash key, I believe it currently gets stringified to something like ARRAY(0xabc), and this string is then hashed. (Correct me if I'm wrong.)
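To make the behaviour concrete, here is a minimal sketch of what happens today with a plain hash. The variable names are only illustrative, and the exact address printed will differ from run to run.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $aref = [1, 2, 3];
my %h;
$h{$aref} = 'value';              # the reference is stringified on the way in

my ($key) = keys %h;
print "key: $key\n";              # something like ARRAY(0x...), a plain string

# The key is no longer a reference, so the array cannot be reached through it.
my $is_ref = ref($key) ? 1 : 0;
print "key is a reference: ", ($is_ref ? "yes" : "no"), "\n";

# For an unblessed array the string is just the type name plus the hex address.
my $expected = sprintf "ARRAY(0x%x)", refaddr($aref);
print "key matches address: ", ($key eq $expected ? "yes" : "no"), "\n";
```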
For the programmer this is a bit awkward, since you can't get back the original objects by iterating over the keys. You have to use a wrapper like Tie::RefHash, which is slower, and awkward for multidimensional data structures. (There is Tie::RefHash::Nestable for hashes of hashes, but even that doesn't cover a hash of arrays of hashes, and so on.)
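For comparison, a short sketch of the Tie::RefHash workaround as I understand it from its documentation. It gives the references back from keys, at the cost of going through the tie interface on every access.

```perl
use strict;
use warnings;
use Tie::RefHash;

tie my %h, 'Tie::RefHash';

my $aref = [1, 2, 3];
$h{$aref} = 'value';

my ($key) = keys %h;
my $type  = ref($key);            # 'ARRAY': the key is still a real reference
print "key type: $type\n";

# Because it is the original reference, the data is reachable through the key.
push @$key, 4;
my $count = scalar @$aref;
print "elements now: $count\n";
```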
But I think stringifying must also be slower. Surely if you have the reference you can treat it as an address (a 64-bit or 32-bit integer) and have a fast way to transform that to a position in the hash data structure. Even if the "address hashing" function ends up being fairly complex, it still has to be faster than converting the address to ASCII text and then hashing the string. Storing the string will also use more memory than storing a pointer plus a tag to say what type of object it is.
So I am wondering whether, if the hash internals get reworked for greater speed, the treatment of references can also change. Then using references as a hash key would be faster. Moreover, it could be fast to get back the original references rather than a stringified version -- perhaps with a new opcode or new builtin that is like 'keys' but doesn't stringify. Tie::RefHash could change to call that, for compatibility with existing programs, while new programs could use the builtin directly and be even quicker (no tying overhead).
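The semantics I have in mind can be emulated in pure Perl today, which may help show what such a builtin would return. This is only a sketch: ref_store, ref_fetch and ref_keys are hypothetical helper names, not anything that exists in Perl. It also does not deliver the speed benefit, since the numeric address is itself turned into a string key by the ordinary hash; it only illustrates keying on the address and getting the original reference back.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helpers: key on the address, keep the reference next to the value.
sub ref_store {
    my ($h, $ref, $val) = @_;
    $h->{ refaddr($ref) } = [ $ref, $val ];
    return;
}

sub ref_fetch {
    my ($h, $ref) = @_;
    my $entry = $h->{ refaddr($ref) };
    return $entry ? $entry->[1] : undef;
}

# Like 'keys', but hands back the original references, not strings.
sub ref_keys {
    my ($h) = @_;
    return map { $_->[0] } values %$h;
}

my %h;
my $point = { x => 1, y => 2 };
ref_store(\%h, $point, 'a point');

my ($k)     = ref_keys(\%h);
my $fetched = ref_fetch(\%h, $point);
print "x via key: $k->{x}\n";
print "value: $fetched\n";
```

Note that holding the reference inside the table keeps the referent alive for as long as the entry exists, which a real implementation would also have to decide about.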
I have asked before whether references could always be preserved in hash keys -- so "Tie::RefHash all of the time". There are some corner cases where this would change the semantics, such as adding to a hash both a string and a reference which stringifies to the same thing, even if you ignore the question of code that explicitly checks ref() on each hash key returned. So for now I'm not advocating this as a global change. However, I think if Perl could provide a fast builtin way to use references as hash keys and get them back unharmed, it would be very useful for new code. Is that something that can be fitted into the new hashing implementation?
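The string-versus-reference corner case can be shown in a few lines. With a plain hash the two keys collide; with Tie::RefHash, as far as I can tell from its implementation, references and strings are kept apart, so the same two stores give two entries.

```perl
use strict;
use warnings;
use Tie::RefHash;

my $aref = [];

# Plain hash: the reference and its stringified form are the same key.
my %plain;
$plain{$aref}   = 'from reference';
$plain{"$aref"} = 'from string';      # overwrites the entry above
my $plain_count = scalar keys %plain;
print "plain hash entries: $plain_count\n";

# Tie::RefHash: the reference and the look-alike string stay distinct.
tie my %refhash, 'Tie::RefHash';
$refhash{$aref}   = 'from reference';
$refhash{"$aref"} = 'from string';
my $refhash_count = scalar keys %refhash;
print "Tie::RefHash entries: $refhash_count\n";
```

Code that relies on the first behaviour would see different results if references were always preserved, which is why a global change is risky.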