perl.perl5.porters
From: 'Brian Ingerson'
September 30, 2002 15:17
Message ID: 20020930151654.C15254@ttul.org
On 30/09/02 21:19 +0100, Orton, Yves wrote:
> I'm confused how you reconcile this with your other post suggesting that MD5
> or some equivalent digest be used as a memory conservation. If it used
> MD5's then there would be no way to reconstruct the original representation
> from the key. And if the hash stored a pointer to a given object with the
> appropriate data structure and that structure changed after the MD5 was
> calculated the rendering would be incorrect. But if the entire serialized
> structure was stored you end up with serious memory issues as you already
> pointed out.
OK. Start from the beginning:
- Hash keys need to be immutable (readonly, unchanging).
- Python does this. You can't use an array as a hashkey, only a tuple
(which is immutable)
- Ruby says you can use a mutable value, but at the point at which you
use it, it will cache the hashing value and consider it immutable. If
you break that contract, you need to let Ruby know by calling the
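The Python rule mentioned above can be checked directly. This is a minimal sketch; the names `table` and `list_key_ok` are made up for the illustration.

```python
# Python rejects mutable containers as dictionary keys.
table = {}

try:
    table[[5, 3]] = "list key"   # lists are mutable, so they are unhashable
    list_key_ok = True
except TypeError:
    list_key_ok = False

table[(5, 3)] = "tuple key"      # tuples are immutable, so they are hashable

print(list_key_ok)               # False
print(table[(5, 3)])             # tuple key
```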
So here is my idea for how to do this in Perl:
- In the hashing opcodes, if an SV is used as a key and it has ROK,
then you call the internal HASH (storable->md5 or whatever) routine
and save this information in a special mapping of pointers->digests.
- Now, every time you use the ref as a hash key, it will look up the
same digest and use that to index the hash internally.
- This is similar to the fact that all of the strings used as hash
keys in a Perl interpreter are stored in one master internal hash.
- The nice thing semantically is that two "equal" but non-identical
structures will have the same digest, and so will hash the same.
So to answer your questions:
- We don't need to reconstruct anything. The hashed structures have
not gone away. They're still in memory. Right?
- The digest operation gets cached, so each data structure always maps
to a single digest.
- Using digests, we pay a *relatively* small memory cost.
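To put a number on the memory point: an MD5 digest is always 16 bytes, however large the serialized structure is. A small Python sketch, with `json.dumps` as an assumed stand-in serializer and ignoring the overhead of the pointer-to-digest mapping itself:

```python
import hashlib
import json

big = list(range(10000))                      # a large structure used as a key
serialized = json.dumps(big).encode("utf-8")  # assumed serializer, not Storable
digest = hashlib.md5(serialized).digest()

print(len(digest))                    # 16
print(len(serialized) > len(digest))  # True
```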
> Then there's the problem of how do you tell some of the more bizarre data
> structures apart?
> Would ['5','3'] be equivalent to [5,3] for example?
Well that's a matter of debate. Should Perl distinguish semantically between
integers and strings? I'm not going to touch that here.
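Whichever way that debate goes, the answer ends up encoded in the serializer that feeds the digest. A Python sketch of the two policies for flat lists only; `typed_digest` and `stringified_digest` are hypothetical names, and neither is claimed to match what Storable does:

```python
import hashlib
import json

def typed_digest(values):
    # Keeps the int/str distinction: 5 serializes as 5, "5" as "5" in quotes.
    return hashlib.md5(json.dumps(values).encode("utf-8")).hexdigest()

def stringified_digest(values):
    # Assumed alternative policy: compare every element by its string form.
    as_strings = [str(v) for v in values]
    return hashlib.md5(json.dumps(as_strings).encode("utf-8")).hexdigest()

print(typed_digest(["5", "3"]) == typed_digest([5, 3]))              # False
print(stringified_digest(["5", "3"]) == stringified_digest([5, 3]))  # True
```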
> What about aliases and complex self referential structures?
What's the problem? Storable can handle these. No? As long as we can produce
a canonical digest it should work fairly well.
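As a rough analogy in Python rather than Perl: a serializer copes with self-referential structures only if it tracks references it has already seen. The standard `json` module refuses a cycle, while `pickle` round-trips it. This says nothing about whether the resulting bytes are canonical, which is the harder requirement.

```python
import json
import pickle

node = {"name": "root"}
node["self"] = node                 # the structure refers to itself

try:
    json.dumps(node)                # json does not track seen references
    json_ok = True
except ValueError:
    json_ok = False

copy = pickle.loads(pickle.dumps(node))  # pickle memoizes seen objects

print(json_ok)                      # False
print(copy["self"] is copy)         # True
```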