develooper Front page | perl.beginners | Postings from August 2009

Inverting a hash safely

From:
Ed Avis
Date:
August 4, 2009 06:14
Subject:
Inverting a hash safely
Message ID:
loom.20090804T130353-606@post.gmane.org
Shawn H. Corey <shawnhcorey <at> gmail.com> writes:

>>>But then again I never have to invert a hash; when I populate it, I 
>>>would populate its inverse as well.

>>But in the particular case I was thinking of,
>>there was some (programmer-maintained, not user-maintained) configuration data
>>in a hash:
>> 
>>    my %lookup = (Frob => 55, Boo => 66, Grick => 67);
>> 
>>Of course it would be silly to write a %lookup_reverse hash by hand and then
>>worry about keeping the two consistent.  Better to write it once and invert
>>it.
>
>This assumes that programmers never make mistakes.  This source of
>input, like all input, should be validated before it is used.

I quite agree.  That's the original problem which prompted me to ask about
a function which checks the hash data when inverting it.  I had omitted to
put in that sanity check, which then bit me later on when I changed the hash
and didn't preserve the one-to-one property.

But I don't think the best way is to write:

    my %lookup = (Frob => 55, Boo => 66, Grick => 67);
    my %lookup_reverse = (55 => 'Frob', 66 => 'Boo', 67 => 'Grick');

Instead, I would write %lookup once and then calculate the inverse.  That's
all I'm saying.
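
A minimal sketch of the kind of checked inversion I have in mind follows.
The name invert_hash_checked is hypothetical (it is not from any module), and
the sketch assumes every value is a defined scalar that is usable as a hash key:

```perl
use strict;
use warnings;

# Hypothetical helper: invert a hash, dying if the mapping is not
# one-to-one (that is, if two keys share the same value).
sub invert_hash_checked {
    my %hash = @_;
    my %inverse;
    for my $key (sort keys %hash) {
        my $value = $hash{$key};
        die "Cannot invert: value '$value' is shared by keys "
          . "'$inverse{$value}' and '$key'\n"
            if exists $inverse{$value};
        $inverse{$value} = $key;
    }
    return %inverse;
}

my %lookup = (Frob => 55, Boo => 66, Grick => 67);
my %lookup_reverse = invert_hash_checked(%lookup);
print "$lookup_reverse{66}\n";
```

A plain 'reverse %lookup' would also build the inverse, but on a duplicate
value it silently keeps only one of the keys, which is exactly the failure
that bit me.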

>>Similarly, if you are making a subroutine that takes a hash as input, it is
>>inconvenient to require your caller to pass both the hash and its inverse.
>>Indeed, it creates all sorts of opportunities for bugs when somehow the hash
>>and inverse-hash you are passed aren't consistent with each other.

>The bugs show up because of the lack of validation.  Or are you thinking 
>that some software upstream causes bugs and you're responsible for 
>catching them?

I am saying that if you were writing a CPAN module, for example, and you had
a choice between two interfaces:

    my_function(HASH1, HASH2)
         Calculates the frobble factor of the data.  HASH2 should be the
         inverse of HASH1, that is, keys and values are interchanged.
         If the two hashes are not inverses of each other, then the behaviour
         is undefined.

versus

    my_function(HASH)
         Calculates the frobble factor of the data.

then I would prefer to provide my users with the simpler one, since it gives
less opportunity for things to go wrong.

>>Or if your routine gives a hash as output (such as many XML processing or
>>database modules), it would be seen as a bit weird to return both a hash and
>>an inverse-hash.

>Why on earth would you want to invert an XML file?

What I mean is that many modules (such as XML::Twig) return data as hashes.
To me, it doesn't make any sense to return both a hash and its inverse from
the function.  Instead I would return just one hash.

Sorry if it turns out I was arguing against a straw man.  I am sure you didn't
really mean to suggest that any code dealing with hashes should always keep
both the hash and its inverse under all circumstances.  So the discussion might
be getting a bit silly.

>As for inverting the return from a database, I would get the database to 
>do it.

Oh, absolutely.

>>So I don't think it is fair to say that inverting a hash is never needed,
>>although you can certainly minimize the need for it if you build your own
>>data structures carefully.

>To be specific, I said I never had to invert a hash since I build all 
>the data structures I need while validating the input.

OK.

-- 
Ed Avis <eda@waniasset.com>





