develooper Front page | perl.perl5.porters | Postings from April 2007

Re: Performance problems with Hash::Util::FieldHash

Jerry D. Hedden
April 19, 2007 13:03
First of all, let me say that I am not knocking HUF.  I
started this discussion because I think HUF is a good
idea, and wanted to use it.  I know that speed isn't
everything, but if it could be improved then everybody wins.
That being said...

Jerry D. Hedden wrote:
> While field hashes do a bit better than refaddr in get
> operations, stringification is the hands down winner!  Why?

I figured out why set operations are so slow.  They actually
do the equivalent of 2 refaddr calls - one for the ref being
used as a key and the other for the field hash which is then
stored in the object registry.

Jerry D. Hedden wrote:
> I tried to speed things up a bit by borrowing the trick from
> blead 28961, and replaced HUF_id with
>    SV* HUF_id(SV* ref, NV cookie) {
>        UV id = PTR2UV(SvRV(ref));
>        return sv_2mortal(newSVpvn((char *)&id, sizeof(id)));
>    }
> This only gave about a 10% improvement:

Anno Siegel replied:
> Yes, that's why I don't use this optimization in preference to
> printable keys that are compatible with refaddr()-generated
> ones.

I see where you're coming from for HUF_obj_id as that gets
stored in the field hash which is accessible with user code.

This could be made optional via a runtime flag.  For example:

    $Hash::Util::FieldHash::binary_keys = 1;

Regardless, HUF_field_id is only used inside the data
structures stored in the object registry.  There is no need
for that to be printable.
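
To make the printable-versus-binary distinction concrete, here is
a minimal standalone sketch in plain C, deliberately without the
Perl API.  The names key_printable and key_binary are made up for
illustration and are not part of HUF; they only show the two ways
an address can be turned into a hash key.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: printable key, i.e. the address in decimal
 * (the same kind of value refaddr() returns).
 * Returns the key length, or 0 if buf is too small. */
static size_t key_printable(const void *p, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%ju", (uintmax_t)(uintptr_t)p);
    return (n < 0 || (size_t)n >= buflen) ? 0 : (size_t)n;
}

/* Hypothetical helper: binary key, i.e. the raw bytes of the address.
 * Fixed length and no formatting step, but not human-readable, and
 * it may contain NUL bytes, so it must always travel with its length. */
static size_t key_binary(const void *p, char *buf, size_t buflen)
{
    uintptr_t id = (uintptr_t)p;
    assert(buflen >= sizeof id);
    memcpy(buf, &id, sizeof id);
    return sizeof id;
}
```

The binary form skips the number-to-string conversion entirely,
which is where the saving would come from; the cost is that such a
key is meaningless to user code that expects refaddr()-style output.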

In fact, there is no need to have HUF_field_id at all.
Since the object registry is not a field hash, you can just
use the field ref as the key and the value.

    void HUF_mark_field(SV* trigger, SV* field) {
        AV* cont = HUF_get_trigger_content(trigger);
        HV* field_tab = (HV*) *av_fetch(cont, 1, 0);
        SV* field_ref = newRV_inc(field);
-       SV* field_id = HUF_field_id(field_ref);
-       hv_store_ent(field_tab, field_id, field_ref, 0);
+       hv_store_ent(field_tab, field_ref, field_ref, 0);

(You'd do something similar in HUF_fix_trigger.)

This change speeds up set operations by 25%!

              Rate FieldHash   Refaddr Stringify
FieldHash  99211/s        --      -36%      -69%
Refaddr   156236/s       57%        --      -51%
Stringify 317021/s      220%      103%        --

(And the HUF_id hack I mentioned would add another 5%.)
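
For anyone reading the table: each cell is the row's rate relative
to the column's rate.  A minimal sketch of that arithmetic in plain
C follows; pct_vs and print_cell are hypothetical names, and this
is not the Benchmark module's actual code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: how much faster (positive) or slower
 * (negative) `row_rate` is than `col_rate`, in percent. */
static double pct_vs(double row_rate, double col_rate)
{
    assert(col_rate > 0.0);
    return (row_rate / col_rate - 1.0) * 100.0;
}

/* Print one cell rounded to a whole percent, as in the table. */
static void print_cell(double row_rate, double col_rate)
{
    printf("%.0f%%\n", pct_vs(row_rate, col_rate));
}
```

With the rates above, pct_vs(156236, 99211) comes out at roughly 57
and pct_vs(99211, 317021) at roughly -69, which matches the
Refaddr/FieldHash and FieldHash/Stringify cells.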

Anno Siegel replied:
> There is a speed hack that brings the performance of
> field hashes to better-than-refaddr, but not as fast as
> stringification (on my machine).  I'm not using it because
> it changes the behavior in a subtle way.  (Garbage
> collection could be bypassed if you use the refaddr of
> an object as a *string* hash key before first presenting
> the object as such.  That won't happen in the intended
> usage, but it's a weak point.)

That sounds interesting.  Again, perhaps it could be added
as an optional feature.  (Would you mind sending me the code
so I could marvel at it?  Thanks.)

Another idea I have would be to 'cache' the SV from
HUF_obj_id onto the object using PERL_MAGIC_ext magic, and
then just reuse it each time.  For the field id, you could
similarly attach the stringified ref as magic and reuse it,
especially in set ops where you could do hv_exists first
before resorting to hv_store.  And, of course, there may be
other ways to speed things up.
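
The caching idea, stripped of the Perl internals, looks roughly
like this.  It is a plain C sketch under my own assumptions:
struct object, obj_id and the format_calls counter are hypothetical
stand-ins, and the real thing would hang the cached SV off the
object with PERL_MAGIC_ext magic rather than keep it in a struct
field.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical object with room for a cached id string. */
struct object {
    char   id[32];   /* cached printable id */
    size_t id_len;   /* 0 means "not computed yet" */
};

/* Counts how often the id was actually formatted (for illustration). */
static unsigned format_calls;

/* Return the object's id, formatting it only on first use.
 * Returns NULL if the id does not fit the cache buffer. */
static const char *obj_id(struct object *o)
{
    assert(o != NULL);
    if (o->id_len == 0) {
        int n = snprintf(o->id, sizeof o->id, "%ju",
                         (uintmax_t)(uintptr_t)o);
        if (n < 0 || (size_t)n >= sizeof o->id)
            return NULL;
        o->id_len = (size_t)n;
        format_calls++;
    }
    return o->id;
}
```

Every get or set after the first then pays only for a length check
instead of a fresh address-to-string conversion.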

Also, I would suggest making Hash::Util::FieldHash a
dual-lived module, so that new releases can get out to users
via CPAN.

Again, I am enthused about HUF, and would like to contribute
to its continued development.  Thanks.
