Front page | perl.perl5.porters |
Postings from October 2016
Re: hv.h hek definition
From: demerphq
Date: October 24, 2016 15:45
Subject: Re: hv.h hek definition
Message ID: CANgJU+W-BukqwUcsMQ455P8yhiy9r47jNF4Fhko-1sr1MW=u6g@mail.gmail.com
On 24 October 2016 at 13:46, Dave Mitchell <davem@iabyn.com> wrote:
> On Sun, Oct 23, 2016 at 10:23:03AM +0200, demerphq wrote:
>> IMO whether this patch should be merged reduces down to answering the
>> following questions:
>>
>> 1. is an unaligned memeq() slower than an aligned one?
>> 2. what *real* effect would come from making the HEK structure 4 bytes larger?
>>
>> If the answer to 1 is no, then we can leave hek_flags as a char.
>>
>> If the answer to 2 is "no significant difference" then we can make
>> hek_flags I32 and make sure that hek_key is aligned thus making the
>> answer to 1 irrelevant.
>>
>> So the only circumstances in which we should not go with your patch
>> are when unaligned memeq is slow and using an I32 hek_flags would
>> grossly inflate our memory requirements.
>>
>> I suspect that the answer to one of the two questions is favourable to us.
>
> If alignment for a string compare is an issue, the string may need
> aligning on an 8 byte boundary.
>
> My quick playing on an x86_64 system shows no difference in hash lookup
> speeds with a dummy misalignment byte:
>
> struct hek {
>     U32  hek_hash;   /* hash of key */
>     I32  hek_len;    /* length of hash key */
> +   char hek_pad;    /* XXX make string unaligned */
>     char hek_key[1]; /* variable-length hash key */
>
> I ran the following code:
>
> my %h;
> @h{aa..zz} = ();
> my $x;
> $x = $h{aasnstnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn} for 1..10_000_000;
>
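For reference, the layout effect of the dummy byte quoted above can be checked with offsetof(). This is only a sketch: the struct names hek_plain and hek_padded and the helper functions are made up for illustration, and U32/I32 are assumed to be exactly 32 bits wide, which is not how perl's headers define them on every platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for perl's U32/I32; assumed to be exactly 32 bits here. */
typedef uint32_t U32;
typedef int32_t  I32;

/* The quoted struct without the experimental pad byte (hypothetical name). */
struct hek_plain {
    U32  hek_hash;   /* hash of key */
    I32  hek_len;    /* length of hash key */
    char hek_key[1]; /* variable-length hash key */
};

/* The quoted struct with the dummy pad byte added (hypothetical name). */
struct hek_padded {
    U32  hek_hash;
    I32  hek_len;
    char hek_pad;    /* pushes the key one byte off its natural boundary */
    char hek_key[1];
};

/* How far an offset sits past the previous multiple of `align`
 * (0 means the offset is aligned). */
static size_t misalignment(size_t offset, size_t align)
{
    assert(align != 0);
    return offset % align;
}

static size_t plain_key_offset(void)
{
    return offsetof(struct hek_plain, hek_key);
}

static size_t padded_key_offset(void)
{
    return offsetof(struct hek_padded, hek_key);
}
```

Under those assumptions the key starts at offset 8 in the plain struct and at offset 9 with the pad byte. The offsets are relative to the start of the struct, so the key's real address alignment also depends on how the struct itself was allocated.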
> However, this may well be different on RISC systems which have strict
> alignment constraints (ARM??).
>
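To try question 1 on another platform, such as the strict-alignment systems mentioned above, a rough microbenchmark along these lines might be a starting point. It is a sketch only: time_compare() is a hypothetical helper, plain memcmp() stands in for perl's memEQ(), and an optimizing compiler may hoist or inline the comparison, so any numbers it produces should be treated with suspicion.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Times `iters` memcmp() calls against a copy of `key` stored `offset`
 * bytes past a malloc()-aligned address. Returns elapsed CPU seconds,
 * or -1.0 if allocation fails or a comparison unexpectedly differs. */
static double time_compare(const char *key, size_t len, size_t offset, long iters)
{
    char *buf = malloc(len + offset + 1);
    if (buf == NULL)
        return -1.0;
    memcpy(buf + offset, key, len);

    /* volatile so the loop's result cannot simply be discarded */
    volatile long sink = 0;
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink = sink + (memcmp(buf + offset, key, len) == 0);
    clock_t end = clock();

    free(buf);
    if (sink != iters)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling it once with offset 0 and once with offset 1 for the same key and iteration count gives the aligned and misaligned cases to compare.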
> Note that in the COW implementation, FC put the COW refcount at the end of
> the string for similar alignment reasons.
>
> I've had a vague idea at the back of my mind for a while that it might be
> worth pre-allocating PVX buffers with N leading bytes (and SvOOK set),
> where those N bytes are used to store the COW ref count. They could also
> then be used as part of the HEK structure too. The size of N would depend
> on the platform and alignment constraints, but might be 4 or 8. This
> would add an extra memory overhead for strings, but in practice malloc()
> libraries over-allocate anyway (for example on my Linux system,
> malloc/realloc returns (IIRC) 24+16n sized blocks, so strings with length
> 0..16 would be unaffected, strings of length 17..24 would use an extra 16
> bytes, lengths 25..32 unaffected, and so on).
>
> But this is just a vague bit of handwaving.
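The block-size arithmetic in the quoted paragraph can be reproduced with a small model. This is a sketch under stated assumptions: blocks are taken to be exactly 24+16n bytes as recalled above (not a property of any particular malloc), the prefix is taken to be N = 8 bytes, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest block of the form 24 + 16*k (k >= 0) that can hold `request`
 * bytes. This models the "24+16n" sizes recalled in the quoted text. */
static size_t model_block_size(size_t request)
{
    if (request <= 24)
        return 24;
    /* round the excess over 24 up to the next multiple of 16 */
    return 24 + 16 * ((request - 24 + 15) / 16);
}

/* Extra bytes actually consumed when `prefix` leading bytes are added
 * in front of a string buffer of `len` bytes. */
static size_t prefix_overhead(size_t len, size_t prefix)
{
    return model_block_size(len + prefix) - model_block_size(len);
}
```

With an 8-byte prefix this model gives no extra cost for lengths 0..16, 16 extra bytes for lengths 17..24, and no extra cost again for lengths 25..32, which matches the figures quoted above.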
Well, I have pushed the patch. I don't think it should matter, and if we
find it does then we can tweak it further, or revert if necessary.
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"