
Re: hv.h hek definition

From: demerphq
Date: October 25, 2016 16:31
Subject: Re: hv.h hek definition
Message ID: CANgJU+UVWOnytKuN2t=OJ0dLFWFLcKuU5HJoi5XfxCBis46-pw@mail.gmail.com

On 25 October 2016 at 17:54, Todd Rinaldo <toddr@cpanel.net> wrote:
>
>> On Oct 24, 2016, at 10:45 AM, demerphq <demerphq@gmail.com> wrote:
>>
>>> ...
>>> However, this may well be different on RISC systems which have strict
>>> alignment constraints (ARM??).
>>>
>>> Note that in the COW implementation, FC put the COW refcount at the end of
>>> the string for similar alignment reasons.
>>>
>>> I've had a vague idea at the back of my mind for a while that it might be
>>> worth pre-allocating PVX buffers with N leading bytes (and SvOOK set),
>>> where those N bytes are used to store the COW ref count. They could also
>>> then be used as part of the HEK structure too. The size of N would depend
>>> on the platform and alignment constraints, but might be 4 or 8. This
>>> would add an extra memory overhead for strings, but in practice malloc()
>>> libraries over-allocate anyway (for example on my Linux system,
>>> malloc/realloc returns (IIRC) 24+16n sized blocks, so strings with length
>>> 0..16 would be unaffected, strings of length 17..24 would use an extra 16
>>> bytes, lengths 25..32 unaffected, and so on).
>>>
>>> But this is just a vague bit of handwaving.
>>
>> Well, I have pushed the patch. I don't think it should matter, and if we
>> find it does then we can tweak it further, or revert if necessary.
>>
>> Yves
>>
>
> Merged? great!
>
> So I was given the idea that, instead of adding a char hek_flags, you could do the following and not have to grow the struct. I suspect this would also fix the byte alignment concern?
>
> PERL_BITFIELD32 hek_len    : 30;
> PERL_BITFIELD32 hek_flags : 2;
>
> You would then need to use an SV for your HEK if your hek_len was > 2^30. It is my belief that it should be an extremely rare event and punishable if abused :)
>
> What I can't divine is what needs changing in hv_common to make it work, though from what I can tell it should be related to:
>
> Newx(k, HEK_BASESIZE + sizeof(const SV *), char);
>
> and
>
> HEf_SVKEY
>
> Would this solve the byte alignment issue? Any hints on what would need changing in hv.c?

Honestly, if we have an alignment issue I would solve it by making the
hek_flags an I32. I doubt it will make any practical difference to
memory footprint if we do.

But until we have data that suggests there is one, let's not worry about it.

Feel free to set up some benchmarks, probably in pure C, that test how
fast unaligned versus aligned memcmp operations are.
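A minimal sketch of such a benchmark might look like the following. The
function name time_memcmp and its parameters are made up for illustration;
it assumes that offsetting a malloc()ed pointer by a few bytes is a fair
stand-in for a misaligned key, and clock() is only a coarse timer, so
treat any numbers as rough.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Time `iters` memcmp calls of `len` bytes on two equal buffers whose
 * start addresses are shifted `misalign` bytes past malloc's alignment.
 * Returns elapsed CPU seconds, or -1.0 if allocation fails.
 * (Hypothetical helper, not part of perl.) */
double time_memcmp(size_t len, size_t misalign, unsigned long iters)
{
    char *a = malloc(len + misalign);
    char *b = malloc(len + misalign);
    if (!a || !b) {
        free(a);
        free(b);
        return -1.0;
    }
    memset(a, 'x', len + misalign);
    memset(b, 'x', len + misalign);

    /* Calling through a volatile function pointer stops the compiler
     * from hoisting or eliding the memcmp calls. */
    int (*volatile cmp)(const void *, const void *, size_t) = memcmp;
    unsigned long mismatches = 0;

    clock_t start = clock();
    for (unsigned long i = 0; i < iters; i++)
        mismatches += (cmp(a + misalign, b + misalign, len) != 0);
    clock_t end = clock();

    assert(mismatches == 0);   /* the buffers are identical by construction */
    free(a);
    free(b);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Comparing, say, time_memcmp(16, 0, n) against time_memcmp(16, 1, n) for a
large n and a range of key lengths should show whether alignment matters
on a given platform.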

FWIW, I only just noticed that Dave M's benchmark didn't actually
benchmark what we want to benchmark: the key he was looking for was
not in the hash, and it is unlikely there was a collision on the hash
value, so it is unlikely we ever executed any memcmps.

We only do the memcmp iff there is an item in the bucket, the hash
value is the same, and the length is the same.

IOW this code:

        if (HeHASH(entry) != hash)              /* strings can't be equal */
            continue;
        if (HeKLEN(entry) != (I32)klen)
            continue;
        if (memNE(HeKEY(entry),key,klen))       /* is this it? */
            continue;
        if ((HeKFLAGS(entry) ^ masked_flags) & HVhek_UTF8)
            continue;

So to test whether this has any effect you need to look up keys that
exist. Or, much better, just do it as a C program.
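To make the point concrete, here is a toy model of that filter chain that
counts how often memcmp actually runs. Everything in it (struct entry,
toy_hash, bucket_find, memcmp_calls_for) is hypothetical and much simpler
than the real HE/HEK code in hv.c; it only mirrors the order of the
checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a hash entry; not perl's real HE/HEK layout. */
struct entry {
    unsigned long hash;
    size_t        len;
    const char   *key;
    struct entry *next;
};

/* Toy hash function (djb2 style), an assumption for illustration only. */
unsigned long toy_hash(const char *key, size_t len)
{
    unsigned long h = 5381;
    for (size_t i = 0; i < len; i++)
        h = h * 33 + (unsigned char)key[i];
    return h;
}

/* Walk one bucket chain with the same filter order as the snippet above:
 * hash first, then length, and only then the byte comparison.
 * *memcmp_calls is incremented each time memcmp really runs. */
const struct entry *bucket_find(const struct entry *head, const char *key,
                                size_t klen, unsigned long *memcmp_calls)
{
    unsigned long hash = toy_hash(key, klen);
    for (const struct entry *e = head; e; e = e->next) {
        if (e->hash != hash)            /* strings can't be equal */
            continue;
        if (e->len != klen)
            continue;
        ++*memcmp_calls;
        if (memcmp(e->key, key, klen) != 0)
            continue;
        return e;
    }
    return NULL;
}

/* Number of memcmp calls made when looking up `key` in a one-entry
 * chain that holds only "perl". */
unsigned long memcmp_calls_for(const char *key)
{
    struct entry e = { toy_hash("perl", 4), 4, "perl", NULL };
    unsigned long calls = 0;
    (void)bucket_find(&e, key, strlen(key), &calls);
    return calls;
}
```

A lookup of the stored key reaches memcmp once; a lookup of an absent key
with a different hash never reaches it, which is why a miss-only
benchmark tells us nothing about memcmp alignment.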

BTW, the definition of memNE is as follows:

#ifdef HAS_MEMCMP
#  define memNE(s1,s2,l) (memcmp(s1,s2,l))
#  define memEQ(s1,s2,l) (!memcmp(s1,s2,l))
#else
#  define memNE(s1,s2,l) (bcmp(s1,s2,l))
#  define memEQ(s1,s2,l) (!bcmp(s1,s2,l))
#endif
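One consequence worth spelling out: with memcmp available, memNE yields
the raw memcmp result, so it is non-zero rather than strictly 1 when the
buffers differ. A small sketch, with the two macros copied from the
HAS_MEMCMP branch above and a hypothetical wrapper keys_differ that
normalises the result to 0 or 1:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copied from the HAS_MEMCMP branch quoted above. */
#define memNE(s1,s2,l) (memcmp(s1,s2,l))
#define memEQ(s1,s2,l) (!memcmp(s1,s2,l))

/* Hypothetical helper: returns 1 if the first `len` bytes differ,
 * 0 otherwise. memNE alone may return any non-zero value. */
int keys_differ(const char *a, const char *b, size_t len)
{
    return memNE(a, b, len) != 0;
}
```

So memNE is fine in a boolean context such as the if above, but its
value should not be compared against 1.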

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
