
Fwd: [perl.git] branch yves/musings, created. v5.19.10-109-gad4fc4e

From: demerphq
Date: April 25, 2014 07:18
Subject: Fwd: [perl.git] branch yves/musings, created. v5.19.10-109-gad4fc4e
Message ID: CANgJU+Xdf=UcFhNLtxSpXkwxrpKPBE5XHngbFqddZLLBtz-RZg@mail.gmail.com

Hiya. This is the branch with my current musings and patches waiting
to be merged.

Some of them, mostly the ones with more descriptive messages, are IMO
candidates for immediate merge, but I realize that may not be
realistic.

The AES hash patch is an initial attempt at using the AMD/Intel
"aesenc" primitive to do hashing that is secure-ish, and at the same
time fast. Currently it probably isn't that fast for strings under 16
bytes, as I haven't figured out how to safely load fewer than 16 bytes
into the required registers. The Go guys have assembly code to
do it, and I haven't figured out how to translate the required bits to C
yet (or how to reuse the assembler) -- I am a bit out of my comfort
zone here and would welcome interest from others.
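
For readers unfamiliar with the short-string problem: a plain 16-byte load from a string shorter than 16 bytes reads past the end of the buffer. One simple (and not especially fast) workaround is to copy the bytes into a zeroed 16-byte scratch buffer first. The sketch below shows that approach; it is not the technique used by Go's assembly or by the patch, and the names pad_block16 and load_partial16 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_loadu_si128 */
#endif

/* Copy len (< 16) bytes of str into a zero-padded 16-byte block.
 * Only the len valid bytes of str are ever read. */
static inline void pad_block16(unsigned char out[16],
                               const unsigned char *str, size_t len)
{
    assert(len < 16);
    memset(out, 0, 16);
    memcpy(out, str, len);
}

#if defined(__SSE2__)
/* Load a short string into an SSE register via the scratch buffer.
 * The extra copy is the price paid for never touching memory past
 * str + len. */
static inline __m128i load_partial16(const unsigned char *str, size_t len)
{
    unsigned char tmp[16];
    pad_block16(tmp, str, len);
    return _mm_loadu_si128((const __m128i *)tmp);
}
#endif
```

The copy through a temporary is safe but adds work on exactly the short keys where speed matters most, which is presumably why a register-only approach would be preferable.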

The AES patch requires -Accflags="-msse2 -maes" in Configure, and the
code is not guarded against being built without those flags. (Meaning
that, as currently written, it is NOT a candidate for merging.)
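
A guard of the kind that is missing could look roughly like the sketch below. GCC and Clang define __SSE2__ and __AES__ when -msse2 and -maes are in effect, so the intrinsic path can be compiled only when they are present. Everything here is an illustrative assumption rather than the code in the branch: the names SKETCH_USE_AESHASH and sketch_hash are hypothetical, the mixing steps are a toy and not the patch's algorithm, and the seeded FNV-1a fallback merely stands in for whichever hash the build would otherwise use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Only take the intrinsic path when the compiler flags enable it. */
#if defined(__SSE2__) && defined(__AES__)
#  include <emmintrin.h>   /* SSE2 loads and XOR */
#  include <wmmintrin.h>   /* _mm_aesenc_si128 */
#  define SKETCH_USE_AESHASH 1
#else
#  define SKETCH_USE_AESHASH 0
#endif

/* Hypothetical entry point: 16-byte seed in, 32-bit hash out. */
static inline uint32_t sketch_hash(const unsigned char seed[16],
                                   const unsigned char *str, size_t len)
{
#if SKETCH_USE_AESHASH
    __m128i key   = _mm_loadu_si128((const __m128i *)seed);
    __m128i state = _mm_xor_si128(key, _mm_cvtsi32_si128((int)len));
    unsigned char tail[16] = {0};

    /* Full 16-byte blocks: XOR into the state, then one AES round. */
    for (; len >= 16; str += 16, len -= 16) {
        __m128i block = _mm_loadu_si128((const __m128i *)str);
        state = _mm_aesenc_si128(_mm_xor_si128(state, block), key);
    }
    /* Remaining 0..15 bytes go through a zero-padded scratch block. */
    memcpy(tail, str, len);
    state = _mm_xor_si128(state, _mm_loadu_si128((const __m128i *)tail));
    state = _mm_aesenc_si128(state, key);
    state = _mm_aesenc_si128(state, key);
    return (uint32_t)_mm_cvtsi128_si32(state);
#else
    /* Stand-in fallback: seeded FNV-1a with the length mixed in. */
    uint32_t h;
    memcpy(&h, seed, sizeof h);
    h ^= 2166136261u ^ (uint32_t)len;
    for (size_t i = 0; i < len; i++) {
        h ^= str[i];
        h *= 16777619u;
    }
    return h;
#endif
}
```

With a guard like this, a build without the flags still compiles and simply never sees the intrinsics.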

Here are some relevant links about AES intrinsics:

https://code.google.com/p/go/source/browse/src/pkg/runtime/asm_amd64.s
https://codereview.appspot.com/9123046/diff/11001/src/pkg/runtime/asm_amd64.s#newcode788src/pkg/runtime/asm_amd64.s:788
https://codereview.appspot.com/9123046/diff/11001/src/pkg/runtime/asm_386.s
http://tab.snarc.org/posts/technical/2012-04-12-aes-intrinsics.html
http://cessu.blogspot.nl/2008/11/hashing-with-sse2-revisited-or-my-hash.html
http://www.iacr.org/archive/fse2009/56650054/56650054.pdf
https://software.intel.com/en-us/articles/reducing-the-impact-of-misaligned-memory-accesses

Annoyingly, I can't find the one where the Go guys try to explain their
algorithm. OTOH, the code doesn't seem to match the description, so
maybe that isn't bad. :-)

MSDN seems to have the most complete online documentation on these intrinsics:

http://msdn.microsoft.com/en-us/library/x7s4chk3(v=vs.90).aspx

Anyway, I just wanted to get this stuff published somewhere so
interested parties could see.

Have a nice day!

cheers,
Yves

---------- Forwarded message ----------
From: Yves Orton <demerphq@gmail.com>
Date: 25 April 2014 09:06
Subject: [perl.git]  branch yves/musings, created. v5.19.10-109-gad4fc4e
To: perl5-changes@perl.org


In perl.git, the branch yves/musings has been created

<http://perl5.git.perl.org/perl.git/commitdiff/ad4fc4e6b43e014c42b35e33323554a18eea477a?hp=0000000000000000000000000000000000000000>

        at  ad4fc4e6b43e014c42b35e33323554a18eea477a (commit)

- Log -----------------------------------------------------------------
commit ad4fc4e6b43e014c42b35e33323554a18eea477a
Author: Yves Orton <demerphq@gmail.com>
Date:   Fri Apr 25 09:06:37 2014 +0200

    switch default to LOOKUP3 for now

    No good reason really.

M       hv_func.h

commit daaaa56e35f8d57612fc127872d343c9fb51d1e2
Author: Yves Orton <demerphq@gmail.com>
Date:   Fri Apr 25 09:01:07 2014 +0200

    Whitespace cleanup

M       hv_func.h

commit 0daf1e2792aa5a3ccc4c8554892498a9f6843078
Author: Yves Orton <demerphq@gmail.com>
Date:   Fri Apr 25 08:59:44 2014 +0200

    Add "aeshash" using SSE2 and AES intrinsics for Intel/AMD

    This is sort of based on Go's hash function. It is not as
    efficient for short strings as it could be, I haven't figured
    out how to do a fast load of less than 16 bytes yet.

M       hv_func.h

commit 310b6a8a4ee33a1841caccd591304fb8d9fc0ac3
Author: Yves Orton <demerphq@gmail.com>
Date:   Fri Apr 25 08:58:45 2014 +0200

    Make it possible to get stats on PL_strtab

M       ext/Hash-Util/Util.xs

commit 626d950c4f5ba35f282add53aa60cfe6976fffba
Author: Yves Orton <demerphq@gmail.com>
Date:   Sun Apr 20 12:52:34 2014 +0200

    hash tinkering

M       hv_func.h
A       hv_lookup3.h

commit 3fba0f000871f98f4b435029d80444904b3dbb73
Author: Yves Orton <demerphq@gmail.com>
Date:   Sun Apr 20 12:44:01 2014 +0200

    make it possible to redefine the "first rand" associated with a hash

    This doesn't change behavior currently, it is hard coded to make
    it possible, the main point is there is now one definition.

M       hv.c

commit c1cebebdf741fe06137f51448577671c543fcda4
Author: Yves Orton <demerphq@gmail.com>
Date:   Sat Apr 19 14:42:47 2014 +0200

    regcomp.c - cleanup the ahocorasick start class logic so it is more self-documenting

    The logic of setting up an AHO-CORASICK regex start class was not fully
    encapsulated in the make_trie_failtable() function, which itself was
    poorly named. Merged the code into make_trie_failtable() and renamed
    it to construct_ahocorasick_from_trie().

M       embed.fnc
M       embed.h
M       proto.h
M       regcomp.c

commit ab6e613d4e80e512d1c02d2b38f12b1ba6f1bc54
Author: Yves Orton <demerphq@gmail.com>
Date:   Sun Apr 13 15:19:21 2014 +0200

    Hash::Util - Add bucket_stats_formatted(), bump version

    Creates reports like this:

        Keys: 500 Buckets: 314/512 Quality-Score: 1.01 (Good)
        Utilized Buckets: 61.33% Optimal: 97.66% Keys In Collision: 37.20%
        Chain Length - mean: 1.59 stddev: 0.81
        Buckets 512         [0000000000000000000000000111111111111111111111122222222222233334]
        Len   0 Pct:  38.67 [#########################]
        Len   1 Pct:  34.57 [######################]
        Len   2 Pct:  19.53 [############]
        Len   3 Pct:   5.47 [####]
        Len   4 Pct:   1.17 [#]
        Len   5 Pct:   0.59 []
        Keys    500         [1111111111111111111111111111111111111111222222222222222222333334]
        Pos   1 Pct:  62.80 [########################################]
        Pos   2 Pct:  27.40 [##################]
        Pos   3 Pct:   7.40 [#####]
        Pos   4 Pct:   1.80 [#]
        Pos   5 Pct:   0.60 []

M       ext/Hash-Util/Changes
M       ext/Hash-Util/lib/Hash/Util.pm

commit 2619a8d523f24ad3e52be6894a99a56e8f9bf484
Author: Yves Orton <demerphq@gmail.com>
Date:   Sun Apr 13 13:31:55 2014 +0200

    Hash::Util - bump version to 0.17

M       ext/Hash-Util/lib/Hash/Util.pm
M       pod/perldelta.pod

commit cf7ed3eeeaf55833b895f694920774f6ad05c97c
Author: Yves Orton <demerphq@gmail.com>
Date:   Sun Apr 13 13:30:09 2014 +0200

    Hash::Util - we should do the mean/stddev on the occupied buckets, not all buckets.

    This was always intended to be the average chain-length, which implies
    that empty buckets with no chains at all are excluded.

M       ext/Hash-Util/lib/Hash/Util.pm
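
To illustrate the difference this commit describes, here is a small sketch that computes the chain-length mean over occupied buckets only, alongside the mean over all buckets. It is not the Hash::Util code; the names chain_stats and occupied_chain_stats are hypothetical. The histogram used in the note after the block is back-derived from the percentages in the bucket_stats_formatted() sample report (512 buckets, 500 keys), so it is an assumption. The sketch uses the population variance, which is also an assumption; the standard deviation is its square root.

```c
#include <stddef.h>

typedef struct {
    size_t buckets;    /* total buckets, including empty ones       */
    size_t used;       /* buckets holding at least one key          */
    size_t keys;       /* total keys                                */
    double mean_all;   /* keys / buckets (what the commit moves off) */
    double mean;       /* keys / used: average chain length         */
    double variance;   /* population variance of chain length over  */
                       /* occupied buckets; stddev = sqrt(variance) */
} chain_stats;

/* hist[n] is the number of buckets whose chain has length n. */
static inline chain_stats occupied_chain_stats(const size_t *hist, size_t nlen)
{
    chain_stats s = {0, 0, 0, 0.0, 0.0, 0.0};
    double sum_sq = 0.0;

    for (size_t len = 0; len < nlen; len++) {
        s.buckets += hist[len];
        if (len == 0)
            continue;               /* empty buckets have no chain */
        s.used += hist[len];
        s.keys += hist[len] * len;
        sum_sq += (double)hist[len] * (double)len * (double)len;
    }
    if (s.buckets)
        s.mean_all = (double)s.keys / (double)s.buckets;
    if (s.used) {
        s.mean = (double)s.keys / (double)s.used;
        s.variance = sum_sq / (double)s.used - s.mean * s.mean;
    }
    return s;
}
```

Fed the histogram {198, 177, 100, 28, 6, 3} for lengths 0 through 5, this gives 314 occupied buckets and a mean of 500/314, about 1.59, with a variance of about 0.649 (stddev about 0.81), which agrees with the sample report. Averaging over all 512 buckets would instead give about 0.98.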

commit 9e120828524726f2eab01ecb5217c239daf66591
Author: Yves Orton <demerphq@gmail.com>
Date:   Sun Apr 13 13:29:44 2014 +0200

    Hash::Util - fix typos in hash_stats() documentation

M       ext/Hash-Util/lib/Hash/Util.pm

commit ba9468d3a9aeddd565358cfc0a3d10dc0d41a76d
Author: Yves Orton <demerphq@gmail.com>
Date:   Sun Apr 13 12:54:12 2014 +0200

    hv_func.h - fix seed initialization in sdbm and djb2 hashing algorithms.

    In a previous commit I added code to "mix in" the length of the
    string into the seed used by these functions, to avoid issues with
    zero seeds, and with the hope that it makes it harder to create
    multicollision attacks against these hash functions.

    Unfortunately when I restructured the seed logic for the inline
    functions in hv_func.h I messed it up, and these hash functions
    were broken. I never noticed because they are both such bad hash
    functions for our needs that I never built with them, and we have
    no infrastructure to make it easy to test building with non-standard
    hash functions so it never got automatically tested. Hopefully
    at some point someone will find a round-tuit and teach Configure
    about selecting alternate hash functions.

M       hv_func.h
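
As a rough illustration of mixing the length into the seed, the sketch below shows seeded djb2 and sdbm variants whose initial state is seed + len. This is an assumption about the general shape only, not the code in hv_func.h, and the names sketch_djb2 and sketch_sdbm are hypothetical. It shows one way a zero seed can hurt: without the length term, a zero seed leaves the state at 0 for any run of NUL bytes, so "\0" and "\0\0" would collide.

```c
#include <stddef.h>
#include <stdint.h>

/* djb2 step: hash = hash * 33 + c */
static inline uint32_t sketch_djb2(uint32_t seed,
                                   const unsigned char *str, size_t len)
{
    uint32_t hash = seed + (uint32_t)len;   /* length mixed into the seed */
    for (size_t i = 0; i < len; i++)
        hash = ((hash << 5) + hash) + str[i];
    return hash;
}

/* sdbm step: hash = hash * 65599 + c */
static inline uint32_t sketch_sdbm(uint32_t seed,
                                   const unsigned char *str, size_t len)
{
    uint32_t hash = seed + (uint32_t)len;   /* length mixed into the seed */
    for (size_t i = 0; i < len; i++)
        hash = str[i] + (hash << 6) + (hash << 16) - hash;
    return hash;
}
```

With a zero seed, sketch_djb2 maps "\0" (length 1) to 33 and "\0\0" (length 2) to 2178, so the two no longer share a hash value.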

commit c5a5de3a2139d788715d6d1623129c5dd4dc6891
Author: Yves Orton <demerphq@gmail.com>
Date:   Fri Mar 21 17:47:45 2014 +0100

    universal.c - utf8::downgrade($x,FAIL_OK) is not supposed to treat FAIL_OK as an integer

M       pod/perldelta.pod
M       universal.c
-----------------------------------------------------------------------

--
Perl5 Master Repository


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
