
[patch-blead] parameterize ptr_table_new(table-size)

From: Jim Cromie
Date: April 30, 2008 21:15
Subject: [patch-blead] parameterize ptr_table_new(table-size)
Message ID: 481943C1.8090607@gmail.com

The attached patch:

- parameterizes ptr_table_new(int order)  # size = (2**N - 1)
- provides a legacy wrapper (where N == 9 -> table-size == 511)
- makes ptr_table_free emit a -DU print of "table-size %d"
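
To make the shape of the change concrete, here is a minimal standalone sketch. It is not the patch itself: the `demo_ptr_tbl` struct and the `demo_ptr_table_*` functions are hypothetical stand-ins, plain malloc/calloc replace perl's own allocators, and the upper bound on the order is an assumed sanity limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one pointer-table entry. */
struct demo_ptr_tbl_ent {
    struct demo_ptr_tbl_ent *next;
    const void *oldval;
    void *newval;
};

/* Hypothetical stand-in for the pointer table itself. */
struct demo_ptr_tbl {
    struct demo_ptr_tbl_ent **ary; /* bucket array, max + 1 slots */
    size_t max;                    /* 2**order - 1, doubles as the hash mask */
    size_t items;                  /* entries currently stored */
};

/* Parameterized constructor: 2**order buckets, max == 2**order - 1. */
struct demo_ptr_tbl *demo_ptr_table_new_n(unsigned order)
{
    struct demo_ptr_tbl *tbl;

    assert(order >= 1 && order <= 20); /* assumed sanity bounds */
    tbl = malloc(sizeof *tbl);
    if (!tbl)
        return NULL;
    tbl->max = ((size_t)1 << order) - 1;
    tbl->items = 0;
    tbl->ary = calloc(tbl->max + 1, sizeof *tbl->ary);
    if (!tbl->ary) {
        free(tbl);
        return NULL;
    }
    return tbl;
}

/* Legacy wrapper: order 9 reproduces the existing table-size of 511. */
struct demo_ptr_tbl *demo_ptr_table_new(void)
{
    return demo_ptr_table_new_n(9);
}

/* The sketch never stores entries, so only the bucket array is released. */
void demo_ptr_table_free(struct demo_ptr_tbl *tbl)
{
    if (!tbl)
        return;
    free(tbl->ary);
    free(tbl);
}
```

With this shape, existing callers keep calling the zero-argument form unchanged, while a caller that knows its workload can ask for a smaller or larger table up front.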

Motivation

- Storable can expose this to users whose workloads allow
useful size estimates for freeze operations.

- t/op/regexp_qr_embed_thr.t runs ~3.4 sec faster (57.3 -> 53.9 sec real)
when N=14. This is arguably due to fewer calls to ptr_table_split(),
which has to recompute the hashes of all entries and move about 1/2 of them.

real    0m57.305s
user    0m39.160s
sys     0m0.622s
[jimc@harpo bleadperl]$

real    0m53.941s
user    0m35.971s
sys     0m1.126s
[jimc@harpo ptr-new-param]$

Note, however, that it appears to increase sys usage, so it's not free.
I suspect this is a common effect of malloc size?
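
As a rough way to see why fewer splits could matter, the sketch below counts how many doublings a table goes through while growing to a given number of entries. It is a simplified model, not perl's code: `simulate_growth` and `struct growth_cost` are made-up names, and it assumes a split is triggered purely by the item count exceeding the table max, with every stored entry revisited on each split.

```c
#include <assert.h>
#include <stddef.h>

/* Result of simulating table growth; names are illustrative only. */
struct growth_cost {
    unsigned splits;   /* number of doublings performed */
    size_t rehashed;   /* total entries revisited across all splits */
};

/*
 * Simulate inserting `entries` distinct pointers into a table that starts
 * with 2**order buckets (max == 2**order - 1) and doubles whenever the
 * item count exceeds max. Assumption: the split depends on the count only.
 */
struct growth_cost simulate_growth(unsigned order, size_t entries)
{
    struct growth_cost cost = { 0, 0 };
    size_t max;
    size_t items;

    assert(order >= 1 && order < 32);
    max = ((size_t)1 << order) - 1;

    for (items = 1; items <= entries; items++) {
        if (items > max) {
            cost.splits++;
            cost.rehashed += items;   /* every stored entry is revisited */
            max = (max + 1) * 2 - 1;  /* double the bucket count */
        }
    }
    return cost;
}
```

Under these assumptions, `simulate_growth(9, 11000)` reports 5 splits and 15872 entry visits, while `simulate_growth(14, 11000)` reports none, since 16383 already covers the ~11k entries seen in the large tables.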

And what about t/op/regexp_qr_embed_thr.t anyway?
Is it representative / characteristic of any particular workload?
(ptr-table work doesn't yet count, since it's not directly user/XS usable.)

Its ptr-table workload is noteworthy:
- it's the dominant ptr-table user in t/*  (9k/11k -DU prints)
- distinct (BIG,SMALL,)+ pattern

EXECUTING...

ptr-tbl-free tbl-size 5486
ptr-tbl-free tbl-size 8
ptr-tbl-free tbl-size 5486
ptr-tbl-free tbl-size 8

- very tight population clustering on 3 peaks
-- 10 entries     - 1/2 the pop
-- 5.5k           - 3% of pop
-- 10.4k .. 11k   - the size of a perl-clone ?

[jimc@harpo ptr-new-param]$ hist.sv -f2 -q100 foo
bucket quantization: 100
bucket centr, population, avg in bucket
         50:       4864         10
       5450:         24       5496
       5550:        283       5501
      10450:        184      10455
      10650:        960      10623
      10750:        716      10723
      10850:         28      10839
      10950:       1883      10941
      11050:        786      11010
hits: 9728 avg: 5283.347039
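
For readers who want to reproduce this kind of summary from their own -DU output, here is a small sketch of the bucketing. It is only a guess at what hist.sv does (that script is not shown): the `demo_hist` names are invented, the bucket centre is assumed to be the midpoint of each quantum-wide range, and buckets are kept in first-seen order rather than sorted.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_BUCKETS 256

struct demo_bucket {
    unsigned long centre;     /* midpoint of the bucket's range */
    unsigned long population; /* samples that fell in the bucket */
    unsigned long sum;        /* running sum; average = sum / population */
};

struct demo_hist {
    unsigned long quantum;    /* bucket width, e.g. 100 */
    size_t used;              /* buckets in use */
    struct demo_bucket buckets[DEMO_MAX_BUCKETS];
};

void demo_hist_init(struct demo_hist *h, unsigned long quantum)
{
    assert(quantum > 0);
    h->quantum = quantum;
    h->used = 0;
}

/* Record one table size. Returns 0 on success, -1 if out of bucket slots. */
int demo_hist_add(struct demo_hist *h, unsigned long size)
{
    unsigned long centre = (size / h->quantum) * h->quantum + h->quantum / 2;
    size_t i;

    /* Linear search is fine for the handful of buckets seen here. */
    for (i = 0; i < h->used; i++) {
        if (h->buckets[i].centre == centre) {
            h->buckets[i].population++;
            h->buckets[i].sum += size;
            return 0;
        }
    }
    if (h->used == DEMO_MAX_BUCKETS)
        return -1;
    h->buckets[h->used].centre = centre;
    h->buckets[h->used].population = 1;
    h->buckets[h->used].sum = size;
    h->used++;
    return 0;
}
```

With a quantum of 100, a size of 5486 lands in the bucket centred on 5450 and a size of 8 in the one centred on 50, which matches the layout of the table above.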

I'll guess that the small ptr-table is the one here, in ./ext/threads/threads.xs:

    1134             PL_ptr_table = ptr_table_new();
    1135             S_ithread_set(aTHX_ thread);
    1136             /* Ensure 'meaningful' addresses retain their meaning */
    1137             ptr_table_store(PL_ptr_table, &other_perl->Isv_undef, &PL_sv_undef);
    1138             ptr_table_store(PL_ptr_table, &other_perl->Isv_no, &PL_sv_no);
    1139             ptr_table_store(PL_ptr_table, &other_perl->Isv_yes, &PL_sv_yes);
    1140             params = (AV *)sv_dup((SV*)params_copy, &clone_params);
    1141             S_ithread_set(aTHX_ current_thread);
    1142             SvREFCNT_dec(clone_params.stashes);
    1143             SvREFCNT_inc_void(params);
    1144             ptr_table_free(PL_ptr_table);

The small ptr-table size used here argues for a parameterized (N=4)
call at 1134. I'm not clear why the 3 stores at 1137-9 don't match
the 9-10 entries seen in the populations.

TBD / considered

- actually expose ptr_table_new_N, so Storable and threads can use it.
     this means a regen, so I'd like buy-in 1st

- add a tag in ptr_table_new to ID the source, to correlate with max_items in -DU

- add -DPTRTBL_NEWSIZE=<N> support
     (probably needed, but ultimately insufficient, if anyone is going to try it)
     -DPTR_TBL_SZ_CLONE   (9, per current code)
     -DPTR_TBL_SZ_THREAD  (or just hard-def it to 4; I'll defer to Jerry)

- implement a PTBL_NextLarger(PredictedEntries) macro to compute
     2047 for 1800 (anyone want to offer something for me to drop in?)

- implement a wrapper / parameter checker
     N in 0..32 -> use PTBL_Order(N)  - much bigger than we'll want;
                                        20 is the practical limit?
     N > 32     -> use PTBL_NextLarger(N)
     this is perhaps too much macrobation ;-)
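
One possible shape for the last two items, written as plain functions rather than macros so it can be checked in isolation. This is a sketch only: the `demo_ptbl_*` names are invented, the 0..32 cutoff is taken from the list above, and clamping the order to 20 is an assumption based on the "practical limit" guess.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_ORDER_CUTOFF    32u  /* from the list above: 0..32 is read as an order */
#define DEMO_PRACTICAL_ORDER 20u  /* assumption: clamp at the guessed practical limit */

/* Smallest value of the form 2**k - 1 that is >= predicted. */
unsigned long demo_ptbl_next_larger(unsigned long predicted)
{
    unsigned long size = predicted;
    unsigned shift;

    /* Smear the highest set bit into every lower bit position. */
    for (shift = 1; shift < sizeof size * CHAR_BIT; shift <<= 1)
        size |= size >> shift;

    assert(size >= predicted);
    return size;
}

/* Table size for a given order: 2**order - 1, with the order clamped. */
unsigned long demo_ptbl_order(unsigned order)
{
    if (order > DEMO_PRACTICAL_ORDER)
        order = DEMO_PRACTICAL_ORDER;
    return (1UL << order) - 1;
}

/* Wrapper / parameter checker: small values are orders, large are counts. */
unsigned long demo_ptbl_size(unsigned long n)
{
    return n <= DEMO_ORDER_CUTOFF
        ? demo_ptbl_order((unsigned)n)
        : demo_ptbl_next_larger(n);
}
```

Here `demo_ptbl_next_larger(1800)` gives 2047 and `demo_ptbl_size(9)` gives 511. One cost of overloading a single argument this way is that a predicted entry count of 32 or fewer cannot be expressed, since it is always read as an order.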


