Re: [patch-blead] parameterize ptr_table_new(table-size)
From: Nicholas Clark
Date: May 4, 2008 11:01
Subject: Re: [patch-blead] parameterize ptr_table_new(table-size)
Message ID: 20080504180059.GX84051@plum.flirble.org
On Wed, Apr 30, 2008 at 10:14:57PM -0600, Jim Cromie wrote:
>
> the attached patch:
>
> - parameterizes ptr-table-new( int order ) # size = (2**N-1)
> - provides legacy wrapper ( where N == 9 -> table-size == 511 )
> - ptr-table-free gets a -DU print "table-size %d"
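(For readers not following sv.c closely: a rough standalone sketch of the
shape being described, with illustrative names and types rather than perl's
PTR_TBL_t or the attached patch itself, might look like this:)

    #include <stdlib.h>

    /* Illustrative stand-in for perl's pointer table; field names made up. */
    typedef struct demo_ptr_tbl {
        void   **buckets;   /* 2**order bucket heads            */
        size_t   max;       /* bucket mask, i.e. 2**order - 1   */
        size_t   items;     /* entries currently stored         */
    } demo_ptr_tbl;

    /* Parameterized constructor: the caller picks the initial order. */
    demo_ptr_tbl *
    demo_ptr_table_new_n(unsigned order)
    {
        demo_ptr_tbl *tbl = malloc(sizeof *tbl);
        if (!tbl)
            return NULL;
        tbl->max     = ((size_t)1 << order) - 1;
        tbl->items   = 0;
        tbl->buckets = calloc(tbl->max + 1, sizeof *tbl->buckets);
        if (!tbl->buckets) {
            free(tbl);
            return NULL;
        }
        return tbl;
    }

    /* Legacy wrapper: keeps the historical default, order 9 -> 511. */
    demo_ptr_tbl *
    demo_ptr_table_new(void)
    {
        return demo_ptr_table_new_n(9);
    }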
I'm uncomfortable with this. I appreciate the intent, but to me it feels like
adding a bit more complexity for minimal gain. We keep adding complexity*,
which increases our maintenance cost and increases the number of places bugs
might hide.
* (except the odd times when I or someone else manages to simplify the code)
> motivation
>
> - Storable can expose this to users who have a workload that
> they can make useful size estimates for freeze operations.
Yes, but I suspect that such workloads are rare, and I'm not sure how many
people would know which parameters to tweak.
> - t/op/regexp_qr_embed_thr.t is ~4 of 53 sec faster when N=14.
> This is arguably due to fewer calls to ptr-table-split(),
> which has to recompute hashes of all entries and move 1/2 of them.
> real 0m57.305s
> user 0m39.160s
> sys 0m0.622s
> [jimc@harpo bleadperl]$
>
> real 0m53.941s
> user 0m35.971s
> sys 0m1.126s
> [jimc@harpo ptr-new-param]$
>
> Note however that it appears to increase sys usage, so it's not free.
> I suspect this is a common effect of malloc size ?
I don't know. I'm not sure why using more memory up front would increase sys
usage, unless sys usage goes up as a side effect of swapping.
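For context, the ptr_table_split() cost mentioned above behaves roughly like
the standalone sketch below (illustrative code, not the sv.c implementation):
doubling the table means recomputing each entry's bucket against the new
mask, and on average half of the entries get relinked into the new upper
half, so a pre-sized table simply avoids paying that cost repeatedly while
it grows.

    #include <stdint.h>
    #include <stdlib.h>

    struct demo_ent {                   /* one stored pointer mapping */
        struct demo_ent *next;
        const void      *key;
    };

    /* Toy pointer hash; perl's real code has its own. */
    static size_t demo_hash(const void *p) { return (uintptr_t)p >> 3; }

    /* Double the bucket array and rehash every entry.  With twice as
     * many buckets, each entry either stays put or moves to the new
     * upper half, so roughly half of them are relinked per split. */
    static int
    demo_split(struct demo_ent ***ary_p, size_t *max_p)
    {
        size_t old_size = *max_p + 1;
        size_t new_max  = 2 * old_size - 1;
        struct demo_ent **ary =
            realloc(*ary_p, (new_max + 1) * sizeof *ary);
        if (!ary)
            return -1;
        for (size_t i = old_size; i <= new_max; i++)
            ary[i] = NULL;              /* clear the new upper half   */

        for (size_t i = 0; i < old_size; i++) {
            struct demo_ent **head = &ary[i], *ent = *head;
            while (ent) {
                struct demo_ent *next = ent->next;
                size_t slot = demo_hash(ent->key) & new_max;
                if (slot != i) {            /* ~half take this branch */
                    *head     = next;       /* unlink from old bucket */
                    ent->next = ary[slot];  /* push onto new bucket   */
                    ary[slot] = ent;
                } else {
                    head = &ent->next;
                }
                ent = next;
            }
        }
        *ary_p = ary;
        *max_p = new_max;
        return 0;
    }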
> And what about t/op/regexp_qr_embed_thr.t anyway ?
> is it representative / characteristic of any particular workload ?
> (ptr-table work doesn't yet count, since it's not directly user/XS usable)
>
> its ptr-table workload is noteworthy:
> - it's the dominant ptr-table user in t/* (9k/11k -DU prints)
> - distinct (BIG,SMALL,)+ pattern
I'm also not comfortable with using one regression test as the thing to tune
something fairly special in the implementation. Although I admit that sometimes
I'm guilty of using *all* the regression tests as a way of timing something
fairly general (like hash lookup or SV allocation).
Is there more "typical" code that could be made into a benchmark for perlbench
(or some such) and used to provide a firm, perl-version-independent way of
tuning?
> EXECUTING...
>
> ptr-tbl-free tbl-size 5486
> ptr-tbl-free tbl-size 8
> ptr-tbl-free tbl-size 5486
> ptr-tbl-free tbl-size 8
>
> - very tight population clustering on 3 peaks
> -- 10 entries - 1/2 the pop
> -- 5.5k - 3% of pop
> -- 10.4k .. 11k - the size of a perl-clone ?
Interesting data. Thanks for this.
Would it not be easier to experiment with changing the default N in all cases
in ptr_table_new ?
> TBD, Considered
>
> - actually expose ptr_table_new_N, so Storable, threads can use it.
> this means regen, so likes buy-in 1st
>
> - add tag in ptr-table-new to ID the src, correlate to max_items in -DU
>
> - add -DPTRTBL_NEWSIZE=<N> support
> (probably needed (but ultimately insufficient;), if anyone is gonna
> try it)
> -DPTR_TBL_SZ_CLONE (9, per current code)
> -DPTR_TBL_SZ_THREAD (or just hard-def it to 4; I'll defer to Jerry)
>
> - implement a PTBL_NextLarger(PredictedEntries) macro to compute
> 2047 for 1800 (anyone wanna offer something for me to drop in ?)
>
> - implement a wrapper / parameter checker
> N in 0..32 -> use PTBL_Order(N) - much bigger than we'll want,
> 20 is practical limit?
> N>32 -> use PTBL_NextLarger(N)
> this is perhaps too much macrobation ;-)
>
Or change the default to (say) 16 (rather than 512), and make each split grow
the table by a factor of 16, so 16/256/4096/65536; or (say) 32 and a factor
of 8, so 32/256/2048/16384.
That would (also) fit in with your 3 peaks, yet not require any external
code changes.
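On the PTBL_NextLarger(PredictedEntries) item in the quoted list, one
possible shape (the name and interface are Jim's suggestion; the body here
is only an illustration) is to round the predicted entry count up to the
next 2**k - 1, so 1800 becomes 2047:

    /* Round a predicted entry count up to the next (2**k - 1), e.g.
     * 1800 -> 2047.  Sketch only; a real version would presumably cap
     * k at whatever practical limit (20 or so) gets agreed on. */
    unsigned long
    ptbl_next_larger(unsigned long predicted)
    {
        unsigned long size = 1;          /* 2**k, starting at k = 0  */
        while (size && size - 1 < predicted)
            size <<= 1;
        return size ? size - 1 : ~0UL;   /* saturate on overflow     */
    }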
Also, is the hash used by the ptr_table code optimal? Does it distribute
entries between buckets evenly? I believe that I changed it most recently,
based on instrumenting the ptr_table functions and figuring out that the
previous hash was not that efficient at partitioning. If the current hash is
leaving some table slots empty, then finding a better hash might be a bigger
win than external code changes.
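If anyone wants to repeat that sort of instrumentation, a crude occupancy
report along the lines of the standalone sketch below (toy hash and a fake
workload, not the code in sv.c; an in-tree measurement would count inside
ptr_table_store() instead) is usually enough to show whether a hash is
leaving slots empty or piling entries onto a few chains:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Toy stand-in for the pointer hash under test. */
    static size_t hash_ptr(const void *p) { return (uintptr_t)p >> 3; }

    /* Bucket n simulated pointers into 2**order slots and report how
     * many slots stay empty and how long the worst chain gets. */
    static void
    report_occupancy(unsigned order, size_t n)
    {
        size_t  mask   = ((size_t)1 << order) - 1;
        size_t *counts = calloc(mask + 1, sizeof *counts);
        size_t  empty  = 0, worst = 0;

        if (!counts)
            return;
        /* Crude proxy for a clone-ish workload: heap addresses. */
        for (size_t i = 0; i < n; i++) {
            void  *p    = malloc(32);   /* leaked; fine for a one-off */
            size_t slot = hash_ptr(p) & mask;
            if (++counts[slot] > worst)
                worst = counts[slot];
        }
        for (size_t i = 0; i <= mask; i++)
            empty += (counts[i] == 0);

        printf("order %u: %zu/%zu empty slots, worst chain %zu\n",
               order, empty, mask + 1, worst);
        free(counts);
    }

    int main(void)
    {
        /* entry counts borrowed from the -DU output quoted earlier */
        report_occupancy(9, 5486);
        report_occupancy(14, 5486);
        return 0;
    }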
I'd feel a lot more comfortable with tweaks that don't increase the
user-visible API.
Nicholas Clark