perl.perl5.porters | Postings from May 2008

Re: [patch-blead] parameterize ptr_table_new(table-size)

From: Nicholas Clark
Date: May 4, 2008 11:01
Subject: Re: [patch-blead] parameterize ptr_table_new(table-size)
Message ID: 20080504180059.GX84051@plum.flirble.org
On Wed, Apr 30, 2008 at 10:14:57PM -0600, Jim Cromie wrote:
> 
> the attached patch;
> 
> - parameterizes ptr-table-new( int order ) # size = (2**N-1)
> - provides legacy wrapper ( where N == 9 -> table-size == 511 )
> - ptr-table-free gets a -DU print "table-size %d"
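[For anyone skimming, the shape of the proposed change is roughly this — a
sketch, not the actual patch; the struct and helper names here are
illustrative, and only the name ptr_table_new_N comes from the proposal:]

```c
#include <stdlib.h>

/* Illustrative stand-in for perl's ptr table. */
struct ptr_tbl_sketch {
    void **tbl_ary;     /* bucket array, tbl_max + 1 slots */
    size_t tbl_max;     /* bucket mask: 2^N - 1 */
    size_t tbl_items;   /* entries stored */
};

/* Parameterized constructor: order N gives a table size of 2^N - 1. */
static struct ptr_tbl_sketch *sketch_ptr_table_new_N(unsigned order)
{
    struct ptr_tbl_sketch *t = malloc(sizeof *t);
    t->tbl_max = ((size_t)1 << order) - 1;
    t->tbl_ary = calloc(t->tbl_max + 1, sizeof *t->tbl_ary);
    t->tbl_items = 0;
    return t;
}

/* Legacy wrapper: keeps the historical default N == 9 -> table-size 511. */
static struct ptr_tbl_sketch *sketch_ptr_table_new(void)
{
    return sketch_ptr_table_new_N(9);
}
```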

I'm uncomfortable with this. I appreciate the intent, but to me it feels like
adding a bit more complexity for minimal gain. We keep adding complexity*,
which increases our maintenance cost and increases the number of places bugs
might hide.

* (except the odd times when me or someone else manages to simplify the code)

> motivation
> 
> - Storable can expose this to users who have a workload that
> they can make useful size estimates for freeze operations.

Yes, but I suspect that is rare, and I'm not sure how many people would
know which parameters to tweak.

> - t/op/regexp_qr_embed_thr.t runs ~4 sec (of 53) faster when N=14.
> This is arguably due to fewer calls to ptr-table-split(),
> which has to recompute the hashes of all entries and move half of them.
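[To put a rough number on that claim: growing by doubling from the default
2^9 buckets up to 2^14 takes five splits, each of which rehashes everything
present at the time. A back-of-envelope counter, assuming as a simplification
that the table is full at each split — the real heuristic may differ:]

```c
/* Count doublings and approximate rehash operations needed to grow from
 * 2^from_order to 2^to_order buckets, assuming every entry present is
 * rehashed at each split (a simplification of the real ptr_table_split). */
static unsigned long rough_rehash_cost(unsigned from_order,
                                       unsigned to_order,
                                       int *splits)
{
    unsigned long buckets = 1UL << from_order;
    unsigned long target  = 1UL << to_order;
    unsigned long rehashed = 0;

    *splits = 0;
    while (buckets < target) {
        rehashed += buckets;   /* ~every entry's hash recomputed */
        buckets <<= 1;
        (*splits)++;
    }
    return rehashed;
}
```

[Starting at N == 14 skips all five of those splits outright, which is at
least consistent with the timing difference quoted below.]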

> real    0m57.305s
> user    0m39.160s
> sys     0m0.622s
> [jimc@harpo bleadperl]$
> 
> real    0m53.941s
> user    0m35.971s
> sys     0m1.126s
> [jimc@harpo ptr-new-param]$
> 
> Note however that it appears to increase sys usage, so it's not free.
> I suspect this is a common effect of malloc size?

I don't know. I'm not sure why using more memory up front would increase sys
usage, unless sys usage goes up as a side effect of swapping.

> And what about t/op/regexp_qr_embed_thr.t anyway?
> Is it representative / characteristic of any particular workload?
> (ptr-table work doesn't yet count, since it's not directly user/XS usable)
> 
> Its ptr-table workload is noteworthy:
> - its the dominant ptr-table user in t/*  (9k/11k -DU prints)
> - distinct (BIG,SMALL,)+ pattern

I'm also not comfortable with using one regression test as the thing to tune
something fairly specific in the implementation. Although I admit that
sometimes I'm guilty of using *all* the regression tests as a way of timing
something fairly general (like hash lookup or SV allocation).

Is there more "typical" code that could be made into a benchmark for perlbench
(or somesuch) and used to provide a firm, perl-version-independent way of
tuning?

> EXECUTING...
> 
> ptr-tbl-free tbl-size 5486
> ptr-tbl-free tbl-size 8
> ptr-tbl-free tbl-size 5486
> ptr-tbl-free tbl-size 8
> 
> - very tight population clustering on 3 peaks
> -- 10 entries     - 1/2 the pop
> -- 5.5k           - 3% of pop
> -- 10.4k .. 11k   - the size of a perl-clone ?

Interesting data. Thanks for this.

Would it not be easier to experiment with changing the default N in all cases
in ptr_table_new?

> TBD, Considered
> 
> - actually expose ptr_table_new_N, so Storable, threads can use it.
>     this means regen, so I'd like buy-in first
> 
> - add tag in ptr-table-new to ID the src, correlate to max_items in -DU
> 
> - add -DPTRTBL_NEWSIZE=<N> support
>      (probably needed (but ultimately insufficient;), if anyone is gonna
> try it)
>     -DPTR_TBL_SZ_CLONE   (9, per current code)
>     -DPTR_TBL_SZ_THREAD (or just hard-def it to 4; I'll defer to Jerry)
> 
> - implement a PTBL_NextLarger(PredictedEntries) macro to compute
> 	2047 for 1800 (anyone wanna offer something for me to drop in ?)
> 
> - implement a wrapper / parameter checker
>     N in  0..32 -> use PTBL_Order(N)  - much bigger than we'll want,
> 			20 is practical limit?
>          N>32   -> use PTBL_NextLarger(N)
>     this is perhaps too much macrobation ;-)
> 
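[Since the list above asks for a drop-in: one possible PTBL_NextLarger, shown
here as a function rather than a macro. The name comes from the TBD list; the
implementation is only a suggestion and has not been tried against the tree.]

```c
#include <stddef.h>

/* Smallest value of the form 2^k - 1 that is >= n,
 * e.g. 1800 -> 2047, matching the example in the TBD list. */
static size_t ptbl_next_larger(size_t n)
{
    size_t size = 1;                /* 2^1 - 1 */
    while (size < n)
        size = (size << 1) | 1;     /* 2^(k+1) - 1 */
    return size;
}
```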

Or change the default to (say) 16 (rather than 512), and make the split go
at 16, so 16/256/4096/65536; or (say) 32 and 8, so 32/256/2048/16384.
That would (also) fit in with your 3 peaks, yet not require any external
code changes.
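[Reading "make the split go at 16" as a fixed growth factor per split — an
assumption about the suggestion, not established code — the two schedules
work out as written:]

```c
#include <stddef.h>

/* Bucket count after a number of splits, growing by a fixed factor
 * per split (illustrative; not how ptr_table_split currently grows). */
static size_t size_after_splits(size_t start, size_t factor, int splits)
{
    while (splits-- > 0)
        start *= factor;
    return start;
}
```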

Also, is the hash used by the ptr_table code optimal? Does it split between
buckets optimally? I believe that I changed it most recently, based on
instrumenting the ptr_table functions, and figuring out that the previous hash
was not that efficient at partitioning. If the current hash is leaving some
table slots empty, then finding a better hash might be a better win than
external code changes.
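[One quick way to probe the empty-slots question without a full instrumented
build: hash a batch of real allocator pointers with a stand-in hash and count
unused buckets. The hash below is only a toy approximation of the pointer
hash, so treat the numbers as indicative rather than a measurement of the
real PTR_TABLE_HASH.]

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for perl's pointer hash: XOR of shifted copies of the
 * pointer value. Not the real PTR_TABLE_HASH. */
static size_t toy_ptr_hash(uintptr_t p)
{
    return (size_t)((p >> 3) ^ (p >> 10) ^ (p >> 20));
}

/* Hash nptrs freshly malloc'd pointers (stand-ins for cloned SVs) into
 * 2^order buckets and return how many buckets stay empty. */
static size_t empty_buckets(unsigned order, size_t nptrs)
{
    size_t nbuckets = (size_t)1 << order;
    unsigned *count = calloc(nbuckets, sizeof *count);
    void **ptrs = malloc(nptrs * sizeof *ptrs);
    size_t i, empty = 0;

    for (i = 0; i < nptrs; i++) {
        ptrs[i] = malloc(64);
        count[toy_ptr_hash((uintptr_t)ptrs[i]) & (nbuckets - 1)]++;
    }
    for (i = 0; i < nbuckets; i++)
        if (count[i] == 0)
            empty++;
    for (i = 0; i < nptrs; i++)
        free(ptrs[i]);
    free(ptrs);
    free(count);
    return empty;
}
```

[If a run like empty_buckets(9, 10000) reports many empty slots, that would
point at the hash rather than the table size.]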

I'd feel a lot more comfortable with tweaks that don't increase the user
visible API.

Nicholas Clark
