[patch] add Perl_ptr_table_new(initsize) Was: 5.12 release/roadmap?
From:
Jim Cromie
Date:
June 2, 2009 16:14
Subject:
[patch] add Perl_ptr_table_new(initsize) Was: 5.12 release/roadmap?
Message ID:
cfe85dfa0906021614g747fda59n624a59dcfbee9f6f@mail.gmail.com
On Thu, May 28, 2009 at 7:40 AM, Richard Foley <Richard.Foley@rfi.net> wrote:
> Is there a 5.12 release plan?
>
> If so, what is it, please ?-)
>
My particular 5.12 itch:
1- add Perl_ptr_table_new_n (init_size) to the API
2- patch Storable to use it, if the Storable user wants to estimate the size of
their need.
Here's why:
A. I've done 1 and 2 already
1 is a simpler version of my recent RFC patch
http://www.nntp.perl.org/group/perl.perl5.porters/2009/05/msg146734.html
- it does initial-size only, no Growth, no PTR_TBL_t changes
- now init_size is linear, not the 2**N size of the table
-- easier to use, less prone to sucking all the memory.
2 is my demo-only hack to Storable:
the XS code reads $Storable::Size in init_store_context();
if that's set to something, it's an active presize, and is used.
At the end of freeze, before the ptr-table is freed,
if the same $Storable::Size is set, we read the ptr-table and set
Storable::Sizeout
3 my benchmark script shows:
[jimc@harpo foo1]$ ./perl -Ilib ../storit.pl -x100 -y100 -r10 -b -s32
# x*y -> guess 10000 ~~> 2**13
# 320000 -> 18.2877123795495
Benchmark: running guess@2^18, legacy@2^18 for at least 3 CPU seconds...
guess@2^18: 4 wallclock secs ( 3.84 usr + 0.16 sys = 4.00 CPU) @
9.50/s (n=38)
legacy@2^18: 3 wallclock secs ( 3.04 usr + 0.06 sys = 3.10 CPU) @
6.45/s (n=20)
Rate legacy@2^18 guess@2^18
legacy@2^18 6.45/s -- -32%
guess@2^18 9.50/s 47% --
# 640000 -> 19.2877123795495
Benchmark: running guess@2^19, legacy@2^19 for at least 3 CPU seconds...
guess@2^19: 4 wallclock secs ( 3.11 usr + 0.12 sys = 3.23 CPU) @
4.02/s (n=13)
legacy@2^19: 3 wallclock secs ( 3.02 usr + 0.06 sys = 3.08 CPU) @
2.60/s (n=8)
Rate legacy@2^19 guess@2^19
legacy@2^19 2.60/s -- -35%
guess@2^19 4.02/s 55% --
# 960000 -> 19.8726748802706
Benchmark: running guess@2^19, legacy@2^19 for at least 3 CPU seconds...
guess@2^19: 3 wallclock secs ( 3.41 usr + 0.09 sys = 3.50 CPU) @
2.57/s (n=9)
legacy@2^19: 3 wallclock secs ( 3.31 usr + 0.06 sys = 3.37 CPU) @
1.78/s (n=6)
Rate legacy@2^19 guess@2^19
legacy@2^19 1.78/s -- -31%
guess@2^19 2.57/s 44% --
# 1280000 -> 20.2877123795495
Benchmark: running guess@2^20, legacy@2^20 for at least 3 CPU seconds...
guess@2^20: 3 wallclock secs ( 3.10 usr + 0.11 sys = 3.21 CPU) @
1.87/s (n=6)
legacy@2^20: 4 wallclock secs ( 3.48 usr + 0.06 sys = 3.54 CPU) @
1.13/s (n=4)
Rate legacy@2^20 guess@2^20
legacy@2^20 1.13/s -- -40%
guess@2^20 1.87/s 65% --
# 1600000 -> 20.6096404744368
Benchmark: running guess@2^20, legacy@2^20 for at least 3 CPU seconds...
guess@2^20: 4 wallclock secs ( 3.31 usr + 0.10 sys = 3.41 CPU) @
1.47/s (n=5)
legacy@2^20: 3 wallclock secs ( 3.09 usr + 0.06 sys = 3.15 CPU) @
0.95/s (n=3)
(warning: too few iterations for a reliable count)
s/iter legacy@2^20 guess@2^20
legacy@2^20 1.05 -- -35%
guess@2^20 0.682 54% --
# 1920000 -> 20.8726748802706
Benchmark: running guess@2^20, legacy@2^20 for at least 3 CPU seconds...
guess@2^20: 4 wallclock secs ( 3.38 usr + 0.10 sys = 3.48 CPU) @
1.15/s (n=4)
legacy@2^20: 4 wallclock secs ( 3.69 usr + 0.06 sys = 3.75 CPU) @
0.80/s (n=3)
(warning: too few iterations for a reliable count)
s/iter legacy@2^20 guess@2^20
legacy@2^20 1.25 -- -30%
guess@2^20 0.870 44% --
# 2240000 -> 21.0950673016071
Benchmark: running guess@2^21, legacy@2^21 for at least 3 CPU seconds...
guess@2^21: 4 wallclock secs ( 3.73 usr + 0.13 sys = 3.86 CPU) @
1.04/s (n=4)
legacy@2^21: 4 wallclock secs ( 3.28 usr + 0.07 sys = 3.35 CPU) @
0.60/s (n=2)
(warning: too few iterations for a reliable count)
s/iter legacy@2^21 guess@2^21
legacy@2^21 1.67 -- -42%
guess@2^21 0.965 74% --
# 2560000 -> 21.2877123795495
Benchmark: running guess@2^21, legacy@2^21 for at least 3 CPU seconds...
guess@2^21: 3 wallclock secs ( 3.32 usr + 0.10 sys = 3.42 CPU) @
0.88/s (n=3)
(warning: too few iterations for a reliable count)
legacy@2^21: 4 wallclock secs ( 3.61 usr + 0.08 sys = 3.69 CPU) @
0.54/s (n=2)
(warning: too few iterations for a reliable count)
s/iter legacy@2^21 guess@2^21
legacy@2^21 1.85 -- -38%
guess@2^21 1.14 62% --
# 2880000 -> 21.4576373809918
Benchmark: running guess@2^21, legacy@2^21 for at least 3 CPU seconds...
guess@2^21: 4 wallclock secs ( 3.74 usr + 0.11 sys = 3.85 CPU) @
0.78/s (n=3)
(warning: too few iterations for a reliable count)
legacy@2^21: 4 wallclock secs ( 3.89 usr + 0.08 sys = 3.97 CPU) @
0.50/s (n=2)
(warning: too few iterations for a reliable count)
s/iter legacy@2^21 guess@2^21
legacy@2^21 1.99 -- -35%
guess@2^21 1.28 55% --
# 3200000 -> 21.6096404744368
Benchmark: running guess@2^21, legacy@2^21 for at least 3 CPU seconds...
guess@2^21: 4 wallclock secs ( 4.24 usr + 0.12 sys = 4.36 CPU) @
0.69/s (n=3)
(warning: too few iterations for a reliable count)
legacy@2^21: 5 wallclock secs ( 4.29 usr + 0.08 sys = 4.37 CPU) @
0.46/s (n=2)
(warning: too few iterations for a reliable count)
s/iter legacy@2^21 guess@2^21
legacy@2^21 2.19 -- -33%
guess@2^21 1.45 50% --
# took 109 samples
# avg size 1027112.93
1..1
ok 1
Those are non-trivial performance improvements: in the runs shown above,
guess is 44% to 74% faster than legacy.
- this is an application-relevant benchmark.
- Storable-heavy apps will benefit.
- Storable workloads are the poster child of unpredictability
-- except at the user/app level
-- many users will have predictable loads
-- estimatable from the size of previously stored data.
- iteration counts are low (n=3 at the largest sizes), indicating where the time was spent.
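The "# 320000 -> 18.2877123795495" lines in the output pair an element count with its base-2 log, and the benchmark labels (guess@2^18) use the integer part. A minimal C sketch of that round-down, for illustration only: floor_log2 and presize_buckets are made-up names, and this message does not show how the patch itself turns init_size into a bucket count.

```c
#include <stddef.h>

/* Largest N such that 2**N <= estimate; returns 0 for estimates below 2. */
static unsigned floor_log2(size_t estimate)
{
    unsigned n = 0;
    while (estimate >>= 1)
        n++;
    return n;
}

/* Bucket count a table would start with under this (assumed) round-down rule. */
static size_t presize_buckets(size_t estimate)
{
    return (size_t)1 << floor_log2(estimate);
}
```

With this rule an estimate of 10000 maps to 2**13 and 320000 maps to 2**18, matching the labels printed by the script.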
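For small ptr-tables, things are much tighter
- no benefit to guessing below 2**9
-- this seems sensible: the alloc is still done, and 512 byte slabs are cheap.
-- most samples below 512 show ~ 1-2% loss vs legacy
- for tables > 512, guessing looks about 10% better just past the threshold (550, 600),
though the runs shown below are noisy, ranging from 17% slower to 34% faster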
[jimc@harpo foo1]$ ./perl -Ilib ../storit.pl -x5 -y10 -r20 -b5
# x*y -> guess 50 ~~> 2**5
# 50 -> 5.64385618977472
Benchmark: running guess@2^5, legacy@2^5 for at least 5 CPU seconds...
guess@2^5: 5 wallclock secs ( 5.28 usr + 0.00 sys = 5.28 CPU) @
26893.37/s (n=141997)
legacy@2^5: 6 wallclock secs ( 5.24 usr + 0.00 sys = 5.24 CPU) @
27479.01/s (n=143990)
Rate guess@2^5 legacy@2^5
guess@2^5 26893/s -- -2%
legacy@2^5 27479/s 2% --
# 100 -> 6.64385618977473
Benchmark: running guess@2^6, legacy@2^6 for at least 5 CPU seconds...
guess@2^6: 5 wallclock secs ( 5.25 usr + 0.00 sys = 5.25 CPU) @
21763.43/s (n=114258)
legacy@2^6: 6 wallclock secs ( 5.24 usr + 0.01 sys = 5.25 CPU) @
22122.10/s (n=116141)
Rate guess@2^6 legacy@2^6
guess@2^6 21763/s -- -2%
legacy@2^6 22122/s 2% --
# 150 -> 7.22881869049588
Benchmark: running guess@2^7, legacy@2^7 for at least 5 CPU seconds...
guess@2^7: 5 wallclock secs ( 5.25 usr + 0.00 sys = 5.25 CPU) @
18100.00/s (n=95025)
legacy@2^7: 6 wallclock secs ( 5.22 usr + 0.00 sys = 5.22 CPU) @
18547.32/s (n=96817)
Rate guess@2^7 legacy@2^7
guess@2^7 18100/s -- -2%
legacy@2^7 18547/s 2% --
# 200 -> 7.64385618977472
Benchmark: running guess@2^7, legacy@2^7 for at least 5 CPU seconds...
guess@2^7: 7 wallclock secs ( 5.38 usr + 0.02 sys = 5.40 CPU) @
15170.19/s (n=81919)
legacy@2^7: 6 wallclock secs ( 5.24 usr + 0.01 sys = 5.25 CPU) @
15999.81/s (n=83999)
Rate guess@2^7 legacy@2^7
guess@2^7 15170/s -- -5%
legacy@2^7 16000/s 5% --
# 250 -> 7.96578428466209
Benchmark: running guess@2^7, legacy@2^7 for at least 5 CPU seconds...
guess@2^7: 6 wallclock secs ( 5.26 usr + 0.01 sys = 5.27 CPU) @
12397.15/s (n=65333)
legacy@2^7: 6 wallclock secs ( 5.27 usr + 0.01 sys = 5.28 CPU) @
13882.39/s (n=73299)
Rate guess@2^7 legacy@2^7
guess@2^7 12397/s -- -11%
legacy@2^7 13882/s 12% --
# 300 -> 8.22881869049588
Benchmark: running guess@2^8, legacy@2^8 for at least 5 CPU seconds...
guess@2^8: 6 wallclock secs ( 5.25 usr + 0.00 sys = 5.25 CPU) @
12218.10/s (n=64145)
legacy@2^8: 5 wallclock secs ( 5.27 usr + 0.01 sys = 5.28 CPU) @
12373.67/s (n=65333)
Rate guess@2^8 legacy@2^8
guess@2^8 12218/s -- -1%
legacy@2^8 12374/s 1% --
# 350 -> 8.45121111183233
Benchmark: running guess@2^8, legacy@2^8 for at least 5 CPU seconds...
guess@2^8: 5 wallclock secs ( 5.28 usr + 0.01 sys = 5.29 CPU) @
10977.32/s (n=58070)
legacy@2^8: 5 wallclock secs ( 5.31 usr + 0.00 sys = 5.31 CPU) @
11142.37/s (n=59166)
Rate guess@2^8 legacy@2^8
guess@2^8 10977/s -- -1%
legacy@2^8 11142/s 2% --
# 400 -> 8.64385618977473
Benchmark: running guess@2^8, legacy@2^8 for at least 5 CPU seconds...
guess@2^8: 6 wallclock secs ( 5.29 usr + 0.00 sys = 5.29 CPU) @
10064.65/s (n=53242)
legacy@2^8: 5 wallclock secs ( 5.23 usr + 0.01 sys = 5.24 CPU) @
10160.69/s (n=53242)
Rate guess@2^8 legacy@2^8
guess@2^8 10065/s -- -1%
legacy@2^8 10161/s 1% --
# 450 -> 8.81378119121704
Benchmark: running guess@2^8, legacy@2^8 for at least 5 CPU seconds...
guess@2^8: 6 wallclock secs ( 5.32 usr + 0.00 sys = 5.32 CPU) @
8187.03/s (n=43555)
legacy@2^8: 6 wallclock secs ( 5.27 usr + 0.01 sys = 5.28 CPU) @
8249.05/s (n=43555)
Rate guess@2^8 legacy@2^8
guess@2^8 8187/s -- -1%
legacy@2^8 8249/s 1% --
# 500 -> 8.96578428466209
Benchmark: running guess@2^8, legacy@2^8 for at least 5 CPU seconds...
guess@2^8: 5 wallclock secs ( 5.30 usr + 0.00 sys = 5.30 CPU) @
7585.09/s (n=40201)
legacy@2^8: 6 wallclock secs ( 5.33 usr + 0.00 sys = 5.33 CPU) @
7684.62/s (n=40959)
Rate guess@2^8 legacy@2^8
guess@2^8 7585/s -- -1%
legacy@2^8 7685/s 1% --
# 550 -> 9.10328780841202
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 5 wallclock secs ( 5.33 usr + 0.01 sys = 5.34 CPU) @
7817.79/s (n=41747)
legacy@2^9: 6 wallclock secs ( 5.34 usr + 0.00 sys = 5.34 CPU) @
7121.16/s (n=38027)
Rate legacy@2^9 guess@2^9
legacy@2^9 7121/s -- -9%
guess@2^9 7818/s 10% --
# 600 -> 9.22881869049588
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 5 wallclock secs ( 5.25 usr + 0.00 sys = 5.25 CPU) @
7245.14/s (n=38037)
legacy@2^9: 5 wallclock secs ( 5.24 usr + 0.00 sys = 5.24 CPU) @
6647.71/s (n=34834)
Rate legacy@2^9 guess@2^9
legacy@2^9 6648/s -- -8%
guess@2^9 7245/s 9% --
# 650 -> 9.34429590791582
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 6 wallclock secs ( 5.38 usr + 0.01 sys = 5.39 CPU) @
5193.32/s (n=27992)
legacy@2^9: 6 wallclock secs ( 5.31 usr + 0.01 sys = 5.32 CPU) @
6256.20/s (n=33283)
Rate guess@2^9 legacy@2^9
guess@2^9 5193/s -- -17%
legacy@2^9 6256/s 20% --
# 700 -> 9.45121111183233
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 5 wallclock secs ( 5.40 usr + 0.02 sys = 5.42 CPU) @
6258.86/s (n=33923)
legacy@2^9: 6 wallclock secs ( 5.02 usr + 0.01 sys = 5.03 CPU) @
4685.09/s (n=23566)
Rate legacy@2^9 guess@2^9
legacy@2^9 4685/s -- -25%
guess@2^9 6259/s 34% --
# 750 -> 9.55074678538324
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 5 wallclock secs ( 5.23 usr + 0.04 sys = 5.27 CPU) @
5943.26/s (n=31321)
legacy@2^9: 6 wallclock secs ( 5.86 usr + 0.01 sys = 5.87 CPU) @
5038.84/s (n=29578)
Rate legacy@2^9 guess@2^9
legacy@2^9 5039/s -- -15%
guess@2^9 5943/s 18% --
# 800 -> 9.64385618977473
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 5 wallclock secs ( 5.05 usr + 0.01 sys = 5.06 CPU) @
5137.75/s (n=25997)
legacy@2^9: 6 wallclock secs ( 5.28 usr + 0.02 sys = 5.30 CPU) @
5325.28/s (n=28224)
Rate guess@2^9 legacy@2^9
guess@2^9 5138/s -- -4%
legacy@2^9 5325/s 4% --
# 850 -> 9.73131903102506
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 7 wallclock secs ( 6.81 usr + 0.00 sys = 6.81 CPU) @
4262.85/s (n=29030)
legacy@2^9: 6 wallclock secs ( 5.26 usr + 0.00 sys = 5.26 CPU) @
5061.98/s (n=26626)
Rate guess@2^9 legacy@2^9
guess@2^9 4263/s -- -16%
legacy@2^9 5062/s 19% --
# 900 -> 9.81378119121704
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 7 wallclock secs ( 6.77 usr + 0.01 sys = 6.78 CPU) @
3569.17/s (n=24199)
legacy@2^9: 5 wallclock secs ( 5.23 usr + 0.00 sys = 5.23 CPU) @
4240.54/s (n=22178)
Rate guess@2^9 legacy@2^9
guess@2^9 3569/s -- -16%
legacy@2^9 4241/s 19% --
# 950 -> 9.89178370321831
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 7 wallclock secs ( 6.72 usr + 0.00 sys = 6.72 CPU) @
3408.18/s (n=22903)
legacy@2^9: 6 wallclock secs ( 5.63 usr + 0.00 sys = 5.63 CPU) @
3866.43/s (n=21768)
Rate guess@2^9 legacy@2^9
guess@2^9 3408/s -- -12%
legacy@2^9 3866/s 13% --
# 1000 -> 9.96578428466209
Benchmark: running guess@2^9, legacy@2^9 for at least 5 CPU seconds...
guess@2^9: 5 wallclock secs ( 5.03 usr + 0.00 sys = 5.03 CPU) @
3530.42/s (n=17758)
legacy@2^9: 5 wallclock secs ( 5.22 usr + 0.00 sys = 5.22 CPU) @
3922.22/s (n=20474)
Rate guess@2^9 legacy@2^9
guess@2^9 3530/s -- -10%
legacy@2^9 3922/s 11% --
# took 1216979 samples
# avg size 439.16
1..1
ok 1
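For intuition on why presizing matters more as tables get big: a chained table that doubles when it fills has to re-bucket every entry on each doubling. The sketch below is a toy model, not the perl core code. All toy_* names are made up, and the growth rule (double once the item count exceeds the bucket mask) is an assumption, as is using 512 as the default size (suggested only by the 2**9 threshold above).

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy chained pointer table; NOT the perl core implementation. */
typedef struct toy_ent { struct toy_ent *next; const void *key; void *val; } toy_ent;
typedef struct {
    toy_ent **buckets;
    size_t    max;     /* bucket count - 1, used as a mask */
    size_t    items;
    unsigned  splits;  /* number of times the table doubled */
} toy_tbl;

static size_t toy_hash(const void *p)
{
    uintptr_t u = (uintptr_t)p;
    return (size_t)((u >> 3) ^ (u >> 10) ^ (u >> 20));
}

/* buckets must be a power of two, at least 1 */
static toy_tbl *toy_new_n(size_t buckets)
{
    toy_tbl *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->buckets = calloc(buckets, sizeof *t->buckets);
    if (!t->buckets) { free(t); return NULL; }
    t->max = buckets - 1; t->items = 0; t->splits = 0;
    return t;
}

/* Double the bucket array and re-bucket every entry. */
static int toy_split(toy_tbl *t)
{
    size_t newsize = (t->max + 1) * 2, i;
    toy_ent **nb = calloc(newsize, sizeof *nb);
    if (!nb) return -1;
    for (i = 0; i <= t->max; i++) {
        toy_ent *e = t->buckets[i];
        while (e) {
            toy_ent *next = e->next;
            size_t h = toy_hash(e->key) & (newsize - 1);
            e->next = nb[h]; nb[h] = e;
            e = next;
        }
    }
    free(t->buckets);
    t->buckets = nb; t->max = newsize - 1; t->splits++;
    return 0;
}

static int toy_store(toy_tbl *t, const void *key, void *val)
{
    toy_ent *e = malloc(sizeof *e);
    size_t h;
    if (!e) return -1;
    h = toy_hash(key) & t->max;
    e->key = key; e->val = val;
    e->next = t->buckets[h]; t->buckets[h] = e;
    /* assumed growth rule: split once items exceed the mask */
    if (++t->items > t->max)
        return toy_split(t);
    return 0;
}

static void toy_free(toy_tbl *t)
{
    size_t i;
    if (!t) return;
    for (i = 0; i <= t->max; i++) {
        toy_ent *e = t->buckets[i];
        while (e) { toy_ent *next = e->next; free(e); e = next; }
    }
    free(t->buckets);
    free(t);
}

/* Store nitems distinct keys; return the number of doublings, or -1 on failure. */
static int splits_needed(size_t initial_buckets, size_t nitems)
{
    char *pool = malloc(nitems ? nitems : 1);  /* addresses serve as keys */
    toy_tbl *t = toy_new_n(initial_buckets);
    int splits = -1;
    size_t i = 0;
    if (pool && t) {
        while (i < nitems && toy_store(t, &pool[i], &pool[i]) == 0)
            i++;
        if (i == nitems) splits = (int)t->splits;
    }
    toy_free(t);
    free(pool);
    return splits;
}
```

In this model splits_needed(512, 10000) reports 5 doublings, splits_needed(8192, 10000) reports 1, and splits_needed(16384, 10000) reports none. It says nothing about timing, only about how much re-bucketing a presize can avoid.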
<Heresy>
It could go into 5.10.1
- .0 release reserved as unstable ?? (this is key to the argument)
- ptr-table code has been in core since 12/1999 (gsar)
- in use by Storable since 2006 at least
- already "IN" the API: at least Perl_ptr_table_new is there, without the _n
[jimc@harpo perl-5.10.0]$ grep ptr_table embed.fnc
Apa |PTR_TBL_t*|ptr_table_new
ApR |void* |ptr_table_fetch|NN PTR_TBL_t *tbl|NN const void *sv
Ap |void |ptr_table_store|NN PTR_TBL_t *tbl|NULLOK const void *oldsv|NN void *newsv
Ap |void |ptr_table_split|NN PTR_TBL_t *tbl
Ap |void |ptr_table_clear|NULLOK PTR_TBL_t *tbl
Ap |void |ptr_table_free|NULLOK PTR_TBL_t *tbl
sRn |PTR_TBL_ENT_t *|ptr_table_find|NN PTR_TBL_t *tbl|NN const void *sv
In fact, there's so much there already,
that we should just go ahead (in 5.12) and add
ptr_table_delete
ptr_table_foreach
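To make the two proposed additions concrete, here is a rough sketch of delete and foreach on a toy chained table. Everything in it is hypothetical: the demo_* names, the signatures, and the callback type are invented for illustration and are not the perl API or anything in the attached patch.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_BUCKETS 64  /* fixed power of two; growth omitted for brevity */

typedef struct demo_ent { struct demo_ent *next; const void *key; void *val; } demo_ent;
typedef struct { demo_ent *buckets[DEMO_BUCKETS]; size_t items; } demo_tbl;
typedef void (*demo_cb)(const void *key, void *val, void *ctx);

static size_t demo_hash(const void *p)
{
    uintptr_t u = (uintptr_t)p;
    return (size_t)((u >> 3) ^ (u >> 10)) & (DEMO_BUCKETS - 1);
}

static int demo_store(demo_tbl *t, const void *key, void *val)
{
    size_t h = demo_hash(key);
    demo_ent *e = malloc(sizeof *e);
    if (!e) return -1;
    e->key = key; e->val = val;
    e->next = t->buckets[h]; t->buckets[h] = e;
    t->items++;
    return 0;
}

/* Hypothetical delete: unlink and free the entry for key.
 * Returns the stored value, or NULL if the key is absent. */
static void *demo_delete(demo_tbl *t, const void *key)
{
    demo_ent **link = &t->buckets[demo_hash(key)];
    for (; *link; link = &(*link)->next) {
        if ((*link)->key == key) {
            demo_ent *dead = *link;
            void *val = dead->val;
            *link = dead->next;
            free(dead);
            t->items--;
            return val;
        }
    }
    return NULL;
}

/* Hypothetical foreach: call cb once per entry, in bucket order. */
static void demo_foreach(const demo_tbl *t, demo_cb cb, void *ctx)
{
    size_t i;
    for (i = 0; i < DEMO_BUCKETS; i++) {
        const demo_ent *e;
        for (e = t->buckets[i]; e; e = e->next)
            cb(e->key, e->val, ctx);
    }
}

/* Example callback: counts entries into the size_t that ctx points at. */
static void demo_count_cb(const void *key, void *val, void *ctx)
{
    (void)key; (void)val;
    ++*(size_t *)ctx;
}

/* Free every entry, leaving an empty table. */
static void demo_clear(demo_tbl *t)
{
    size_t i;
    for (i = 0; i < DEMO_BUCKETS; i++) {
        demo_ent *e = t->buckets[i];
        while (e) { demo_ent *next = e->next; free(e); e = next; }
        t->buckets[i] = NULL;
    }
    t->items = 0;
}
```

The callback must not store or delete while the walk is in progress; a real API would have to spell out that rule (or support it).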
The counter-argument
- ptr-tables were never intended for broad use in XS.
--- therefore no foreach, no delete
- 2**N gets pretty lumpy when N>20
- ptr-tables are currently completely invisible to perl code
- the lifetime of ptr-tables is currently ephemeral
--- caveat: I know squat about thread-cloning, despite building this way by default (someone's got to)
- Storable makes and destroys them before the function call returns (this is clear).
- exposing them to XS completely changes that environment
-- changes nothing, since it's already happened.
-- just because Storable doesn't persist tables over many perl statements doesn't mean someone else doesn't.
-- otoh, the lack of delete and foreach limits the utility of doing so.
Well, that should set it ablaze,
(which one should not do without a patch, attached)
please toss it on your personal firepit, see how it cooks.
Following with the Storable patch shortly.
thanks,
Jim Cromie