
[patch] add Perl_ptr_table_new(initsize) Was: 5.12 release/roadmap?

From: Jim Cromie
Date: June 2, 2009 16:14
Subject: [patch] add Perl_ptr_table_new(initsize) Was: 5.12 release/roadmap?
Message ID: cfe85dfa0906021614g747fda59n624a59dcfbee9f6f@mail.gmail.com

On Thu, May 28, 2009 at 7:40 AM, Richard Foley <Richard.Foley@rfi.net> wrote:

> Is there a 5.12 release plan?
>
> If so, what is it, please ?-)
>


My particular 5.12 itch:

1- add Perl_ptr_table_new_n(init_size) to the API
2- patch Storable to use it, if the Storable user wants to estimate the size
they need.

Here's why:

A. I've done 1 and 2 already.

1 is a simpler version of my recent RFC patch
http://www.nntp.perl.org/group/perl.perl5.porters/2009/05/msg146734.html
- it does initial size only: no growth, no PTR_TBL_t changes
- init_size is now a linear count, not the 2**N size of the table
-- easier to use, less prone to sucking up all the memory.
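
To make the "linear, not 2**N" point concrete, here is a minimal sketch of how a linear estimate could be turned into a power-of-two bucket count. It is not the code from the patch: presize_buckets() and its parameters are hypothetical names, and rounding up (rather than down) is a choice made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the patch: round a linear entry
 * estimate up to a power-of-two bucket count, clamped to the range
 * [min_buckets, max_buckets]. Both bounds must be powers of two. */
static size_t presize_buckets(size_t init_size,
                              size_t min_buckets, size_t max_buckets)
{
    size_t buckets = min_buckets;

    assert(min_buckets > 0 && (min_buckets & (min_buckets - 1)) == 0);
    assert(max_buckets >= min_buckets);

    /* Double until the estimate fits or the cap is reached. The cap is
     * what keeps a wild guess from eating all the memory. */
    while (buckets < init_size && buckets < max_buckets)
        buckets <<= 1;

    return buckets;
}
```

With a floor of 512 (assumed here to be the legacy default table size) and a cap of 2**24, an estimate of 10000 gives 16384 buckets, and an estimate of 50 stays at 512. The caller only ever passes a plain count, so a typo costs at most the cap.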

2 is my demo-only hack to Storable:
  the XS code reads $Storable::Size in init_store_context();
    if that's set to something, it's an active presize, and is used.
  at the end of freeze, before the ptr-table is freed,
    if the same $Storable::Size is set, we read the ptr-table and set
    $Storable::Sizeout.

3. My benchmark script shows:

[jimc@harpo foo1]$ ./perl -Ilib ../storit.pl -x100 -y100 -r10 -b -s32
# x*y -> guess 10000 ~~> 2**13

# 320000 -> 18.2877123795495
Benchmark: running   guess@2^18,   legacy@2^18 for at least 3 CPU seconds...
  guess@2^18:  4 wallclock secs ( 3.84 usr +  0.16 sys =  4.00 CPU) @ 9.50/s (n=38)
  legacy@2^18:  3 wallclock secs ( 3.04 usr +  0.06 sys =  3.10 CPU) @ 6.45/s (n=20)
                Rate   legacy@2^18    guess@2^18
  legacy@2^18 6.45/s            --          -32%
  guess@2^18  9.50/s           47%            --

# 640000 -> 19.2877123795495
Benchmark: running   guess@2^19,   legacy@2^19 for at least 3 CPU seconds...
  guess@2^19:  4 wallclock secs ( 3.11 usr +  0.12 sys =  3.23 CPU) @ 4.02/s (n=13)
  legacy@2^19:  3 wallclock secs ( 3.02 usr +  0.06 sys =  3.08 CPU) @ 2.60/s (n=8)
                Rate   legacy@2^19    guess@2^19
  legacy@2^19 2.60/s            --          -35%
  guess@2^19  4.02/s           55%            --

# 960000 -> 19.8726748802706
Benchmark: running   guess@2^19,   legacy@2^19 for at least 3 CPU seconds...
  guess@2^19:  3 wallclock secs ( 3.41 usr +  0.09 sys =  3.50 CPU) @ 2.57/s (n=9)
  legacy@2^19:  3 wallclock secs ( 3.31 usr +  0.06 sys =  3.37 CPU) @ 1.78/s (n=6)
                Rate   legacy@2^19    guess@2^19
  legacy@2^19 1.78/s            --          -31%
  guess@2^19  2.57/s           44%            --

# 1280000 -> 20.2877123795495
Benchmark: running   guess@2^20,   legacy@2^20 for at least 3 CPU seconds...
  guess@2^20:  3 wallclock secs ( 3.10 usr +  0.11 sys =  3.21 CPU) @ 1.87/s (n=6)
  legacy@2^20:  4 wallclock secs ( 3.48 usr +  0.06 sys =  3.54 CPU) @ 1.13/s (n=4)
                Rate   legacy@2^20    guess@2^20
  legacy@2^20 1.13/s            --          -40%
  guess@2^20  1.87/s           65%            --

# 1600000 -> 20.6096404744368
Benchmark: running   guess@2^20,   legacy@2^20 for at least 3 CPU seconds...
  guess@2^20:  4 wallclock secs ( 3.31 usr +  0.10 sys =  3.41 CPU) @ 1.47/s (n=5)
  legacy@2^20:  3 wallclock secs ( 3.09 usr +  0.06 sys =  3.15 CPU) @ 0.95/s (n=3)
            (warning: too few iterations for a reliable count)
              s/iter   legacy@2^20    guess@2^20
  legacy@2^20   1.05            --          -35%
  guess@2^20   0.682           54%            --

# 1920000 -> 20.8726748802706
Benchmark: running   guess@2^20,   legacy@2^20 for at least 3 CPU seconds...
  guess@2^20:  4 wallclock secs ( 3.38 usr +  0.10 sys =  3.48 CPU) @ 1.15/s (n=4)
  legacy@2^20:  4 wallclock secs ( 3.69 usr +  0.06 sys =  3.75 CPU) @ 0.80/s (n=3)
            (warning: too few iterations for a reliable count)
              s/iter   legacy@2^20    guess@2^20
  legacy@2^20   1.25            --          -30%
  guess@2^20   0.870           44%            --

# 2240000 -> 21.0950673016071
Benchmark: running   guess@2^21,   legacy@2^21 for at least 3 CPU seconds...
  guess@2^21:  4 wallclock secs ( 3.73 usr +  0.13 sys =  3.86 CPU) @ 1.04/s (n=4)
  legacy@2^21:  4 wallclock secs ( 3.28 usr +  0.07 sys =  3.35 CPU) @ 0.60/s (n=2)
            (warning: too few iterations for a reliable count)
              s/iter   legacy@2^21    guess@2^21
  legacy@2^21   1.67            --          -42%
  guess@2^21   0.965           74%            --

# 2560000 -> 21.2877123795495
Benchmark: running   guess@2^21,   legacy@2^21 for at least 3 CPU seconds...
  guess@2^21:  3 wallclock secs ( 3.32 usr +  0.10 sys =  3.42 CPU) @ 0.88/s (n=3)
            (warning: too few iterations for a reliable count)
  legacy@2^21:  4 wallclock secs ( 3.61 usr +  0.08 sys =  3.69 CPU) @ 0.54/s (n=2)
            (warning: too few iterations for a reliable count)
              s/iter   legacy@2^21    guess@2^21
  legacy@2^21   1.85            --          -38%
  guess@2^21    1.14           62%            --

# 2880000 -> 21.4576373809918
Benchmark: running   guess@2^21,   legacy@2^21 for at least 3 CPU seconds...
  guess@2^21:  4 wallclock secs ( 3.74 usr +  0.11 sys =  3.85 CPU) @ 0.78/s (n=3)
            (warning: too few iterations for a reliable count)
  legacy@2^21:  4 wallclock secs ( 3.89 usr +  0.08 sys =  3.97 CPU) @ 0.50/s (n=2)
            (warning: too few iterations for a reliable count)
              s/iter   legacy@2^21    guess@2^21
  legacy@2^21   1.99            --          -35%
  guess@2^21    1.28           55%            --

# 3200000 -> 21.6096404744368
Benchmark: running   guess@2^21,   legacy@2^21 for at least 3 CPU seconds...
  guess@2^21:  4 wallclock secs ( 4.24 usr +  0.12 sys =  4.36 CPU) @ 0.69/s (n=3)
            (warning: too few iterations for a reliable count)
  legacy@2^21:  5 wallclock secs ( 4.29 usr +  0.08 sys =  4.37 CPU) @ 0.46/s (n=2)
            (warning: too few iterations for a reliable count)
              s/iter   legacy@2^21    guess@2^21
  legacy@2^21   2.19            --          -33%
  guess@2^21    1.45           50%            --
# took 109 samples
#  avg size 1027112.93
1..1
ok 1


Those are non-trivial performance improvements.

- this is an application-relevant benchmark.
- Storable-heavy apps will benefit.
- Storable workloads are the poster child of unpredictability
-- except at the user/app level
-- many users will have predictable loads
-- estimable from the size of previously stored data.
- total reps are low (n=3), indicating where the time was spent.
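
For anyone reading the comparison tables: the percentage columns can be reproduced, approximately, from the printed rates. The sketch below is only that arithmetic; pct_vs() is a hypothetical name, and Benchmark.pm itself works from unrounded rates, so the last digit can differ.

```c
#include <assert.h>

/* Percentage by which rate `a` exceeds (positive) or trails (negative)
 * rate `b`. Hypothetical helper for reading the tables above. */
static double pct_vs(double a, double b)
{
    assert(b > 0.0);
    return (a / b - 1.0) * 100.0;
}
```

For the first run, pct_vs(9.50, 6.45) is about 47 and pct_vs(6.45, 9.50) is about -32. Those are the same ratio seen from both sides, which is why the two cells of each table never match in magnitude. With n as low as 2 or 3, each rate is itself a coarse estimate.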


For small ptr-tables, things are much tighter:
- no benefit to guessing below 2**9
-- this seems sensible: the alloc is still done, and 512-byte slabs are cheap.
-- most samples below 512 show a ~1-2% loss vs legacy
- for tables > 512, guessing looks about 10% faster at first (550, 600),
  though the larger samples below are noisy in both directions

[jimc@harpo foo1]$ ./perl -Ilib ../storit.pl -x5 -y10 -r20 -b5
# x*y -> guess 50 ~~> 2**5

# 50 -> 5.64385618977472
Benchmark: running   guess@2^5,   legacy@2^5 for at least 5 CPU seconds...
  guess@2^5:  5 wallclock secs ( 5.28 usr +  0.00 sys =  5.28 CPU) @ 26893.37/s (n=141997)
  legacy@2^5:  6 wallclock secs ( 5.24 usr +  0.00 sys =  5.24 CPU) @ 27479.01/s (n=143990)
                Rate    guess@2^5   legacy@2^5
  guess@2^5  26893/s           --          -2%
  legacy@2^5 27479/s           2%           --

# 100 -> 6.64385618977473
Benchmark: running   guess@2^6,   legacy@2^6 for at least 5 CPU seconds...
  guess@2^6:  5 wallclock secs ( 5.25 usr +  0.00 sys =  5.25 CPU) @ 21763.43/s (n=114258)
  legacy@2^6:  6 wallclock secs ( 5.24 usr +  0.01 sys =  5.25 CPU) @ 22122.10/s (n=116141)
                Rate    guess@2^6   legacy@2^6
  guess@2^6  21763/s           --          -2%
  legacy@2^6 22122/s           2%           --

# 150 -> 7.22881869049588
Benchmark: running   guess@2^7,   legacy@2^7 for at least 5 CPU seconds...
  guess@2^7:  5 wallclock secs ( 5.25 usr +  0.00 sys =  5.25 CPU) @ 18100.00/s (n=95025)
  legacy@2^7:  6 wallclock secs ( 5.22 usr +  0.00 sys =  5.22 CPU) @ 18547.32/s (n=96817)
                Rate    guess@2^7   legacy@2^7
  guess@2^7  18100/s           --          -2%
  legacy@2^7 18547/s           2%           --

# 200 -> 7.64385618977472
Benchmark: running   guess@2^7,   legacy@2^7 for at least 5 CPU seconds...
  guess@2^7:  7 wallclock secs ( 5.38 usr +  0.02 sys =  5.40 CPU) @ 15170.19/s (n=81919)
  legacy@2^7:  6 wallclock secs ( 5.24 usr +  0.01 sys =  5.25 CPU) @ 15999.81/s (n=83999)
                Rate    guess@2^7   legacy@2^7
  guess@2^7  15170/s           --          -5%
  legacy@2^7 16000/s           5%           --

# 250 -> 7.96578428466209
Benchmark: running   guess@2^7,   legacy@2^7 for at least 5 CPU seconds...
  guess@2^7:  6 wallclock secs ( 5.26 usr +  0.01 sys =  5.27 CPU) @ 12397.15/s (n=65333)
  legacy@2^7:  6 wallclock secs ( 5.27 usr +  0.01 sys =  5.28 CPU) @ 13882.39/s (n=73299)
                Rate    guess@2^7   legacy@2^7
  guess@2^7  12397/s           --         -11%
  legacy@2^7 13882/s          12%           --

# 300 -> 8.22881869049588
Benchmark: running   guess@2^8,   legacy@2^8 for at least 5 CPU seconds...
  guess@2^8:  6 wallclock secs ( 5.25 usr +  0.00 sys =  5.25 CPU) @ 12218.10/s (n=64145)
  legacy@2^8:  5 wallclock secs ( 5.27 usr +  0.01 sys =  5.28 CPU) @ 12373.67/s (n=65333)
                Rate    guess@2^8   legacy@2^8
  guess@2^8  12218/s           --          -1%
  legacy@2^8 12374/s           1%           --

# 350 -> 8.45121111183233
Benchmark: running   guess@2^8,   legacy@2^8 for at least 5 CPU seconds...
  guess@2^8:  5 wallclock secs ( 5.28 usr +  0.01 sys =  5.29 CPU) @ 10977.32/s (n=58070)
  legacy@2^8:  5 wallclock secs ( 5.31 usr +  0.00 sys =  5.31 CPU) @ 11142.37/s (n=59166)
                Rate    guess@2^8   legacy@2^8
  guess@2^8  10977/s           --          -1%
  legacy@2^8 11142/s           2%           --

# 400 -> 8.64385618977473
Benchmark: running   guess@2^8,   legacy@2^8 for at least 5 CPU seconds...
  guess@2^8:  6 wallclock secs ( 5.29 usr +  0.00 sys =  5.29 CPU) @ 10064.65/s (n=53242)
  legacy@2^8:  5 wallclock secs ( 5.23 usr +  0.01 sys =  5.24 CPU) @ 10160.69/s (n=53242)
                Rate    guess@2^8   legacy@2^8
  guess@2^8  10065/s           --          -1%
  legacy@2^8 10161/s           1%           --

# 450 -> 8.81378119121704
Benchmark: running   guess@2^8,   legacy@2^8 for at least 5 CPU seconds...
  guess@2^8:  6 wallclock secs ( 5.32 usr +  0.00 sys =  5.32 CPU) @ 8187.03/s (n=43555)
  legacy@2^8:  6 wallclock secs ( 5.27 usr +  0.01 sys =  5.28 CPU) @ 8249.05/s (n=43555)
               Rate    guess@2^8   legacy@2^8
  guess@2^8  8187/s           --          -1%
  legacy@2^8 8249/s           1%           --

# 500 -> 8.96578428466209
Benchmark: running   guess@2^8,   legacy@2^8 for at least 5 CPU seconds...
  guess@2^8:  5 wallclock secs ( 5.30 usr +  0.00 sys =  5.30 CPU) @ 7585.09/s (n=40201)
  legacy@2^8:  6 wallclock secs ( 5.33 usr +  0.00 sys =  5.33 CPU) @ 7684.62/s (n=40959)
               Rate    guess@2^8   legacy@2^8
  guess@2^8  7585/s           --          -1%
  legacy@2^8 7685/s           1%           --

# 550 -> 9.10328780841202
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  5 wallclock secs ( 5.33 usr +  0.01 sys =  5.34 CPU) @ 7817.79/s (n=41747)
  legacy@2^9:  6 wallclock secs ( 5.34 usr +  0.00 sys =  5.34 CPU) @ 7121.16/s (n=38027)
               Rate   legacy@2^9    guess@2^9
  legacy@2^9 7121/s           --          -9%
  guess@2^9  7818/s          10%           --

# 600 -> 9.22881869049588
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  5 wallclock secs ( 5.25 usr +  0.00 sys =  5.25 CPU) @ 7245.14/s (n=38037)
  legacy@2^9:  5 wallclock secs ( 5.24 usr +  0.00 sys =  5.24 CPU) @ 6647.71/s (n=34834)
               Rate   legacy@2^9    guess@2^9
  legacy@2^9 6648/s           --          -8%
  guess@2^9  7245/s           9%           --

# 650 -> 9.34429590791582
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  6 wallclock secs ( 5.38 usr +  0.01 sys =  5.39 CPU) @ 5193.32/s (n=27992)
  legacy@2^9:  6 wallclock secs ( 5.31 usr +  0.01 sys =  5.32 CPU) @ 6256.20/s (n=33283)
               Rate    guess@2^9   legacy@2^9
  guess@2^9  5193/s           --         -17%
  legacy@2^9 6256/s          20%           --

# 700 -> 9.45121111183233
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  5 wallclock secs ( 5.40 usr +  0.02 sys =  5.42 CPU) @ 6258.86/s (n=33923)
  legacy@2^9:  6 wallclock secs ( 5.02 usr +  0.01 sys =  5.03 CPU) @ 4685.09/s (n=23566)
               Rate   legacy@2^9    guess@2^9
  legacy@2^9 4685/s           --         -25%
  guess@2^9  6259/s          34%           --

# 750 -> 9.55074678538324
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  5 wallclock secs ( 5.23 usr +  0.04 sys =  5.27 CPU) @ 5943.26/s (n=31321)
  legacy@2^9:  6 wallclock secs ( 5.86 usr +  0.01 sys =  5.87 CPU) @ 5038.84/s (n=29578)
               Rate   legacy@2^9    guess@2^9
  legacy@2^9 5039/s           --         -15%
  guess@2^9  5943/s          18%           --

# 800 -> 9.64385618977473
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  5 wallclock secs ( 5.05 usr +  0.01 sys =  5.06 CPU) @ 5137.75/s (n=25997)
  legacy@2^9:  6 wallclock secs ( 5.28 usr +  0.02 sys =  5.30 CPU) @ 5325.28/s (n=28224)
               Rate    guess@2^9   legacy@2^9
  guess@2^9  5138/s           --          -4%
  legacy@2^9 5325/s           4%           --

# 850 -> 9.73131903102506
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  7 wallclock secs ( 6.81 usr +  0.00 sys =  6.81 CPU) @ 4262.85/s (n=29030)
  legacy@2^9:  6 wallclock secs ( 5.26 usr +  0.00 sys =  5.26 CPU) @ 5061.98/s (n=26626)
               Rate    guess@2^9   legacy@2^9
  guess@2^9  4263/s           --         -16%
  legacy@2^9 5062/s          19%           --

# 900 -> 9.81378119121704
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  7 wallclock secs ( 6.77 usr +  0.01 sys =  6.78 CPU) @ 3569.17/s (n=24199)
  legacy@2^9:  5 wallclock secs ( 5.23 usr +  0.00 sys =  5.23 CPU) @ 4240.54/s (n=22178)
               Rate    guess@2^9   legacy@2^9
  guess@2^9  3569/s           --         -16%
  legacy@2^9 4241/s          19%           --

# 950 -> 9.89178370321831
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  7 wallclock secs ( 6.72 usr +  0.00 sys =  6.72 CPU) @ 3408.18/s (n=22903)
  legacy@2^9:  6 wallclock secs ( 5.63 usr +  0.00 sys =  5.63 CPU) @ 3866.43/s (n=21768)
               Rate    guess@2^9   legacy@2^9
  guess@2^9  3408/s           --         -12%
  legacy@2^9 3866/s          13%           --

# 1000 -> 9.96578428466209
Benchmark: running   guess@2^9,   legacy@2^9 for at least 5 CPU seconds...
  guess@2^9:  5 wallclock secs ( 5.03 usr +  0.00 sys =  5.03 CPU) @ 3530.42/s (n=17758)
  legacy@2^9:  5 wallclock secs ( 5.22 usr +  0.00 sys =  5.22 CPU) @ 3922.22/s (n=20474)
               Rate    guess@2^9   legacy@2^9
  guess@2^9  3530/s           --         -10%
  legacy@2^9 3922/s          11%           --
# took 1216979 samples
#  avg size 439.16
1..1
ok 1



<Heresy>

It could go into 5.10.1:
- .0 releases are reserved as unstable?? (this is key to the argument)
- ptr-table code has been in core since 12/1999 (gsar)
- in use by Storable since at least 2006
- already "IN" the API: at least Perl_ptr_table_new is there, without the _n

[jimc@harpo perl-5.10.0]$ grep ptr_table embed.fnc
Apa    |PTR_TBL_t*|ptr_table_new
ApR    |void*    |ptr_table_fetch|NN PTR_TBL_t *tbl|NN const void *sv
Ap    |void    |ptr_table_store|NN PTR_TBL_t *tbl|NULLOK const void *oldsv|NN void *newsv
Ap    |void    |ptr_table_split|NN PTR_TBL_t *tbl
Ap    |void    |ptr_table_clear|NULLOK PTR_TBL_t *tbl
Ap    |void    |ptr_table_free|NULLOK PTR_TBL_t *tbl
sRn    |PTR_TBL_ENT_t *|ptr_table_find|NN PTR_TBL_t *tbl|NN const void *sv

In fact, there's so much there already
that we should just go ahead (in 5.12) and add
  ptr_table_delete
  ptr_table_foreach
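
To show roughly what those two would involve, here is a toy chained pointer table with a presized constructor, delete, and foreach. This is a sketch under stated assumptions, not perl's PTR_TBL_t code: every toy_* name is hypothetical, the toy never grows, and the hash is a placeholder.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy chained pointer table. Every toy_* name is hypothetical; this is
 * an illustration, not the PTR_TBL_t code in perl's sv.c. */
typedef struct toy_ent {
    struct toy_ent *next;
    const void     *key;
    void           *val;
} toy_ent;

typedef struct {
    toy_ent **buckets;
    size_t    max;      /* bucket count - 1; the count is a power of two */
} toy_tbl;

typedef void (*toy_visit)(const void *key, void *val, void *ctx);

/* Drop the low alignment bits, then mask into the bucket range. */
static size_t toy_slot(const toy_tbl *tbl, const void *key)
{
    return (size_t)(((uintptr_t)key >> 3) & tbl->max);
}

/* Presized constructor: bucket count is init_size rounded up to a
 * power of two, clamped to [8, 2**20]. The toy table never grows. */
static toy_tbl *toy_new_n(size_t init_size)
{
    size_t n = 8;
    toy_tbl *tbl = malloc(sizeof *tbl);
    if (!tbl) return NULL;
    while (n < init_size && n < ((size_t)1 << 20)) n <<= 1;
    tbl->buckets = calloc(n, sizeof *tbl->buckets);
    if (!tbl->buckets) { free(tbl); return NULL; }
    tbl->max = n - 1;
    return tbl;
}

/* Insert or overwrite; returns 0 on success, -1 if out of memory. */
static int toy_store(toy_tbl *tbl, const void *key, void *val)
{
    size_t i = toy_slot(tbl, key);
    toy_ent *e;
    for (e = tbl->buckets[i]; e; e = e->next)
        if (e->key == key) { e->val = val; return 0; }
    if (!(e = malloc(sizeof *e))) return -1;
    e->key = key; e->val = val;
    e->next = tbl->buckets[i];
    tbl->buckets[i] = e;
    return 0;
}

/* Unlink and free one entry; returns 1 if the key was present. */
static int toy_delete(toy_tbl *tbl, const void *key)
{
    toy_ent **pp = &tbl->buckets[toy_slot(tbl, key)];
    for (; *pp; pp = &(*pp)->next) {
        if ((*pp)->key == key) {
            toy_ent *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}

/* Call fn once per entry, in bucket order. fn must not modify the table. */
static void toy_foreach(const toy_tbl *tbl, toy_visit fn, void *ctx)
{
    size_t i;
    assert(fn != NULL);
    for (i = 0; i <= tbl->max; i++) {
        toy_ent *e;
        for (e = tbl->buckets[i]; e; e = e->next)
            fn(e->key, e->val, ctx);
    }
}

/* Example visitor: counts entries into the size_t that ctx points at. */
static void toy_count(const void *key, void *val, void *ctx)
{
    (void)key; (void)val;
    ++*(size_t *)ctx;
}

/* Release every entry, the bucket array, and the table itself. */
static void toy_free(toy_tbl *tbl)
{
    size_t i;
    if (!tbl) return;
    for (i = 0; i <= tbl->max; i++)
        while (tbl->buckets[i])
            toy_delete(tbl, tbl->buckets[i]->key);
    free(tbl->buckets);
    free(tbl);
}
```

A caller would create the table with toy_new_n(estimate), store pointer pairs, walk them with toy_foreach(tbl, toy_count, &n), and remove individual keys with toy_delete(). The open design question for a real foreach is what a callback may do to the table mid-walk; the toy simply forbids modification.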

The counter-argument:

- ptr-tables were never intended for broad use in XS.
-- therefore no foreach, no delete
- 2**N gets pretty lumpy when N>20
- ptr-tables are currently completely invisible to perl code
- the lifetime of ptr-tables is currently ephemeral
-- caveat: I know squat about thread-cloning, despite building this way by
   default (someone's got to)
- Storable makes and destroys them before the function call returns (this
  is clear).
- exposing them to XS completely changes that environment
-- changes nothing, since it's already happened.
-- just because Storable doesn't persist tables over many perl statements
   doesn't mean someone else doesn't.
-- otoh, the lack of delete and foreach limits the utility of doing so.


Well, that should set it ablaze
(which one should not do without a patch, attached).
Please toss it on your personal firepit and see how it cooks.
Following with the Storable patch shortly.

thanks,
Jim Cromie

