develooper Front page | perl.recdescent | Postings from November 2006

PRD parser expontentially slower with larger files

Thread Next
From:
Jan Sundberg
Date:
November 30, 2006 07:39
Subject:
PRD parser expontentially slower with larger files
Message ID:
456EBBFB.2070301@janne.nu
Hello.

I have a simple grammar (used in a POE Filter) like:

================
dictionary : '{' key_value(s /\|/) '}'
key_value : key '=' value
key : /[A-Za-z0-9_ ]+/
value : dictionary | string
string : /[^|\}]*/
================

with a valid test string like:

{reply_to={message_type=portfolio_get|private=QQQ}|portfolio_name=P|ordervalidate=OK}

which dumps like:

{
  'ordervalidate' => 'OK',
  'portfolio_name' => 'P',
  'reply_to' => {
    'message_type' => 'portfolio_get',
    'private' => 'QQQ'
  }
}



And a parser:

================
#!/usr/bin/perl

use strict;
use warnings;

use Parse::RecDescent;

$::RD_HINT = 1;
$::RD_ERRORS = 1;
$::RD_WARN = 1;

my $parser = new Parse::RecDescent(q{
	dictionary : '{' key_value(s /\|/) '}'
	{ $return = { map { $_->[0] => $_->[1] } @{$item[2]} } }
        key_value : key '=' value
	{ [ @item{qw(key value)} ] }
 	key : /[A-Za-z0-9_ ]+/
	value : dictionary | string
	string : /[^|\}]*/
    });

while (<>) {
    my $msg = $parser->dictionary($_);
#    print Dumper($msg);
}
==============





However, it gets very slow with larger files. If I generate data files with adjustable size with:

==============
#!/usr/bin/perl

use strict;
use warnings;

print "{reply_to={message_type=portfolio_get|private=PP}|portfolio_name=PP|ordervalidate=OK|portfolio_positions={";

print join "|", map "portfolio_position$_={instrument_id={instrument_tag=$_|market=MMMM|feedcode=FFFF|isincode=IIII|underlying=UUUU|kind=Spot|currency=CUR|multiplier=2|assettype=Equities}|volume=$_|invested=$_|accrued=$_|fee=$_|commission=$_|change_in_volume=$_|change_in_invested=$_|change_in_accrued=$_|change_in_fee=$_|change_in_commission=$_|currency=CUR}", 1 .. shift;

print "}}|error=0}";
==============



I get execution times like

size	filesiz	time
10	3398	.25
20	6788	.42
50	16958	1.17
100	33920	3.29
200	69020	10.9
300	104120	24.1
400	139220	42.0
500	174320	63.8
1000	349832	264
1500	531332	605

and these are kernel user time, not wallclock - it's eating all CPU a for a long time!
And the files aren't *that* big.

How can I get it faster ? It seems already simple enough. Please help!


Sincerely.

/ Jan Sundberg




Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About