
Re: Make built-in list functions continuous

From: wren argetlahm
Date: April 1, 2008 01:55
Subject: Re: Make built-in list functions continuous
Message ID: 47F1D7F5.2010502@cpan.org
Nicholas Clark wrote:
> At http://www.perlfoundation.org/perl5/index.cgi?gsoc2008_projects#make_built_in_list_functions_continuous
> I spot:
> 
>   Make built-in list functions continuous
> 
>   Functions like map often build up intermediate lists
>   unnecessarily, fix this.
>   Mentor: wren ng thornton
> 
> 
> I was wondering if you had more detail on this, as it's
>  not something currently in the core's TODO file:
> 
>   http://mirrors.develooper.com/perl/APC/perl-current/pod/perltodo.pod
> 
> 
> Nicholas Clark

It was mainly something I ran into while optimizing some data-mining 
code a little while back. The margin for speedups is quite dramatic for 
large arrays, and simple cases can be dealt with by fairly trivial 
rewrites. I'm not sure how much more involved a comprehensive solution 
would be; that depends on the approach taken (e.g. transforming the 
Perl code before running it, vs. rewriting the offending functions) as 
well as on how thoroughly we can ferret out everywhere it happens.

The general genre of this sort of optimization is called 
"deforestation". If you're familiar with Haskell or other ML-like 
languages, this paper[1] discusses a high-level example of the sort of 
thing involved. The details of their solution may not be very pertinent 
to the Perl interpreter, but they give an excellent background on the 
problem and on the known solutions.

The problem arises when you have a pipeline of functions and one 
function constructs an intermediate array which is then read and 
destroyed right away by the next function in the pipeline. A simple 
example is the following code, where the pipeline is formed by the (=) 
assignment operator and the `map` function:

     @xs = map { f($_) } @ys;

In that code, `map` will construct an array in memory, but when it 
returns it, the array will be copied cell by cell into a new array 
constructed as @xs. We can optimize this to remove that intermediate 
array by using the following code instead:

     @xs = ();
     push @xs, f($_) foreach @ys;

This version constructs each element and immediately pushes it through 
the "assignment", rather than building the whole list up in memory first.
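
If you want to see the difference concretely, here's a minimal 
benchmark sketch using the core Benchmark module. The workload f() is 
just a stand-in; the actual numbers will vary with the workload, the 
array size, and the perl build:

     use strict;
     use warnings;
     use Benchmark qw(cmpthese);

     my @ys = 1 .. 100_000;
     sub f { $_[0] * 2 }    # stand-in workload

     cmpthese(-3, {
         # builds the full intermediate list, then copies it into @xs
         map_copy  => sub { my @xs = map { f($_) } @ys },
         # pushes each element through as it's produced
         push_each => sub { my @xs; push @xs, f($_) foreach @ys },
     });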

A different approach would be to rewrite the original code to use a 
different (=) function which just makes @xs an alias to the 
intermediate array returned by map, rather than copying it. This would 
be along the lines of the lexical aliasing task in perltodo, but it 
doesn't generalize: with longer pipelines the intermediate array still 
needs to be pushed onto @_ for the next function in the chain:

     @xs = map {f($_)} grep {p($_)} @ys;

But this too can be rewritten as a foreach-loop:

     foreach (@ys) {
         push @xs, f($_) if p($_);
     }

We can transform the majority of the offending cases into 
foreach-loops, which capture the fact that these chains of functions 
really form a pipeline through which we're funneling the data streaming 
from @ys. But these rewrites aren't very amenable to the Filter 
modules, since the first arguments to `grep` and `map` are blocks, so:

     @xs = map  { f1($_); f2($_); ...; fn($_) }
           grep { p1($_); p2($_); ...; pn($_) } @ys;

...becomes:

     foreach (@ys) {
         p1($_);
         p2($_);
         ...
         if (pn($_)) {
             f1($_);
             f2($_);
             ...
             push @xs, fn($_);
         }
     }

Because of this ugliness, the rewrite needs to take place within the 
parsing/compilation pass in order to have access to the AST.
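
For reference, the opcode tree such a rewrite pass would need to walk 
can be dumped with the core B::Concise module, e.g.:

     perl -MO=Concise -e '@xs = map { f($_) } @ys'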

There are other cases where we can't perform this transformation at all, 
such as when we have multiple streams coming together and we need to 
combine them into a single stream to pass along:

     @xs = (map {f($_)} @ys, map {g($_)} @zs);

Granted, that's something of a contrived example, but you get the idea. 
There's also a related optimization which takes the composition of maps 
as the map of composition. I.e. these two lines are equivalent, but the 
latter is more efficient:

     @xs = map {f($_)} map {g($_)} @ys;
     @xs = map {f(g($_))} @ys;

(If you're familiar with Haskell, that's saying that "map f . map g == 
map (f . g)" and the latter is more efficient since it only iterates 
over the list once, as well as not constructing the intermediate list.)

The most general solution, which also captures the cases where we need 
to combine streams, is to reimplement the offending functions as 
"continuous functions" which explicitly take a stream data structure as 
input and return a stream as output. The trick is that, unlike normal 
arrays, these streams are lazily constructed, so a function can read 
one element at a time, process it, and write anything appropriate into 
its return stream before reading the next element. When the right-hand 
function writes to its return stream, the left-hand function gets 
notified and processes the elements right away; thus the only 
intermediate arrays built up are the size of each atomic write to the 
stream (usually just one element).
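
To make that concrete, here's a minimal sketch of the stream idea in 
plain Perl, assuming we represent a stream as a code ref that returns 
the next element on each call and the empty list once exhausted. The 
names (stream_from, smap, sgrep, drain) are made up for illustration; a 
real implementation would live inside the interpreter:

     use strict;
     use warnings;

     # Turn an array into a stream: a code ref yielding one element
     # per call, and the empty list once exhausted.
     sub stream_from {
         my @elems = @_;
         my $i = 0;
         return sub { $i < @elems ? $elems[$i++] : () };
     }

     # Stream version of map: pull one element, transform, hand it on.
     sub smap {
         my ($f, $in) = @_;
         return sub {
             my @x = $in->();
             return @x ? $f->($x[0]) : ();
         };
     }

     # Stream version of grep: keep pulling until an element passes.
     sub sgrep {
         my ($p, $in) = @_;
         return sub {
             while (my @x = $in->()) {
                 return $x[0] if $p->($x[0]);
             }
             return ();
         };
     }

     # Drain a stream into a real array; the only array ever built.
     sub drain {
         my ($in) = @_;
         my @out;
         while (my @x = $in->()) {
             push @out, $x[0];
         }
         return @out;
     }

     # @xs = map { $_ * 2 } grep { $_ % 2 } @ys; becomes:
     my @ys = 1 .. 10;
     my @xs = drain(smap(sub { $_[0] * 2 },
                    sgrep(sub { $_[0] % 2 }, stream_from(@ys))));
     # @xs is now (2, 6, 10, 14, 18), with no intermediate arrays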

One method of implementing continuous functions is to use a lazy array 
as mentioned above, though integrating this into an otherwise non-lazy 
language can require some thought. Another approach is to actually 
spawn threads and pass messages between them; while nicely parallel for 
multi-core folks, the overhead is usually too much unless the functions 
are doing some heavy computation (a thread pool can reduce the overhead 
somewhat). Yet another approach is to use continuation passing in order 
to capture the flow of control; how nice this approach is depends on 
the implementation of closures. And of course there are other 
approaches as well.
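
For a taste of the continuation-passing flavor, here's an equally 
hypothetical sketch where each stage is built by handing it the 
downstream "emit" continuation (cps_map and cps_grep are invented 
names; the closures stand in for what the interpreter would do 
natively):

     use strict;
     use warnings;

     # Each builder takes a transform/predicate plus the downstream
     # continuation, and returns a continuation to feed elements into.
     sub cps_map {
         my ($f, $emit) = @_;
         return sub { $emit->($f->($_[0])) };
     }

     sub cps_grep {
         my ($p, $emit) = @_;
         return sub { $emit->($_[0]) if $p->($_[0]) };
     }

     # @xs = map { $_ * 2 } grep { $_ % 2 } @ys; as a push pipeline.
     my @ys = 1 .. 10;
     my @xs;
     my $sink  = sub { push @xs, $_[0] };
     my $chain = cps_grep(sub { $_[0] % 2 },
                 cps_map( sub { $_[0] * 2 }, $sink));
     $chain->($_) for @ys;    # each element flows straight through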

Going all the way and implementing continuous functions would be 
awesome, but it'd also be a good deal of work which might be better 
spent elsewhere. Even just doing the rewrites can improve performance 
for data-heavy programs to 160~200% of the original speed (i.e. a 
60~100% speedup), even though it doesn't catch everything. For someone 
who knows how the parsing/compilation sweep works (or can learn it 
quickly) well enough to manipulate the AST/opcode tree, adding the 
rewrites should be about the right size for a Google Summer of Code 
project.


[1] http://www.cse.unsw.edu.au/~dons/papers/CSL06.html

-- 
Live well,
~wren
