develooper Front page | perl.perl6.language.data | Postings from September 2000

RFC 207 (v1) Array: Efficient Array Loops

From:
Perl6 RFC Librarian
Date:
September 8, 2000 15:40
Subject:
RFC 207 (v1) Array: Efficient Array Loops
Message ID:
20000908224027.21520.qmail@tmtowtdi.perl.org
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Array: Efficient Array Loops

=head1 VERSION

  Maintainer: Buddha Buck <bmbuck@14850.com>
  Date: 8 September 2000
  Mailing List: perl6-language-data@perl.org
  Number: 207
  Version: 1
  Status: Developing

=head1 ABSTRACT

This RFC proposes a notation for creating efficient implicit loops over
multidimensional arrays. It introduces the notation |i for an index
iterator for arrays, allowing temporary multidimensional arrays to be
created on the fly.

=head1 DESCRIPTION

Consider the problem of multiplying together two 2-dimensional tensors. In
standard notation, this would be symbolized by

   Cijkl = Aij * Bkl

where the letters i, j, k and l are written as subscripts and represent
the indices of their respective tensors. To accomplish that same
multiplication in Perl, one needs to write (using RFC 204 notation):

  for my $i = (0..2) {    #assuming 3x3 tensors
    for my $j = (0..2) {
      for my $k = (0..2) {
        for my $l = (0..2) {
            $c[[$i,$j,$k,$l]] = $a[[$i,$j]] * $b[[$k,$l]] ;
        }
      }
    }
  }

While this is not particularly difficult, it is clumsy, and slow. It could
be hidden in a subroutine, but then one lacks the flexibility to write
similar equations on the fly. For instance, transposition is easily
written as Tij = Aji. Such notation is assumed to be true for all legal
values of i and j -- in effect, it loops over all legal values of i and j
to compute the results. Furthermore, such notation along with appropriate
restrictions on use could allow Perl to create optimised loops.

This RFC proposes a similar notation, using |name as a notation for
looping indices like i and j above.

=head2 Syntax details

A set of iterators |i,|j, etc used as indices in an array operation would
implicitly create a loop around the smallest expression containing the
iterators, generating as a result a multidimensional array whose bounds
are determined by the ranges of the iterators and whose dimension is equal
to the total number of iterators (named or unnamed)..

Iterators are limited in scope to the current statement, and the effect of
an iterator is limited to the smallest expresion containing all iterators
within a statement.

The most general use of an iterator would be as |i=@list, which would
create a bounding loop of the form "for |i (@list) {...}" around the
expression, noting that |i will not take on any value that is
out-of-bounds for the matrices it indexes.

If no list is specified, (0..) is used (and will stop when the index is
out-of-bounds). This is probably the most common use of them.

Using Multiple iterators will in effect cause a nested loop, the nesting
order of the iterators determined by the relationships between the
iterators. One iterator's bounds may depend on another's value (such as
@mat[|i;|j=0..|i] returning the upper triangle of mat), but it is illegal
for there to be a circular dependency.

In expressions containing iterators and cartesian-product style matrix
slices (e.g., @matrix[|x;@y]), each explicit or implicit non-singleton
list acts as if it were an anonymous iterator using the explicit (or
implicit (0..)) list. This allows for expressions like the "general tensor
multiplication above".

It is illegal to use an iterator in an expression without using it
somewhere in the expression as a matrix index.

The value of an expression containing an iterator is an n-dimensional
matrix where n is the number of iterators (plus lists used in
cartesian-product slices), and whose elements are the values of the
expression evaluated at each allowable interator value.

Evaluation should be lazy, if possible.

=head2 Examples:

  @t[|i;|j] = @a[|j;|i];  # transpose 2-d @a

would be equivilant to:

  sub { 
    my @result; my $i; my $j
    for $i (0..) { # last if out-of-bounds
      for $j (0..) { # last if out-of-bounds
        @result[[$i,$j]] = ($t[[$i,$j]] = $a[[$j,$i]]);
      }
    }
    @result;
  }->();
  
  # tensor multiplication
  $c[|i;|j;|k;|l;|m] = $a[|i;|j] * $b[|k;|l;|m];
  # or
  @c = $a[|i;|j] * $b[|k;|l;|m];
  # or even
  @c = $a[|i;] * $b[|k;];  # general tensor multiplication
  
  # upper diagonal
  $upper[|i;|j] = $a[|i;|j=(0..|i)];
  
  #compute pairwise sum, pairwise product, pairwise difference...
  @sum = @a[|i;|j;|k;|l] + @b[|i;|j;|k;|l];
  @prod= @a[|i;|j;|k;|l] * @b[|i;|j;|k;|l];
  @diff= @a[|i;|j;|k;|l] - @b[|i;|j;|k;|l];
  
  #Generate multiplication table
  @multtable[|i;|j] = |i * |j;
  
  #print lots of stuff to the screen.
  sub foo { print join(',',@_),"\n"; return 0; }
  $zero[|i;|j;|k;|l;|m;|n;|o;|p] = foo(|i,|j,|k,|l,|m,|n,|o,|p);
    
  # Sneaky way to generate dot-product:
  my $dotproduct;
  { my @temp[|i] = $dotproduct += $a[|i] * $b[|i]; }
  
This would be equivalant to:

  my $dotproduct;
  my @temp :bounds(@#a);
  sub {
    my @result; my $i;
    for $i (0..) { last if $i > $#a[0];
       $result[$i] = ($temp[$i] = $dotproduct += $a[$i] * $b[$i]);
    }
    return @result;
  }->();
  
  #non-sneaky way to generate a dot-product:
  my $dotproduct = reduce {^_+^_} 0,$a[|i] * $b[|i];
  
=head1 IMPLEMENTATION

For non-lazy evaluation, converting the smallest expression containing the
iterators into an optimized version of the anonymous subroutines shown
above would be sufficient. Optimizations include: Eliminating result
values if not used (i.e., if the iterators are being used purely for
side-effect purposes); encoding internally to some non-interpreted looping
construct, etc. If special 'numeric functions' are provided in Perl, then
expressions with just unoverloaded operators and numeric functions could
be optimised into tight compiled loops, as occurs for example with
fromfunction() and ufuncs in Numeric Python:

  http://starship.python.net/~da/numtut/array.html#SEC8
  http://starship.python.net/~da/numtut/array.html#SEC13

For lazy evaluation, the value of the expression at any given set of
indices is easy to calculate. However the lazy evaluation mechanism works,
it can use this property to calculate the appropriate values.

=head1 REFERENCES

RFC 203: Notation for declaring and creating arrays

RFC 204: Notation for indexing arrays with an LOL as an index

RFC 205: New operator ';' for creating array slices




nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About