Re: Nested captures

From: Damian Conway
Date: May 9, 2005 05:52
Subject: Re: Nested captures
Message ID: 427F5CE9.4090801@conway.org

Autrijus wrote:

> /me eagerly awaits new revelation from Damian...

Be careful what you wish for. Here's draft zero. ;-)

Note that there may still be bugs in the examples, or even in the design.
@Larry has thrashed this through pretty carefully, and Patrick has implemented 
it for PGE, but it's 10.30 at night after a full day's teaching, so I may have 
transcribed the post-thrashing, post-implementation corrections incorrectly. %-)

Damian

-----cut----------cut----------cut----------cut----------cut----------cut-----

=head1 Perl 6 rules capturing semantics

=head2 Match objects

All match attempts--successful or not--against any rule, subrule, or
subpattern (see below) return an object of (or derived from) class
C<Match>. That is:

     $match_obj = $str ~~ /pattern/;
     say "Matched" if $match_obj;

In any code that is not nested inside a rule, this returned object is
also automagically assigned to the lexical C<$/> variable. That is:

     $str ~~ /pattern/;
     say "Matched" if $/;

In any code that is nested inside a rule, the C<$/> variable holds the
surrounding rule's nascent C<Match> object (which can be modified via the
internal C<$/>). For example:

     $str ~~ / foo                             # Match 'foo'
               { $/ = new Match: :str<bar> }   # But pretend we matched 'bar'
             /;

C<Match> objects have methods that provide additional information about
the match. For example:

     if m/ def <ident> <codeblock> / {
         say "Found sub def between index $/.from() and index { $/.to()-1 }";
     }

A C<Match> object can also be treated as a boolean, an integer, a
string, an array, or a hash. See below.


=head2 Match results

A failed match returns a C<Match> object whose boolean value is false, whose
integer value is zero, whose string value is C<"">, and whose array and hash
components are empty. For example:

     "bard" ~~ /food/;
     say "Poet inedible" unless $/;

A successful match returns a C<Match> object whose boolean value is
true, whose integer value is typically 1 (except under the C<:g> or
C<:x> flags; see L<Capturing from non-singular matches>), whose string
value is the complete substring that was matched by the entire rule,
whose array component contains all subpattern (unnamed) captures, and
whose hash component contains all subrule (named) captures. For example:

     if ($/) {
         $count += $/;
         say "Matched the substring: $/";
         say "Parens captured: @{$/}";
         say 'Subrules captured:';
         for %{$/}.kv -> $subrule_name, $substr {
             say "\t$subrule_name: $substr";
         }
     }
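
As a rough illustration of the five views described above (boolean,
integer, string, array, hash), here is a minimal Python sketch. Python's
C<re> module is only a stand-in for Perl 6 rules, and the C<MatchSketch>
class with its C<array> and C<hash> attributes is a hypothetical name
invented for this example, not part of any real API.

```python
import re


class MatchSketch:
    """Hypothetical wrapper giving a regex result the five views described above."""

    def __init__(self, m):
        self._m = m  # an re.Match, or None when the match failed

    def __bool__(self):
        return self._m is not None

    def __int__(self):
        return 1 if self._m is not None else 0

    def __str__(self):
        return self._m.group(0) if self._m is not None else ""

    @property
    def array(self):
        # Unnamed captures only, in order of appearance
        if self._m is None:
            return []
        named = set(self._m.re.groupindex.values())
        return [self._m.group(i)
                for i in range(1, self._m.re.groups + 1)
                if i not in named]

    @property
    def hash(self):
        # Named captures only
        return self._m.groupdict() if self._m is not None else {}


ok = MatchSketch(re.search(r"(\d+)-(?P<unit>[a-z]+)", "x 42-kg"))
bad = MatchSketch(re.search(r"food", "bard"))
print(bool(ok), int(ok), str(ok), ok.array, ok.hash)
print(bool(bad), int(bad), repr(str(bad)), bad.array, bad.hash)
```

Unlike the design above, a failed C<re> match is simply C<None>, which is
why the wrapper has to supply the false, zero, and empty values itself.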


=head2 Subpattern captures

Any part of a rule enclosed in capturing parentheses is called a
I<subpattern>. For example:


        #               subpattern
        #  _________________/\____________________
        # |                                       |
        # |       subpattern  subpattern          |
        # |          __/\__    __/\__             |
        # |         |      |  |      |            |
     m:w/ (I am the (walrus), ( khoo )**{2} kachoo) /;


Each subpattern in a rule produces a C<Match> object if it is
successfully matched. This object is assigned into the array inside the
C<Match> object belonging to the surrounding scope -- either the
C<Match> object of the innermost surrounding subpattern (if the
subpattern is nested) or else the C<Match> object of the rule itself.
These assignments to the array are, of course, undone if the subpattern
is backtracked out of.

For example, if the following pattern matched successfully:

        #                subpat-A
        #  _________________/\____________________
        # |                                       |
        # |         subpat-B  subpat-C            |
        # |          __/\__    __/\__             |
        # |         |      |  |      |            |
     m:w/ (I am the (walrus), ( khoo )**{2} kachoo) /;

then the C<Match> objects representing the matches made by subpat-B and
subpat-C would be successively assigned into the array inside subpat-A's
C<Match> object. Then subpat-A's C<Match> object would be assigned into the
array inside the C<Match> object for the entire rule (i.e. C<$/>'s array).

The array elements of a C<Match> object are referred to using either the
standard array access notation (e.g. C<$/[0]>, C<$/[1]>, C<$/[2]>, etc.)
or else via the corresponding lexically scoped numeric aliases (i.e.
C<$1>, C<$2>, C<$3>, etc.)

So:

     say "$/[1] found between $/[0] and $/[2]";

is the same as:

     say "$2 found between $1 and $3";

Note that the standard array access notation uses zero-based indices
(0,1,2...), whereas the corresponding numeric variables are
numbered by ordinal position (1,2,3...)

Since the array elements of the rule's C<Match> object (i.e. C<$/>)
store individual C<Match> objects representing the substrings that were
matched and captured by the first, second, third, etc. I<outermost>
(i.e. unnested) subpatterns, these elements can be treated like fully
fledged match results. For example:

     if m/ (\d\d\d\d)-(\d\d)-(\d\d) (BCE?|AD|CE)?/ {
         ($yr, $mon, $day) = ($1, $2, $3);    # Or: ($yr, $mon, $day) = $/[0..2]
         $era = $4 if $4;                     # Tests if 4th parens matched
         @datepos = ($1.from() .. $3.to()-1); # $1, $2, etc. are full Match objs
     }
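
The same index-versus-ordinal distinction exists in Python's C<re> module,
which makes for a convenient (if loose) analogy. The sketch below assumes
Python as a stand-in; the input string is made up for illustration.

```python
import re

m = re.search(r"(\d\d\d\d)-(\d\d)-(\d\d)(BCE?|AD|CE)?", "2005-05-09")

captures = m.groups()                   # zero-based, like $/[0], $/[1], ...
yr, mon, day = captures[0:3]
era = m.group(4)                        # ordinal, like $4
datepos = range(m.start(1), m.end(3))   # offsets from first capture to third

print(yr, mon, day, era, list(datepos))
```

One difference worth noting: an optional group that did not participate
is C<None> in Python, whereas the design above always supplies a
(possibly unsuccessful) C<Match> object.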


=head2 Nested subpattern captures

Nested subpatterns (i.e. nested capturing parens) are I<not> captured
directly into the array of the rule's C<Match> object. Instead, the
captures made by nested subpatterns appear in the array inside the
C<Match> object belonging to the surrounding subpattern. This is quite
different to Perl 5 semantics:

      # Perl 5...
      #
      # $1-----------------------------  $5---------  $6--------------------
      # |     $2--  $3---------------  | |          | |     $7--  $8------  |
      # |     |   | |         $4--   | | |          | |     |   | |       | |
      # |     |   | |         |   |  | | |          | |     |   | |       | |
     m/ ( The (\S+) (guy|gal|g(\S+)  ) ) (sees|calls) ( the (\S+) (gal|guy) ) /;


In Perl 6, nested parens produce properly nested captures:

      # Perl 6...
      #
      # $1-----------------------------  $2---------  $3--------------------
      # |     $1[0] $1[1]------------  | |          | |     $3[0] $3[1]---  |
      # |     |   | |       $1[1][0] | | |          | |     |   | |       | |
      # |     |   | |         |   |  | | |          | |     |   | |       | |
     m/ ( The (\S+) (guy|gal|g(\S+)  ) ) (sees|calls) ( the (\S+) (gal|guy) ) /;


This means that the internal structure of the arrays in a rule's final
C<Match> object mirrors (and preserves!) both the nesting structure of
subpatterns in the rule, and the dynamic structure of the hierarchical
way in which those subpatterns matched. This "reconstructability" can be
taken even further (see L<The C<:parsetree> flag> below).
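
To make the contrast concrete, the following Python sketch runs the same
pattern with flat, Perl 5 style numbering (which Python's C<re> shares) and
then rebuilds the nested layout by hand. The explicit C<\s> tokens, the
sample sentence, and the tuple-based tree are assumptions made for this
illustration only.

```python
import re

# Flat, Perl 5 style numbering: groups are counted by opening paren.
pattern = re.compile(
    r"""( The \s (\S+) \s (guy|gal|g(\S+)) ) \s
        (sees|calls) \s
        ( the \s (\S+) \s (gal|guy) )""",
    re.VERBOSE,
)
m = pattern.search("The young guy sees the old gal")


def g(n):
    return m.group(n)


# Hand-built nesting that mirrors the Perl 6 layout in the diagram above.
# Each node is (matched text, list of child nodes).
nested = [
    (g(1), [(g(2), []), (g(3), [(g(4), [])])]),   # $1, $1[0], $1[1], $1[1][0]
    (g(5), []),                                   # $2
    (g(6), [(g(7), []), (g(8), [])]),             # $3, $3[0], $3[1]
]

print(m.groups())
print(nested)
```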

There may also be shortcuts for accessing nested components of a subpattern,
specifically:

      # Perl 6...
      #
      # $1-----------------------------  $2---------  $3--------------------
      # |     $1.1  $1.2-------------  | |          | |     $3.1  $3.2----  |
      # |     |   | |        $1.2.1  | | |          | |     |   | |       | |
      # |     |   | |         |   |  | | |          | |     |   | |       | |
     m/ ( The (\S+) (guy|gal|g(\S+)  ) ) (sees|calls) ( the (\S+) (gal|guy) ) /;

but this has not yet been decided.


=head2 Quantified subpattern captures

If a subpattern is directly quantified using any quantifier -- except C<?>
or C<??> -- it no longer produces a single C<Match> object. Instead, it
produces an array of C<Match> objects, which will have been collected
from the sequence of individual matches made by the repeated subpattern.

Because a quantified subpattern returns an array of C<Match> objects,
the corresponding array element for the quantified capture will store an
array reference, rather than a single C<Match> object. For example:

         # $1       $2
     if m/ (\w+) \: (\w+ \s+)* / {
         say "Key was: $1";         # Unquantified subpat produces single Match
         say "Values were: @{$2}";  # Quantified subpat produces array of Matches
     }

Note that whether a quantified subpattern returns a single C<Match>
object, or an array of C<Match> objects is determined statically (by the
nature of the quantifier), not dynamically (by the actual number of
repetitions that occur in the match).
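
For comparison, engines with Perl 5 style captures keep only the final
repetition of a quantified group. The Python sketch below shows that
behaviour and then emulates the array-of-matches result by rescanning the
quantified region; the sample text and the rescanning approach are
assumptions for illustration, not how a rule engine would do it.

```python
import re

text = "colors:red green blue "
m = re.match(r"(\w+):(\w+\s+)*", text)

key = m.group(1)
last_only = m.group(2)     # a repeated group keeps only its final repetition

# Emulate the array of matches by rescanning the quantified region
values = re.findall(r"\w+\s+", text[m.end(1) + 1 : m.end()])

print(key, repr(last_only), values)
```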

If a subpattern is directly quantified using the C<?> or C<??> quantifier,
it produces a single C<Match> object. That object is "successful" if the
subpattern did match, and "unsuccessful" if it was skipped. That is:

     if m/ next (\w+)? if (.*) / {
         say "Found a 'next'";
         say "(targeted at $1)" if $1;
         say "Condition was: $2";
     }

Note that if a capture is quantified as optional in this way, a C<Match>
object is I<always> generated and assigned into the array inside the
surrounding scope's C<Match> object. This ensures that the index/ordinal of
subsequent subpatterns can still be determined statically.


=head2 Indirectly quantified subpattern captures

A subpattern may sometimes be nested inside a quantified non-capturing
structure:

      #       non-capturing    quantified
      #  __________/\_________  __/\__
      # |                     ||      |
      # |   $1         $2     ||      |
      # |  _^_      ___^___   ||      |
      # | |   |    |       |  ||      |
     m/ [ (\w+) \: (\w+ \s+)* ]**{2...} /

Non-capturing brackets I<don't> create a separate nested lexical scope,
so the two subpatterns inside them are actually still in the rule's top-
level scope. Hence their top-level designations: C<$1> and C<$2>. Such
subpatterns are called "indirectly quantified" subpatterns. In
Perl 5, any repeated captures of this kind:

      # Perl 5 equivalent...
     m/ (?: (\w+) \: (\w+ \s+)* ){2,} /x

would overwrite the previous captures to C<$1> and C<$2> each time the
surrounding non-capturing parens iterated. So C<$1> and C<$2> would
contain only the captures from the final repetition.

This does not happen in Perl 6. Any indirectly quantified subpattern is
treated like a directly quantified subpattern. Specifically, an
indirectly quantified subpattern also returns an array of C<Match>
objects, so the corresponding array element for the indirectly
quantified capture will store an array reference, rather than a single
C<Match> object.

     if m/ [ (\w+) \: (\w+ \s+)* ]**{2...} / {
         say "Keys were: @{$1}";
         say "Values were: @{$2}";
     }
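
The overwriting behaviour, and the flattened arrays that replace it, can
be sketched in Python (whose C<re> module follows the Perl 5 convention
here). The sample text and the two-pass emulation are assumptions made
for this example.

```python
import re

text = "a:x y b:z w "

# Perl 5 style: each pass through the outer group overwrites groups 1 and 2
m = re.fullmatch(r"(?:(\w+):(\w+\s+)*){2,}", text)
final_key, final_value = m.group(1), m.group(2)

# Emulating the Perl 6 result: one list per subpattern, covering all passes
passes = re.findall(r"(\w+):((?:\w+\s+)*)", text)
keys = [k for k, _ in passes]
values = [v for _, vs in passes for v in re.findall(r"\w+\s+", vs)]

print(final_key, repr(final_value))
print(keys, values)
```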

Remember though that, if the outer quantified structure is a I<capturing>
structure (i.e. a subpattern) then it I<will> introduce a nested
lexical scope. That outer quantified structure will then
return an array of C<Match> objects representing the captures
of the inner parens for I<every> iteration (as described above).

Whereas using non-capturing parentheses for the outer quantifier causes
all of the inner subpatterns to flatten their captures into C<$1> and
C<$2>, using capturing parentheses for the outer quantifier retains the
internal match structure of each repetition. That is:


         #           $/[0]
         #  __________/\_________
         # |                     |
         # | $/[0][0]  $/[0][1]  |
         # |  _^_      ___^___   |
         # | |   |    |       |  |
     if m/ ( (\w+) \: (\w+ \s+)* )**{2...} / {

         # Outer subpattern ($/[0]) quantified, so $1 contains an array.
         # Let's iterate it...
         for @{$1}.kv -> $i, $inner_subpatterns {

             # First inner subpattern ($/[0][0]) is unquantified, so it
             # produces a single Match...
             say "Key $i was: $inner_subpatterns[0]";

             # Second inner subpattern ($/[0][1]) is quantified, so it
             # produces an array of Matches...
             say "Values $i were: @{$inner_subpatterns[1]}";
         }
     }
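
The shape of the result when the outer structure is capturing can be
sketched as a list with one entry per repetition, each entry keeping its
own key and its own list of values. This Python fragment only emulates
that shape; the dictionary keys C<key> and C<values> and the sample text
are hypothetical.

```python
import re

text = "a:x y b:z w "

# One entry per repetition of the outer capturing group, each keeping
# its own key and its own list of values
repetitions = [
    {"key": key, "values": re.findall(r"\w+\s+", vals)}
    for key, vals in re.findall(r"(\w+):((?:\w+\s+)*)", text)
]

for i, rep in enumerate(repetitions):
    print(f"Key {i} was: {rep['key']}")
    print(f"Values {i} were: {rep['values']}")
```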


=head2 Subpattern numbering

As the previous sections explained, the index/ordinal of a given subpattern
can always be statically determined. However, this does not mean that they
have to be monotonically increasing. Indeed, the hierarchical nature of nested
Perl 6 subpatterns already ensures that this is not the case.

But even when there is no nesting of subpatterns it can be much more useful
not to number all top-level subpatterns sequentially, as Perl 5 does:

     # Perl 5...
                   # $1      $2    $3   $4    $5           $6
     $tune_up5 = qr/ (don't) (ray) (me) (for) (solar tea), (d'oh!)
                   # $7      $8      $9    $10       $11
                   | (every) (green) (BEM) (devours) (faces)
                   /x;

Specifically, there are significant advantages to numbering the
subpatterns in each branch of an alternation (i.e. on either side of a
C<|>) independently, restarting the numbering at the beginning of each
branch. And this is precisely what Perl 6 does:

     # Perl 6...
                   # $1      $2    $3   $4    $5           $6
     $tune_up6 = rx/ (don't) (ray) (me) (for) (solar tea), (d'oh!)
                   # $1      $2      $3    $4        $5
                   | (every) (green) (BEM) (devours) (faces)
                   /;

In other words, unlike in Perl 5, in Perl 6 C<$1> doesn't represent the
capture made by the first subpattern that appears in the rule; it
represents the capture made by the first subpattern of whichever
alternative actually matched.

And that is extremely useful because it means that the array inside C<$/>
will not contain large numbers of leading C<undef> values
corresponding to unmatched subpatterns from failed alternatives:

     # Perl 5...
       @captures = $EGBDF =~ $tune_up5;

     # @captures is assigned: ( (undef)x6, qw(every green BEM devours faces) )

Instead, only the "meaningful" subpattern captures are returned:

     # Perl 6...
       @captures = $EGBDF ~~ $tune_up6;

     # @captures is assigned: <every green BEM devours faces>
     # (no leading undefs)
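
The padding effect is easy to reproduce in Python, whose C<re> module
numbers groups sequentially across alternatives as Perl 5 does. The second
half of this sketch approximates per-branch numbering by trying each
branch as its own pattern; that is an emulation chosen for this example,
not a description of how a rule engine implements it.

```python
import re

text = "every green BEM devours faces"

tune_up5 = re.compile(
    r"(don't) (ray) (me) (for) (solar tea), (d'oh!)"
    r"|(every) (green) (BEM) (devours) (faces)"
)
flat = tune_up5.fullmatch(text).groups()   # padded with None for branch one

# Trying the branches one at a time restarts the numbering for each branch
branches = [
    r"(don't) (ray) (me) (for) (solar tea), (d'oh!)",
    r"(every) (green) (BEM) (devours) (faces)",
]
per_branch = next(
    bm.groups()
    for bm in (re.fullmatch(b, text) for b in branches)
    if bm
)

print(flat)
print(per_branch)
```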

A more common example is likely to be a series of alternative commands:

     $cmd ~~ m:w/ (put)  (\S+) in   (\S+)
                | (get)  (\S+) from (\S+)
                | (save) (\S+) to   (\S+)
                / or next;

     ($cmd, $item, $location) = ($1, $2, $3);


Of course, the leading C<undef>s that Perl 5 would produce do convey
(albeit awkwardly) which alternative actually matched. If that
information is important, Perl 6 has several far cleaner ways to
preserve it. For example:

     rule alt (Str $n) { {$/ = $n} }

     m/ <alt tea>  (don't) (ray) (me) (for) (solar tea), (d'oh!)
      | <alt BEM>  (every) (green) (BEM) (devours) (faces)
      /;

     if ($/) {
         given $<alt> {
             when 'tea' { say "I hate solar tea" }
             when 'BEM' { say "I love bug-eyed monsters" }
         }
     }
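
A loosely comparable trick in Python is to name each alternative and then
ask which name actually captured something. The pattern, the branch names,
and the C<which_alternative> helper below are all hypothetical, and the
mechanism differs from the C<alt> rule shown above.

```python
import re

pattern = re.compile(r"(?P<tea>solar tea)|(?P<BEM>bug-eyed monsters?)")


def which_alternative(text):
    """Return the name of the alternative that matched, or None."""
    m = pattern.search(text)
    if m is None:
        return None
    return next(name for name, value in m.groupdict().items()
                if value is not None)


print(which_alternative("I hate solar tea"))
print(which_alternative("I love bug-eyed monsters"))
```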


It's even possible to mimic the monotonic Perl 5 semantics. See
L<Numbered scalar aliasing> below for details.


=head2 Subrule captures

Any call to a named rule within a pattern is known as a I<subrule>.

Any bracketed construct that is aliased (see L<Aliasing>) to a
named variable is also a subrule.

For example, this rule contains three subrules:

      # subrule       subrule      subrule
      #  __^__    _______^______    __^__
      # |     |  |              |  |     |
     m/ <ident>  $<spaces>:=(\s*)  <digit>+ /

Just like subpatterns, each successfully matched subrule within a rule
produces a C<Match> object. But, unlike subpatterns, that C<Match>
object is assigned to an entry of a hash. Specifically, to an entry of
the hash inside the C<Match> object corresponding to the innermost
surrounding rule or subpattern. For example:

      #  .... $/ ......................................
      # :                                              :
      # :              .......... $/[0] ............   :
      # :             :                             :  :
      # : $/<ident>   :        $/[0]<ident>         :  :
      # :   __^__     :           __^__             :  :
      # :  |     |    :          |     |            :  :
     m:w/  <ident> \: ( known as <ident> previously )? /


The hash entries of a C<Match> object are referred to using any of the
standard hash access notations (C<$/{'foo'}>, C<< $/<bar> >>, C<$/«baz»>,
etc.), or else via corresponding lexically scoped aliases (C<< $<foo> >>,
C<$«bar»>, C<< $<baz> >>, etc.)  So the previous example also implies:

      #    $<ident>             $1<ident>
      #     __^__                 __^__
      #    |     |               |     |
     m:w/  <ident> \: ( known as <ident> previously )? /


In other words, the hash elements of a rule's C<Match> object store nested
C<Match> objects, each of which represents a substring matched-and-captured by
a named subrule call (or by a capture that was aliased to a name using the
C<< $<name>:= >> syntax). For example:

     if m/ (<YYYY>)-(<MM>)-(<DD>) $<ERA>:=(BCE?|AD|CE)?/ {
         ($year, $month, $day) = ($<YYYY>, $<MM>, $<DD>);
         $era   = $<ERA> if $<ERA>;
         @indices = ($<YYYY>.from() .. $<DD>.to()-1);
     }
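
Named groups in Python's C<re> module give a rough feel for hash-style
access to captures. In this sketch the named groups stand in for the
C<< <YYYY> >>, C<< <MM> >>, and C<< <DD> >> subrules, whose definitions are
assumed to be simple digit runs; the input string is made up.

```python
import re

date = re.compile(
    r"(?P<YYYY>\d{4})-(?P<MM>\d\d)-(?P<DD>\d\d)(?P<ERA>BCE?|AD|CE)?"
)
m = date.search("released 2005-05-09")

year, month, day = m["YYYY"], m["MM"], m["DD"]
era = m["ERA"]                                  # None: optional group skipped
indices = range(m.start("YYYY"), m.end("DD"))   # offsets spanned by the date

print(year, month, day, era, indices)
```

Note that a Python named group also occupies a numbered slot, whereas in
the design above a subrule capture goes only into the hash.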

Note that it makes no difference whether the subrule is angle-bracketed (like
C<< <YYYY> >>) or aliased (like C<< $<ERA>:= >>). The name's the thing.


=head2 Repeated captures of the same subrule

If a subrule appears two (or more) times in the same lexical scope
within a rule (i.e. within the same subpattern and alternation), or if
the subrule is quantified anywhere within the rule (except with C<?>
or C<??>), then its corresponding hash entry no longer stores a
C<Match> object.

Instead, just like a quantified subpattern, a directly quantified,
indirectly quantified, or explicitly repeated subrule results in an
array of C<Match> objects. Successive matches of the subrule (whether
from separate calls, or from a quantified repetition) append their
individual C<Match> objects to this array. For example, with two or more
subrules of the same name, the corresponding hash entry contains a
reference to an array, which in turn contains the individual C<Match>
objects from each subrule match:
objects from each subrule match:

     if m:w/ mv <file> <file> / {
         $from = $<file>[0];
         $to   = $<file>[1];
     }

Likewise, with an indirectly quantified subrule:

     if m:w/ mv [ <file> ]**{2} / {
         $from = $<file>[0];
         $to   = $<file>[1];
     }

Likewise, with both repetition and quantification:

     if m:w/ mv [ <file> ]+ <file> / {
         $to   = pop @{$<file>};
         @from = @{$<file>};
     }

Note that it is always possible to determine statically whether a particular
hash entry in C<$/> will be a scalar, or an array reference, simply by
counting the number of occurrences of the subrule in each lexical scope.
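
By way of contrast, Python's C<re> module refuses a repeated group name
altogether, so the "array of matches" has to be assembled by hand. The
sketch below assumes C<\S+> as a stand-in for the C<< <file> >> subrule
and emulates the last example above (repetition plus quantification).

```python
import re

FILE = r"\S+"   # hypothetical stand-in for the <file> subrule

# Python rejects a repeated group name outright
try:
    re.compile(rf"mv (?P<file>{FILE}) (?P<file>{FILE})")
    duplicate_allowed = True
except re.error:
    duplicate_allowed = False

# Emulating  mv [ <file> ]+ <file>  with one list for every <file> match
m = re.fullmatch(rf"mv\s+((?:{FILE}\s+)+)({FILE})", "mv a.txt b.txt backup")
files = re.findall(FILE, m.group(1)) + [m.group(2)]
to = files.pop()
sources = files

print(duplicate_allowed, sources, to)
```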

However, if a subrule is explicitly renamed (or aliased -- see L<Aliasing>),
then only the "final" name counts when deciding whether it is or isn't
repeated. For example:

     rule dir := rule file;

     if m:w/ mv <file> <dir> / {    # Only one occurrence of <file>, so scalar
         $from = $<file>;
         $to   = $<dir>;
     }


Likewise, I<none> of the following constructions cause C<< <file> >> to
produce an array of C<Match> objects, since in none of them are there
two or more C<< <file> >> subrules in the same lexical scope:

     if m:w/ (keep) <file> | (toss) <file> / {  # Each <file> is in a separate
                                                # alternation, hence not
                                                # repeated in any one scope
         $action = $1;
         $target = $<file>;
     }

     if m:w/ <file> \: (<file>|none)? / {  # Second <file> nested in subpattern
                                           # which confers different scope
         $actual  = $/<file>;
         $virtual = $/[0]<file> if $/[0]<file>;
     }

On the other hand, unaliased square brackets don't confer a separate
scope (because they don't have an associated C<Match> object). So:

     if m:w/ <file> \: [<file>|none]? / {       # Second <file> in same scope
         $actual  = $/<file>[0];
         $virtual = $/<file>[1] if $/<file>[1];
     }


=head2 Aliasing

Aliases can be named or numbered; may be scalar-, array-, or hash-like;
and may be applied to either capturing or non-capturing constructs.
The following sections explain the semantics of each of those dozen
combinations.


=head3 Named scalar aliases applied to non-capturing brackets

If a named scalar alias is applied to a set of non-capturing brackets:

        #          ___/non-capturing brackets\__
        #         |                             |
        #         |                             |
     m:w/ $<key>:=[ (<[A-E]>) (\d**{3..6}) (X?) ] /;

then the corresponding entry in the rule's hash is assigned a C<Match> object
whose:

=over

=item *

Boolean value is true,

=item *

Integer value is 1,

=item *

String value is the complete substring matched by the contents of the square
brackets,

=item *

Array and hash are both empty.

=back

This last outcome (the empty hash and array) might be surprising, but
it's a natural consequence of the fact that square brackets do not
create a nested lexical scope, so any subpattern or subrule captures
within the square brackets are in the rule's lexical scope, not in that
of the alias. Consequently, any subpatterns or subrules in the square
brackets still I<do> set the appropriate hash or array entries, but they
set the appropriate hash or array entries of the rule's C<Match> object,
not the C<Match> object of the alias.

That means, if the above example matches successfully:

=over

=item *

C<< $/<key> >> will contain the complete substring matched by the square
brackets (in a C<Match> object, as described above),

=item *

C<< $/[0] >> will contain the A-E letter,

=item *

C<< $/[1] >> will contain the digits,

=item *

C<< $/[2] >> will contain the optional X.

=back


=head3 Named scalar aliasing to subpatterns

On the other hand, if a named scalar alias is applied to a set of
I<capturing> parens:

        #          ______/capturing parens\_____
        #         |                             |
        #         |                             |
     m:w/ $<key>:=( (<[A-E]>) (\d**{3..6}) (X?) ) /;

then the capturing parens no longer capture into the array of the rule's
C<Match> object (like unadorned parens would). Instead the aliased parens
capture into the hash of the C<Match> object; specifically into the hash
element whose key is the alias name.

So, in the above example, a successful match sets
C<< $<key> >> (i.e. C<< $/<key> >>), but I<not> C<$1> (i.e. not C<< $/[0] >>).

Another way to think about it is that aliased parens create a kind of
lexically scoped named subrule; that the contents of the brackets are
treated as if they were part of a separate subrule whose name is the
alias. That is, the above example is exactly equivalent to:

     rule key { (<[A-E]>) (\d**{3..6}) (X?) }
     m:w/ <key> /;

Specifically, after either version matches:

=over

=item *

C<< $/<key>[0] >> will contain the A-E letter (in a C<Match> object, of course),

=item *

C<< $/<key>[1] >> will contain the digits,

=item *

C<< $/<key>[2] >> will contain the optional X.

=back
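
The difference between aliasing square brackets and aliasing parens comes
down to where the inner captures end up. The Python sketch below lays out
the two resulting shapes side by side using plain dictionaries; the
C<hash>, C<array>, and C<str> keys and the sample input are hypothetical
names chosen for this illustration.

```python
import re

m = re.fullmatch(r"(([A-E])(\d{3,6})(X?))", "B1234X")
whole, letter, digits, x = m.groups()

# Alias on non-capturing brackets: the alias holds only the substring,
# and the inner captures stay in the rule's own array
bracket_alias = {
    "hash": {"key": {"str": whole, "array": []}},
    "array": [letter, digits, x],
}

# Alias on capturing parens: the inner captures move under the alias
paren_alias = {
    "hash": {"key": {"str": whole, "array": [letter, digits, x]}},
    "array": [],
}

print(bracket_alias)
print(paren_alias)
```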

Note that only aliased parens have this "on-the-fly-subrule" effect.
Aliased square brackets (as explained in L<Named scalar aliases applied
to non-capturing brackets>) only capture the substring the square
brackets matched; any internal captures proceed exactly as they
would if the alias were not there.

This can provide a handy optimization when calling a subrule. If only the
complete substring to be matched is of interest, rather than the full
hierarchical capture information, then a pattern like:

     m/ <XML_file> /

(which presumably does a large amount of hierarchical capturing and
returns a very complex set of nested C<Match> objects), could be rewritten:

     m/ $<XML_str>:=[«XML_file»] /

instead. Here the C<< <XML_file> >> subrule is called using double angle
brackets, which calls it as a non-capturing subrule. It still matches the same
substring, of course, which is then captured by the C<< $<XML_str>:= >> alias.

Note too that, because a subrule call like C<«XML_file»> is a bracketed
non-capturing construct, it obeys the rules for C<[...]> (as described in
L<Named scalar aliases applied to non-capturing brackets>), so the above
optimization could just be written:

     m/ $<XML_str>:=«XML_file» /


=head3 Named scalar aliasing to subrules

An unaliased capturing subrule assigns its C<Match> object to the hash
entry whose key is the name of the subrule:

     if m:/ ID\: <ident> / {
         say "Identified as $/<ident>";
     }

But if a subrule is aliased, it assigns its C<Match> object to the hash entry
whose key is the name of the alias instead. And, more importantly, it
I<doesn't> assign anything to the hash entry whose key is the subrule
name. That is:

     if m:/ ID\: $<id>:=<ident> / {
         say "Identified as $/<id>";    # and $/<ident> is undefined
     }

Hence aliasing a subrule I<changes> the destination of the subrule's C<Match>
object. This is particularly useful for differentiating two or more calls to
the same subrule in the same scope. For example:

     if m:w/ mv <file> $<dir>:=<file> / {
         $from = $<file>;
         $to   = $<dir>;
     }

In this example, the final match of the C<< <file> >> subrule is not appended
onto an array in C<< $/<file> >>, but is assigned to the hash element
corresponding to the alias name: C<< $/<dir> >>.


=head3 Numbered scalar aliasing

If a numbered alias is used instead of a named alias:

     m/ $2:=(<-[:]>*) \:  $1:=<ident> /

the behaviour is exactly the same as for a named alias, except that the
resulting C<Match> object is assigned to the corresponding element of
the appropriate array, rather than to an element of the hash.

For example:

     m:w/ $1:=[ (<[A-E]>) (\d**{3..6}) (X?) ] /;
     # $/[0] contains a match object storing the complete substring
     # matched by the square brackets


     m:w/ $2:=( (<[A-E]>) (\d**{3..6}) (X?) ) /;
     # $/[1] contains the match object returned by the outer subpattern


     if m:/ ID\: $3:=<ident> / {
         say "Identified as $3";    # and $/<ident> is undefined
     }

The only additional behaviour is that, if any numbered alias is used, the
numbering of subsequent unaliased subpatterns in the same scope automatically
increments from that alias number (much like enum values increment from
the last explicit value). That is:

      #  ---$2---    -$3-    ---$7---    -$8-
      # |        |  |    |  |        |  |    |
     m/ $2:=(food)  (bard)  $7:=(bazd)  (quxd) /;
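
The numbering rule is the same one enums use, and can be sketched in a few
lines of Python. The C<number_subpatterns> function is a hypothetical
helper written for this example; it takes one entry per subpattern in a
scope, an integer for an explicit C<$N:=> alias or C<None> for an
unaliased subpattern.

```python
def number_subpatterns(aliases):
    """Assign capture numbers; None means "unaliased, continue counting"."""
    numbers = []
    current = 0
    for alias in aliases:
        current = alias if alias is not None else current + 1
        numbers.append(current)
    return numbers


# $2:=(food)  (bard)  $7:=(bazd)  (quxd)
print(number_subpatterns([2, None, 7, None]))
```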


This behaviour is particularly useful for reinstituting Perl 5 semantics
for consecutive subpattern numbering in alternations:

     $tune_up6 = rx/ (don't) (ray) (me) (for) (solar tea), (d'oh!)
                   | $7:=(every) (green) (BEM) (devours) (faces)
                   #             $8      $9    $10       $11
                   /;

It also provides an easy way in Perl 6 to reinstitute the unnested
numbering semantics of nested Perl 5 subpatterns:

      # Perl 5...
      #               $1
      #  _____________/\______________
      # |    $2          $3       $4  |
      # |  __/\___   ____/\____   /\  |
      # | |       | |          | |  | |
     m/ ( (<[A-E]>) (\d**{3..6}) (X?) ) /;


      # Perl 6...
      #               $1
      #  _____________/\______________
      # |  $1[0]       $1[1]    $1[2] |
      # |  __/\___   ____/\____   /\  |
      # | |       | |          | |  | |
     m/ ( (<[A-E]>) (\d**{3..6}) (X?) ) /;


      # Perl 6 simulating Perl 5...
      #                 $1
      #  _______________/\________________
      # |        $2          $3       $4  |
      # |      __/\___   ____/\____   /\  |
      # |     |       | |          | |  | |
     m/ $1:=[ (<[A-E]>) (\d**{3..6}) (X?) ] /;

The non-capturing brackets don't introduce a scope, so the subpatterns within
them are at rule scope, and hence numbered at the top level. Aliasing the
square brackets to C<$1> means that the next subpattern at the same level
(i.e. the C<< (<[A-E]>) >>) is numbered sequentially (i.e. C<$2>), etc.


=head3 Scalar aliases applied to quantified constructs

All of the above semantics apply equally to aliases which are applied to
quantified structures. The only difference is that, if the aliased construct
is a subrule or subpattern, that quantified subrule or subpattern will have
returned an array of C<Match> objects (as described in L<Quantified
subpattern captures> and L<Repeated captures of the same subrule>). So
the corresponding array element or hash entry for the alias will contain
an array reference instead of a single C<Match> object. Hence aliasing
and quantification are completely orthogonal.

For example:

     if m/ mv $<from>:=<file>+ / {
         # <file>+ returns an array of Match objects,
         # so $/<from> contains an array of Match objects,
         # one for each successful call to <file>

         # $/<file> does not exist (pre-empted by the alias)
     }


     if m/ mv $<from>:=(\S+ \s+)+ / {
         # Quantified subpattern returns an array of Match objects, so
         # $/<from> contains an array of Match objects,
         # one for each successful match of the subpattern

         # $/[0] does not exist (pre-empted by the alias)
     }

A set of quantified I<non-capturing> brackets always returns a
single C<Match> object which contains only the complete substring
that was matched by the full set of repetitions of the brackets (as
described in L<Named scalar aliases applied to non-capturing brackets>).

So, if an alias is applied to a set of quantified I<non-capturing>
brackets, the corresponding array element or hash entry for the alias
will be assigned that single C<Match> object. For example:

     "coffee fifo fumble" ~~ m/ .*? $<effs>:=[f <-[f]>**{1..2} \s*]+ /;

     say $<effs>;    # prints "fee fifo fum"
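
The same match can be checked with a Python translation of the pattern.
This assumes Python's C<re> as a stand-in: a named group wrapped around a
repeated non-capturing group plays the role of the alias on quantified
square brackets.

```python
import re

m = re.search(r".*?(?P<effs>(?:f[^f]{1,2}\s*)+)", "coffee fifo fumble")
effs = m["effs"]

print(effs)    # prints "fee fifo fum"
```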


=head3 Array aliasing

An alias can also be specified using an array as the alias instead of a scalar.
For example:

     m/ mv @<from>:=[(\S+) \s+]* <dir> /;

Using the C<< @<alias>:= >> notation instead of a C<< $<alias>:= >> has
several effects. The first is that the corresponding hash entry or array
element I<always> receives an array of C<Match> objects, even if the
construct being aliased would normally return a single C<Match>
object. That is:

     m/ $<names>:=<ident> /;      # $/<names> assigned a single Match object

     m/ @<names>:=<ident> /;      # $/<names> assigned an array which contains
                                  # a single Match object

This is useful for creating consistent capture semantics across structurally
different alternations (by enforcing array captures in all branches):

     m:w/ Mr?s? $<names>:=<ident> W\. $<names>:=<ident>
        | Mr?s? @<names>:=<ident>
        /;

     say "name: @{$<names>}";

If an array alias is applied to a quantified pair of non-capturing
brackets, it captures the substrings matched by each repetition of the
brackets into separate elements of the corresponding array. That is:

     m/ mv $<files>:=[ f.. \s* ]* /;     # $<files> assigned a single Match
                                         # object containing the
                                         # complete substring matched by
                                         # the full set of repetitions
                                         # of the non-capturing brackets

     m/ mv @<files>:=[ f.. \s* ]* /;     # $<files> assigned an array, each
                                         # element of which is a Match
                                         # object containing the substring
                                         # matched by the Nth repetition of
                                         # the non-capturing bracket match


If an array alias is applied to a quantified pair of capturing parens
(i.e. to a subpattern), then the corresponding hash or array element is
assigned a list constructed by concatenating the array values of each
C<Match> object returned by one repetition of the subpattern. That is,
an array alias on a subpattern flattens and collects all nested
subpattern captures within the aliased subpattern. For example:

     if m:w/ $<pairs>:=( (\w+) \: (\N+) )+ / {

         # Scalar alias, so $/<pairs> contains an array of Match objects,
         # each of which has its own array of two subcaptures...

         for @{$<pairs>} -> $pair {
             say "Key: $pair[0]";
             say "Val: $pair[1]";
         }
     }


     if m:w/ @<pairs>:=( (\w+) \: (\N+) )+ / {
         # Array alias, so $/<pairs> contains an array of Match objects,
         # each of which is one of the two subcaptures within the
         # subpattern, all flattened back into the outer array...

         for @{$<pairs>} -> $key, $val {
             say "Key: $key";
             say "Val: $val";
         }
     }

Likewise, if an array alias is applied to a quantified subrule, then the
hash or array element corresponding to the alias is assigned a list
containing the array values of each C<Match> object returned by each
repetition of the subrule, all flattened into a single array. That is,
an array alias on a subrule flattens and collects all the subpattern
captures that occurred within the aliased subrule. For example:

     rule pair :w { (\w+) \: (\N+) }

     if m:w/ $<pairs>:=<pair>+ / {
         # Scalar alias, so $/<pairs> contains an array of Match objects,
         # each of which is the result of the <pair> subrule call...

         for @{$<pairs>} -> $pair {
             say "Key: $pair[0]";
             say "Val: $pair[1]";
         }
     }


     if m:w/ @<pairs>:=<pair>+ / {
         # Array alias, so $/<pairs> contains an array of Match objects,
         # each of which is one of the captures that occurred within the
         # subrule, flattened back into the outer array...

         for @{$<pairs>} -> $key, $val {
             say "Key: $key";
             say "Val: $val";
         }
     }

In other words, an array alias is useful to flatten into a single array
any nested captures that might occur within a repeated subpattern or subrule.
Whereas a scalar alias is useful to preserve (within a top-level array)
the internal structure of each repetition.
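
The contrast can be modelled outside Perl 6. The following is only a rough
sketch in Python, not part of the design: nested lists stand in for C<Match>
objects, and the names C<repetitions>, C<scalar_alias>, and C<array_alias>
are made up for illustration.

```python
# Each inner list stands in for the Match object returned by one repetition
# of the subpattern ( (\w+) \: (\N+) ), holding its two subcaptures.
# The data is invented for illustration.
repetitions = [["name", "Damian"], ["lang", "Perl 6"]]

def scalar_alias(reps):
    """Model of a scalar alias: keep one entry per repetition."""
    return [list(rep) for rep in reps]

def array_alias(reps):
    """Model of an array alias: flatten every nested subcapture."""
    return [cap for rep in reps for cap in rep]

# Scalar alias: each element still has its own internal structure.
for key, val in scalar_alias(repetitions):
    print(f"Key: {key}  Val: {val}")

# Array alias: one flat list, walked two elements at a time.
flat = array_alias(repetitions)
for key, val in zip(flat[0::2], flat[1::2]):
    print(f"Key: {key}  Val: {val}")
```

Both loops visit the same keys and values; the only difference is whether
the grouping is stored in the data or reimposed by the loop.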

Note that, outside a rule, C<< @<foo> >> is simply a shorthand for
C<< @{$<foo>} >>, so the above C<for> loop could also have been written:

         for @<pairs> -> $key, $val {
             say "Key: $key";
             say "Val: $val";
         }


It is also possible to use a numbered variable as an array alias.
The semantics are exactly as described above, with the sole difference
being that the resulting array of C<Match> objects is assigned into the
appropriate element of the rule's match array, rather than to a key of
its match hash. For example:

     if m/ mv  \s+  @1:=((\w+) \s+)+  $2:=(\w+) / {
         #          |                 |
         #          |                 |
         #          |                  \___ Scalar alias, so $2 as normal
         #          |
         #           \___ Array alias, so $1 assigned a flattened array
         #                of just the (\w+) captures from each repetition

         @from = @{$1};
         $to   = $2;
     }

Note that, outside a rule, C<@1> is simply a shorthand for C<@{$1}>, so the
first assignment above could also have been written:

         @from = @1;


=head3 Hash aliasing

An alias can also be specified using a hash as the alias variable,
instead of a scalar or an array. For example:

     m:w/ mv %<location>:=( (<ident>) \: (\N+) )+ /;

A hash alias causes the corresponding hash or array element in the
current scope's C<Match> object to be assigned a hash (rather than an
array or a single C<Match> object).

A hash alias cannot be applied to a quantified pair of non-capturing brackets.
Attempting to do so is a compile-time detectable error.

If a hash alias is applied to a pair of capturing parens (i.e. to
a subpattern), then the corresponding hash or array element is assigned a
hash. Each entry in that hash is constructed as follows:

=over

=item 1.

If the subpattern was unquantified, take the single C<Match> object it returns
and place it in an array. If the subpattern was quantified, take the array of
C<Match> objects it returns. Then, for each C<Match> object in the array...

=over

=item 1a.

Evaluate that C<Match> object as an array to produce a list.

=item 1b.

Use the first element of the list as the next key.

=item 1c.

Use the remaining element(s) of the list as the corresponding value(s).
If there are no remaining elements, the value is C<undef>.
If there is one remaining element, the value is that element.
If there are two or more remaining elements, the value is a reference to an
array containing those elements.

=back

=back
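
The steps above can be sketched as follows. This is a model in Python, not
an implementation: each C<Match> object is represented by the list it would
produce when evaluated as an array, and the function name C<hash_alias> is
hypothetical. It assumes every repetition yields at least one capture, and
that a repeated key simply overwrites the earlier entry; the rules above do
not specify either case.

```python
def hash_alias(matches):
    """Build the aliased hash from a list of per-repetition capture lists."""
    result = {}
    for captures in matches:              # step 1: each Match object in turn
        items = list(captures)            # step 1a: evaluate it as a list
        key, rest = items[0], items[1:]   # step 1b: first element is the key
        if not rest:                      # step 1c: the remaining element(s)
            value = None                  #   none        -> undef
        elif len(rest) == 1:
            value = rest[0]               #   exactly one -> that element
        else:
            value = rest                  #   two or more -> an array of them
        result[key] = value               # assumption: later keys overwrite
    return result

# Invented capture data: three repetitions with 4, 2, and 1 captures.
synonyms = hash_alias([["big", "large", "huge", "vast"],
                       ["small", "tiny"],
                       ["odd"]])
print(synonyms)
```

An unquantified subpattern corresponds to passing a one-element list, which
is why the resulting hash then has only a single key.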

In other words, if a hash alias is applied to a subpattern, the first
pair of capturing parens within the subpattern provides the hash keys,
and the remaining capturing parens (if any) provide the corresponding
values. If the subpattern is unquantified then the resulting hash will
have only a single key; if the subpattern is quantified, the hash may
have multiple keys. For example:

         #                key      val
         #                _^_      _^_
         #               |   |    |   |
     if m:w/ %<pairs>:=( (\w+) \: (\N+) )+ / {

         # Hash alias, so $/<pairs> contains a hash, in which each key is
         # provided by the first subcapture and each value is provided by
         # the second...

         for %{$/<pairs>} -> $pair {  # Hash in list context produces pairs
             say "Key: $pair.key";
             say "Val: $pair.value";
         }
     }

If there are three or more captures within the aliased subpattern, the
second and subsequent captures are converted to an array:

         #                   key     val[0] val[1] val[2]
         #                   _^_      _^_    _^_    _^_
         #                  |   |    |   |  |   |  |   |
     if m:w/ %<synonyms>:=( (\w+) \: (\S+)  (\S+)  (\S+) )+ / {

         # $/<synonyms> contains a hash, in which each key is provided by
         # the first subcapture and each value is an array containing the
         # second, third, and fourth subcaptures...

         for %{$/<synonyms>} -> $syn {
             say "Key:  $syn.key";
             say "Vals: @{$syn.value}";
         }
     }

Note that, outside a rule, C<< %<foo> >> is a shortcut for C<< %{$/<foo>} >>,
so the previous C<for> loop could equally well have been written:

         for %<synonyms> -> $syn {
             say "Key:  $syn.key";
             say "Vals: @{$syn.value}";
         }


If a hash alias is applied to a subrule, then the corresponding hash or
array element is once again assigned a hash. Each entry in that hash is
constructed in exactly the same way as for a hash-aliased subpattern.

That is, the first subpattern capture within the subrule is used as each
key, and the remaining subpattern captures are used as the corresponding
values. For example:

     rule one_to_one :w { (\w+) \: (\N+) }

     if m:w/ %<pairs>:=<one_to_one>+ / {

         # Hash alias, so $/<pairs> contains a hash, in which each key is
         # provided by the first subcapture in <one_to_one> and each
         # value is provided by the second subcapture within the
         # subrule...

         for %<pairs> -> $pair {
             say "One: $pair.key";
             say "One: $pair.value";
         }
     }

Likewise, if the subrule captures more than two subpatterns:

     rule one_to_many :w {  (\w+) \: (\S+) (\S+) (\S+) }

     if m:w/ %<synonyms>:=<one_to_many>+ / {

         # Hash alias, so $/<synonyms> contains a hash, in which each key is
         # provided by the first subcapture within <one_to_many>, and
         # each value is an array containing the subrule's second, third,
         # and fourth subcaptures...

         for %<synonyms> -> $pair {
             say "One:  $pair.key";
             say "Many: @{$pair.value}";
         }
     }


As with array aliases, it is also possible to use a numbered variable as
a hash alias. Once again, the only difference is where the resulting
hash is stored:

     rule one_to_many :w {  (\w+) \: (\S+) (\S+) (\S+) }

     if m:w/ %1:=<one_to_many>+ / {
         # $/[0] contains a hash, in which each key is provided by the
         # first subcapture within <one_to_many>, and each value is an
         # array containing the subrule's second, third, and fourth
         # subcaptures...

         for %{$/[0]} -> $pair {
             say "One:  $pair.key";
             say "Many: @{$pair.value}";
         }
     }

And, of course, outside the rule, C<%1> is a shortcut for C<%{$1}>:

         for %1 -> $pair {
             say "One:  $pair.key";
             say "Many: @{$pair.value}";
         }


=head3 External aliasing

As a final alternative, instead of using internal aliases like:

     m/ mv  @<files>:=<ident>+  $<dir>:=<ident> /

the name of an ordinary variable can be used as an "external alias", like so:

     m/ mv  @files:=<ident>+  $dir:=<ident> /

In this case, the behaviour of each alias is exactly as described in the
previous sections, except that the resulting capture(s) are assigned
directly to the variables of the specified name that exist in the scope
in which the rule is declared. For example:

     if m/ mv  @files:=[ <ident> ]+  $dir:=<ident> / {
         say "From: @files";
         say "  To: $dir";
     }

Note that, because they bind statically to variables in the
I<declaration> scope, not dynamically to variables in the I<calling>
scope, external aliases are generally best used only in ad hoc pattern
matches like the one shown above. It is generally a Very Bad Idea to use
external aliases in a named rule. That's because, if that rule is
subsequently used as a subrule within a pattern match, the external
aliases will assign to variables in the scope where the rule was
I<declared>, not the scope in which it was I<used> as a subrule. For example:

     grammar Shell::Commands {
         rule mv { mv  @files:=[ <ident> ]+  $dir:=<ident> }
     }

     if m/<Shell::Commands.mv>/ {
         say "From: @files";         # Bzzzt! @Shell::Commands::files was set
         say "  To: $dir";           # Bzzzt! $Shell::Commands::dir was set
     }

Internal aliases are a far better choice in such cases, unless you truly
want the subtle cross-scoping effect that is achieved:

     grammar Shell::Commands {

         my $lastcmd;

         rule cmd { $/:=<mv> | $/:=<cp> }

         rule mv { $lastcmd:=(mv)  $<files>:=[ <ident> ]+  $<dir>:=<ident> }
         rule cp { $lastcmd:=(cp)  $<files>:=[ <ident> ]+  $<dir>:=<ident> }

         sub lastcmd { return $lastcmd }
     }

     while shift ~~ m/<Shell::Commands.cmd>/ {
         say "From: @{$<files>}";
         say "  To: $<dir>";
     }

     say "Final command was { Shell::Commands::lastcmd() }";



=head2 The C<:parsetree> flag

Normally, subrule calls capture by name to a hash entry of the scope's
C<Match> object, whilst subpatterns capture positionally to that object's
array element. Usually that's sufficient, since most coders only want to
access captures either sequentially (in which case they use subpatterns)
or symbolically (in which case they use subrules).

But a small number of implementers (predominantly the writers of
compilers, translators, code browsers, refactoring tools, etc.) need to
know both the order in which parts of a rule match I<and> the symbolic
names of those parts.

To support that, Perl 6 rules and matches can be specified with a
special flag: C<:parsetree>. Under this flag the capture behaviour of both
subpatterns and subrules alters from that described in the preceding sections.

Under C<:parsetree> the C<Match> objects generated by successful
subpatterns are still captured into the array of the surrounding scope's
C<Match> object, but now those objects are not actually instances of class
C<Match>. Instead, they are blessed into a class derived from C<Match>:
C<Match::Subpattern>.

     if ( m:parsetree/ (Volume\:) (\d+) / ) {
         for @{$/}.kv -> $i, $cap {
             given $cap {
                 when Match::Subpattern {
                     say "Node $i is a subpattern.";
                     say "It captured: '$cap'";
                 }
             }
             say "";
         }
     }

which might print:

     Node 0 is a subpattern.
     It captured: 'Volume:'

     Node 1 is a subpattern.
     It captured: '11'


Under C<:parsetree>, the behaviour of subrules is changed even more
drastically. The C<Match> objects generated by successful subrules are
no longer assigned into the hash of the surrounding scope's C<Match>
object. Instead, they are appended (like subpatterns) onto the array of
the surrounding scope's C<Match> object.

Moreover, the C<:parsetree> flag overrides the exemption of C<< «name» >>
subrule calls, so they act as if they were C<< <name> >> calls instead. They
generate C<Match> objects, and those objects are also appended onto the
surrounding scope's C<Match> array.

This is true even for automagically inserted non-capturing subrules,
such as the C<«ws»> calls inserted by the C<:words> flag.

In addition, each C<Match> object returned by a subrule is now blessed
into a class derived from the C<Match::Subrule> class (which itself is
derived from the C<Match> class). The actual name of the class into which
each subrule's C<Match::Subrule> object is blessed is the same as the name
of the subrule call that generated it.

So, for example:

     if ( m:w:parsetree/ <label> <ident>/ ) {
         for @{$/}.kv -> $i, $cap {
             given $cap {
                 when Match::Subrule {
                     say "Node $i is a subrule named '$cap.class()'.";
                     say "It captured: '$cap'";
                 }
             }
             say "";
         }
     }

might print something like:

     Node 0 is a subrule named 'ws'.
     It captured: ''

     Node 1 is a subrule named 'label'.
     It captured: 'From:'

     Node 2 is a subrule named 'ws'.
     It captured: '  '

     Node 3 is a subrule named 'ident'.
     It captured: 'postmaster'


Note that, if a rule contains both subpattern and subrule captures, they will
be interleaved in the order in which they appear in the input, and can be
dealt with polymorphically. For example:

     if ( m:w:parsetree/ (From\:) <ident>(\@\S+)/ ) {
         for @{$/}.kv -> $i, $cap {
             given ($cap) {
                 when Match::Subrule {
                     say "Node $i is a subrule named '$cap.class()'.";
                     say "It captured: '$cap'";
                 }
                 when Match::Subpattern {
                     say "Node $i is a subpattern.";
                     say "It captured: '$cap'";
                 }
             }
             say "";
         }
     }

which might print:

     Node 0 is a subrule named 'ws'.
     It captured: ''

     Node 1 is a subpattern.
     It captured: 'From:'

     Node 2 is a subrule named 'ws'.
     It captured: '  '

     Node 3 is a subrule named 'ident'.
     It captured: 'postmaster'

     Node 4 is a subpattern.
     It captured: '@perl.org'


Better still, because each C<Match>-derived object is blessed into a
particular class related to the subpattern or rule that created it, it's
easy to create handlers in those classes and make the processing fully
polymorphic (and far more specific):

     method Match::Subpattern::describe ($self: $index) {
         say "Node $index is a subpattern that matched: '$self'";
     }

     method ws::describe ($self: $index) {
         say "Node $index is the whitespace: '$self'";
     }

     method ident::describe ($self: $index) {
         say "Node $index is the identifier: '$self'";
     }


     if ( m:w:parsetree/ (From\:) <ident>(\@\S+)/ ) {
         my $i = 0;
         .describe($i++) for @{$/};
     }


which might then print:

     Node 0 is the whitespace: ''
     Node 1 is a subpattern that matched: 'From:'
     Node 2 is the whitespace: '  '
     Node 3 is the identifier: 'postmaster'
     Node 4 is a subpattern that matched: '@perl.org'

One final feature of the C<:parsetree> flag is that it automatically
propagates to every subrule that a C<:parsetree>'d rule calls. And, from
there, recursively into any subrules that those subrules call. Et
cetera. Note that this will almost certainly require a one-time
recompilation of those subrules, unless they had originally been
specified with C<:parsetree> themselves, but that will be entirely
transparent to the user.

This propagation of the C<:parsetree> flag means that the C<Match> objects
returned by subrules will contain arrays with the same linearized,
objectified contents. Effectively, a C<:parsetree>'d rule will return an
array of arrays of arrays etc. corresponding to the hierarchical
structure of the data that the rule matched.

Which opens up the possibility of processing that data both
polymorphically I<and> hierarchically. For example, if we added:

     # Factor out the ugly mail address matching...
     rule mailaddr { <ident> (\@\S+) }

     # And specify how to describe the resulting data structure...
     method mailaddr::describe ($self: $index) {
         say "Node $index is a mail address, which consists of:";
         my $subindex = 0;
         temp wrap say { call "\t", @_ }  # Indent when describing the bits...
         .describe($index~'.'~$subindex++) for @{$self};
     }

then we could update our original pattern match:

     if ( m:w:parsetree/ (From\:) <mailaddr>/ ) {
         my $i = 0;
         .describe($i++) for @{$/};
     }

The resulting syntax tree would now describe itself hierarchically:

     Node 0 is the whitespace: ''
     Node 1 is a subpattern that matched: 'From:'
     Node 2 is the whitespace: '  '
     Node 3 is a mail address, which consists of:
         Node 3.0 is the identifier: 'postmaster'
         Node 3.1 is a subpattern that matched: '@perl.org'
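
The same polymorphic, hierarchical dispatch can be modelled in Python. This
is only a sketch under stated assumptions: the classes C<Node>,
C<Subpattern>, C<Ws>, C<Ident>, and C<Mailaddr> are hypothetical stand-ins
for the C<Match>-derived classes, and the tree is built by hand rather than
by any real parse.

```python
class Node:
    """Stand-in for a Match-derived object: matched text plus child nodes."""
    def __init__(self, text, children=()):
        self.text = text
        self.children = list(children)

class Subpattern(Node):
    def describe(self, index):
        return [f"Node {index} is a subpattern that matched: '{self.text}'"]

class Ws(Node):
    def describe(self, index):
        return [f"Node {index} is the whitespace: '{self.text}'"]

class Ident(Node):
    def describe(self, index):
        return [f"Node {index} is the identifier: '{self.text}'"]

class Mailaddr(Node):
    def describe(self, index):
        lines = [f"Node {index} is a mail address, which consists of:"]
        for subindex, child in enumerate(self.children):
            # Recurse polymorphically, indenting whatever the child reports.
            lines += ["\t" + line
                      for line in child.describe(f"{index}.{subindex}")]
        return lines

# Hand-built tree for the text "From:  postmaster@perl.org".
tree = [
    Ws(""),
    Subpattern("From:"),
    Ws("  "),
    Mailaddr("postmaster@perl.org",
             [Ident("postmaster"), Subpattern("@perl.org")]),
]

report = [line for i, node in enumerate(tree) for line in node.describe(i)]
print("\n".join(report))
```

Each class knows only how to describe itself; the nesting in the output
falls out of the nesting in the data.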


=head2 Capturing from non-singular matches

=head3 Matching under the C<:x> and C<:g> flags

When an entire rule is successfully matched with repetitions
(specified via the C<:x> and C<:g> flags), it often produces a series
of distinct matches.

However, a successful match under these flags still returns a single
C<Match> object in C<$/>. But the values of this match object are slightly
different from a "one-ping-only" match:

=over

=item *

The boolean value of C<$/> after such matches is true or false, depending on
whether the pattern matched at all.

=item *

The integer value is the number of times the pattern matched.

=item *

The string value is the substring from the start of the first match to
the end of the last match (I<including> any intervening parts of the
string that the rule skipped over in order to find later matches).

=item *

There are no array contents or hash entries.

=back
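
These four values can be approximated with an ordinary regex engine. The
sketch below uses Python's C<re> module purely as a model; the function name
C<global_match>, the dictionary keys, and the sample text are all
assumptions made for illustration, as is the choice of an empty string for
a failed match.

```python
import re

def global_match(pattern, text):
    """Collect every non-overlapping match, roughly as a :g match would."""
    matches = list(re.finditer(pattern, text))
    if not matches:
        # Assumption: a failed match is modelled with empty values.
        return {"bool": False, "count": 0, "string": "", "matches": []}
    start, end = matches[0].start(), matches[-1].end()
    return {
        "bool": True,                 # did the pattern match at all?
        "count": len(matches),        # how many times it matched
        "string": text[start:end],    # first match start to last match end,
                                      # including the skipped-over text
        "matches": matches,           # the individual match objects
    }

result = global_match(r"\w+ rocks", "Perl rocks, and Parrot rocks too")
print(result["bool"], result["count"], repr(result["string"]))
```

Note that the string value includes the C<, and > between the two matches,
even though no individual match covers it.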

For example:

     if $text ~~ m:words:globally/ (\S+:) <rocks> / {
         say "Matched {+$/} different ways";

         say 'Full match context is:';
         say $/;
     }

The list of individual match objects corresponding to each separate
match is also available via the C<.matches> method. For example:

     if $text ~~ m:words:globally/ (\S+:) <rocks> / {
         for $/.matches -> $m {
             say "Match between $m.from() and { $m.to()-1 }";
             say 'Right on, dude!' if $m[0] eq 'Perl';
             say "Rocks like $m<rocks>";
         }
     }


=head3 Matching under the C<:overlap> and C<:exhaustive> flags

Unlike the multiple matches of the C<:x> and C<:g> flags, success under
the C<:overlap> and C<:exhaustive> flags doesn't necessarily produce
a sequence of disjoint matches, but rather a disjunction of
alternative matches.

A successful match under the C<:overlap> or C<:exhaustive> flags still
returns a single C<Match> object in C<$/> (all matches do) and the C<.matches>
method of this object still returns all the distinct C<Match> objects for each
alternative match (in the order the matches were found).

But the values of the top-level C<Match> object returned by an overlapping or
exhaustive match are unusual:

=over

=item *

The boolean value of C<$/> after such matches is true or false, depending on
whether the pattern matched at all.

=item *

The integer value is the number of distinct ways in which the pattern matched.

=item *

The string value is a disjunction of all the distinct matches.

=item *

The array contents are a list of disjunctions of all the corresponding
unnamed captures from all the distinct matches. That is, C<$1> is a
disjunction of the C<$1> value of each of the successful matches that sets a
C<$1>.

=item *

The hash values are disjunctions of all the corresponding
named captures from all the distinct matches. That is, C<< $<foo> >> is
a disjunction of the C<< $<foo> >> value of each of the successful matches
that sets a C<< $<foo> >>.

=back
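
One way to picture these disjunctive values is as collections that compare
true if any member does. The following Python sketch is a model only:
disjunctions are represented by plain lists, the names C<combine> and
C<disjunctive_eq> are hypothetical, and the alternative matches are
invented data rather than the result of a real exhaustive match.

```python
from collections import defaultdict

def combine(alternatives):
    """Merge alternative matches into per-capture disjunctions (as lists)."""
    positional = defaultdict(list)   # capture index -> values that were set
    named = defaultdict(list)        # capture name  -> values that were set
    for alt in alternatives:
        for i, value in enumerate(alt["array"]):
            positional[i].append(value)
        for name, value in alt["hash"].items():
            named[name].append(value)
    return {
        "bool": bool(alternatives),
        "count": len(alternatives),
        "string": [alt["string"] for alt in alternatives],
        "array": dict(positional),
        "hash": dict(named),
    }

def disjunctive_eq(disjunction, value):
    """True if any member of the disjunction equals the value."""
    return any(member == value for member in disjunction)

# Invented alternatives; index 0 of "array" models the first unnamed capture.
top = combine([
    {"string": "Perl rocks", "array": ["Perl"], "hash": {"rocks": "rocks"}},
    {"string": "Perl rox",   "array": ["Perl"], "hash": {"rocks": "rox"}},
])

print(top["count"])
print(disjunctive_eq(top["array"][0], "Perl"))
print(top["hash"]["rocks"])
```

Only matches that actually set a given capture contribute to its
disjunction, which is why the loops append per match rather than padding
missing entries.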

For example:

     if $text ~~ m:words:exhaustive/ (\S+:) <rocks> / {
         say "Matched {+$/} different ways";

         say 'Right on, dude!' if $1 eq 'Perl';   # Disjunctive match against
                                                  # all possible $1's from
                                                  # any of the exhaustive matches

         say 'Found these variations on "rocks":';
         say for $<rocks>.values;                 # List all possible substrings
                                                  # successfully matched by <rocks>
                                                  # in any of the exhaustive matches
     }

As mentioned above, the individual match objects for each alternative
match are also available (in canonical order) via the C<.matches>
method. For example:

     if $text ~~ m:words:exhaustive/ (\S+:) <rocks> / {
         for $/.matches -> $m {
             say 'Right on, dude!' if $m[0] eq 'Perl';   # Normal match against
                                                         # match $m's $1's

             say "Rocks like $m<rocks>";          # Substring matched by <rocks>
                                                  # in match $m
         }
     }



=head2 Executive summary of proposed changes

=over

=item *

Angles create subrules, which return a C<Match> object that is
captured into the hash of their surrounding scope's C<Match> object.

=item *

Parens create subpatterns, which return a C<Match> object that is
captured into the array of their surrounding scope's C<Match> object.

=item *

A subpattern is like an inlined subrule (except that it captures
into an array, rather than a hash).

=item *

Subpatterns nest lexically, and the captures they return are likewise
hierarchical.

=item *

The number associated with a subpattern reflects its ordinal position in its
immediately surrounding scope, not its ordinal position in the overall rule.
As a result, these numbers are hierarchical, rather than linear.

=item *

Quantifiers (except C<?> and C<??>) cause a matched subrule or subpattern to
return an array of C<Match> objects, instead of just a single object.

=item *

Two or more calls to the same subrule or subpattern in the same lexical scope
also cause the matched subrules/subpatterns to accumulate their C<Match>
objects in an array.

=item *

Scalar aliases rename or renumber the construct they're applied to, changing
the location in which the construct's C<Match> object is stored, but not its
capturing semantics.

=item *

Array aliases rename or renumber the construct they're applied to, and also
cause its corresponding C<Match> object(s) always to be returned in an array.

=item *

The elements of that array are a flattened list of the C<Match> objects
returned by the subpatterns nested inside the aliased construct.

=item *

Hash aliases rename or renumber the construct they're applied to, and also
cause its corresponding C<Match> object(s) always to be returned in a hash.

=item *

The keys of this hash are C<Match> objects returned by the first
subpattern nested inside the aliased construct. The values are the C<Match>
objects returned by the remaining nested subpatterns.

=item *

The C<:parsetree> flag modifies capture semantics to preserve the parse
sequence, the identity information, and the hierarchical structure of
captures, whilst also supporting object-oriented processing of the
resulting parse tree.

=back

