syntax proposal for matching balanced strings

David Nicol
February 13, 2008 12:14
On Feb 13, 2008 1:51 PM, David Nicol wrote:

>    my @EltList = $elt_doc =~
>    qr{
>         (?[]:'<\s*(\S+)\s*([^>]*)\s*>'\R'</\s*\1\s*>')
>         |
>         (?[]:'<\s*(\S+)\s*([^>]*)\s*/>')
>    }gx;

Sorry, that was based on a draft idea from before separating ?[]: and \R.
This might be more correct:

   my @EltList = $elt_doc =~

although there are open questions about the interaction of | and capturing.
Would it be possible to make (?[]:) aware of when it is one of a group
of alternatives, so that it suppresses the non-matching captures?  Or only
allow (?[  your regex here ])  to -- hmm -- that's even better:

   my @EltList = $elt_doc =~

(?[ your regex here ])  would capture to exactly one array-ref
containing the captures from the enclosed matching option, and would
take modifiers after the closing square bracket.  On match failure,
for instance when used as a failed option, the result is still an empty
array-ref rather than the empty string (debatable).
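
To make the capture behaviour above concrete, here is a minimal sketch
using only syntax that exists today.  It is not the proposed
(?[ ... ]) construct: the helper name match_captures and the simplified
tag pattern are assumptions made up for illustration.  The helper emulates
the idea of returning one array-ref that holds only the captures of
whichever alternative matched, and an empty array-ref on failure.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Perl or of the proposal.
# Assumes $re contains at least one capture group.
sub match_captures {
    my ($re, $str) = @_;
    # In list context a failed match yields the empty list.
    my @caps = $str =~ $re or return [];
    # Groups belonging to the alternative that did not match are undef.
    return [ grep { defined } @caps ];
}

# Simplified pattern with two alternatives, two capture groups each.
my $elt = qr{
      <\s*(\w+)\s*([^>]*?)\s*/>     # empty element: name, attributes
    | <\s*(\w+)\s*([^>]*)>          # opening tag: name, attributes
}x;

my @flat = '<a href="x">' =~ $elt;                # four slots, first two undef
my $open = match_captures($elt, '<a href="x">');  # name and attributes only
my $none = match_captures($elt, 'no tags here');  # empty array-ref

printf "flat: %d slots, filtered: %d captures, failure: %d captures\n",
    scalar @flat, scalar @$open, scalar @$none;
```

With plain alternation the flat list keeps a slot for every group in
every alternative, which is the interaction of | and capturing mentioned
earlier.  Perl 5.10's branch-reset group, (?| ... ), is another existing
way to avoid the undef slots, since it renumbers the groups in each
alternative from the same starting point.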

\R means: match this regex recursively, if possible.
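
For comparison, Perl 5.10 already has recursion constructs: (?R) for the
whole pattern, (?1) and friends for a numbered group, and (?&name) for a
named group.  Note also that in released Perl 5.10, \R is a generic
line-break escape, so the sketch below uses (?1) rather than the proposed
\R token.  It only illustrates what "match this regex recursively" can
look like for balanced parentheses; the function name is_balanced is made
up for the example.

```perl
use strict;
use warnings;

# Returns 1 if $str is exactly one balanced, parenthesised group, else 0.
sub is_balanced {
    my ($str) = @_;
    return $str =~ m{
        \A
        (                   # group 1: one balanced group
            \(
            (?: [^()]++     # a run of characters other than parentheses
              | (?1)        # or a nested group: recurse into group 1
            )*
            \)
        )
        \z
    }x ? 1 : 0;
}

for my $s ('(a(b)c)', '(a(b c)', '()') {
    printf "%-10s %s\n", $s, is_balanced($s) ? 'balanced' : 'not balanced';
}
```

The possessive ++ keeps the inner run from being split up on
backtracking, which matters for unbalanced input.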
