
Regexp Non-capturing flag proposal

From: wolfsage
Date: August 23, 2013 15:07
Subject: Regexp Non-capturing flag proposal
Message ID: CAJ0K8bitQLJO+WknVu7t1foKKwzkAaLRyzz2eu2vRPPeXYdn6g@mail.gmail.com

Howdy,

I've got a working implementation (though not fully tested) of a flag for
regexps to disable capturing to avoid the noise generated by all of the
?:'s required otherwise.

For example:

print "some string" =~ /(some|another)/n; # /n for no-capture

This will succeed, but it will not fill in $1, copy strings, etc.
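
To make the intended effect concrete, here is a rough sketch using only the
existing (?:...) syntax, which is what /n is meant to imply for every plain
group. The helper name first_capture is made up for illustration; nothing
here depends on my patch.

```perl
use strict;
use warnings;

# Hypothetical helper: returns $1 after a successful match, or undef if the
# pattern matched without setting $1 (or did not match at all).
sub first_capture {
    my ($string, $re) = @_;
    return $string =~ $re ? $1 : undef;
}

# A plain group captures; an explicit (?:...) group does not.
my $capturing     = first_capture("some string", qr/(some|another)/);
my $non_capturing = first_capture("some string", qr/(?:some|another)/);

print "capturing: ",     (defined $capturing     ? $capturing     : "undef"), "\n";
print "non-capturing: ", (defined $non_capturing ? $non_capturing : "undef"), "\n";
```

With /n, the first pattern would be expected to behave like the second.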

On small regexps this is not necessarily so useful, but on larger ones
(Regexp::Common's $RE{num}{real}, for example), this:


/(?:(?i)(?:[+-]?)(?:(?=[.]?[0123456789])(?:[0123456789]*)(?:(?:[.])(?:[0123456789]{0,}))?)(?:(?:[E])(?:(?:[+-]?)(?:[0123456789]+))|))/

Could become this:

 /((?i)([+-]?)((?=[.]?[0123456789])([0123456789]*)(([.])([0123456789]{0,}))?)(([E])(([+-]?)([0123456789]+))|))/n

The way I've implemented it, named captures continue to work, so you can
still capture only where you want to within a regexp. (And since, as a side
effect, named captures cause $1, etc., to be filled in, you can use those if
you prefer):

  "cat dog" =~ /(cat)\s(?<dog>.*)/n;
  print $1; # dog
  print $+{dog}; # dog

This means you can still use back references, etc, where needed.
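
As a sketch of the same idea in syntax that already exists, here the
unnamed group is written out as (?:...) by hand, so the named group is the
only capture. It therefore lands in $1, and a \k<word> back reference to it
works as usual. The helper name repeated_word and the sample strings are
made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: matches "<animal> <word> <word>" and returns the
# repeated word twice, once via $1 and once via %+.
sub repeated_word {
    my ($string) = @_;
    if ($string =~ /(?:cat|dog)\s(?<word>\w+)\s\k<word>/) {
        return ($1, $+{word});
    }
    return;
}

my ($numbered, $named) = repeated_word("cat nap nap");
print "\$1 = $numbered, \$+{word} = $named\n";
```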

The problem is that this only works on =~ /.../n. It doesn't yet work with
qr//. This is because, as far as I can figure out, the available flags space
(U8) for flags on qr// is already used up, and doing:

  my $need_cap = qr/(.)\1/;
  print "aa" =~ /$need_cap/n;

Fails instead of working. I'd like the /n flag to work like /i instead
(applying only to the fragments where it is used).
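
For comparison, this is roughly the /i behaviour I'd like /n to mirror: a
flag given to qr// stays with that fragment when it is interpolated, and
does not leak into the rest of the outer pattern. The helper name
matches_cat_dog is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the fragment is compiled with /i, the outer
# pattern is not.
sub matches_cat_dog {
    my ($string) = @_;
    my $fragment = qr/cat/i;
    return $string =~ /$fragment dog/ ? 1 : 0;
}

print matches_cat_dog("CAT dog"), "\n";   # 1: /i applies inside the fragment
print matches_cat_dog("CAT DOG"), "\n";   # 0: "dog" is still case-sensitive
print matches_cat_dog("cat dog"), "\n";   # 1
```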

So my question is:

Do people want this?

Or will it make reading regexes more confusing, because now we'd have two
different ways of doing captures and non-captures:

  /(cat)(?:dog)(.*)/;
  /(?<a>cat)(dog)(?<b>.*)/n;

If people do want this:

1. Can we extend the flags to larger than a U8 on regnodes or will that
cause too much memory bloat?
  * If not, how else would we implement this? Or are we doomed never to add
new flags for fragments in the future?

My understanding of the code here is limited so if there are other ways to
make this work (or if this will cause problems I haven't foreseen), please
let me know.

Thanks,

-- Matthew Horsfall (alh)
