
Re: RFC: seeking syntax for allowing script run pattern matching

From: hv
Date: July 6, 2016 21:17
Subject: Re: RFC: seeking syntax for allowing script run pattern matching
Message ID: 201607062002.u66K2SP20630@crypt.org

Karl Williamson <public@khwilliamson.com> wrote:
:A script run is a sequence of characters, all from the same script, such 
:as Latin or Greek. [...]
:I'm looking for some more ideas.

It feels like something that should apply over a scope within a pattern,
with an affordance for applying it to the whole pattern. We already have
exactly that concept in the regex flags and the /(?f:...)/ construct.

That implies it should be possible to say, using //S as a placeholder name,
something like m{\w+ \w+}S to ask for two words separated by a space, with
all the letters coming from a single script.
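
As a rough approximation of what m{\w+ \w+}S would be asking for, the
check can be spelled out by hand in today's Perl. This is only a sketch:
the helper name two_words_one_script and the choice of three scripts are
assumptions made for illustration, and \p{Latin} and friends match any
character of that script rather than exactly what \w matches.

```perl
use strict;
use warnings;

# One anchored pattern per script; each demands that both "words" be
# drawn entirely from that script. (Illustrative list, not exhaustive.)
my @one_script_patterns = (
    qr/\A\p{Latin}+ \p{Latin}+\z/,
    qr/\A\p{Greek}+ \p{Greek}+\z/,
    qr/\A\p{Cyrillic}+ \p{Cyrillic}+\z/,
);

# Hypothetical helper: true if $text is two words separated by a single
# space, with every character of both words from the same script.
sub two_words_one_script {
    my ($text) = @_;
    for my $re (@one_script_patterns) {
        return 1 if $text =~ $re;
    }
    return 0;
}

# Latin + Latin is accepted; Latin + Greek is rejected, as is a Latin
# word with a Cyrillic look-alike (U+0430) slipped into it.
print two_words_one_script("abc def")                 ? "ok\n" : "no\n";
print two_words_one_script("abc \x{3b1}\x{3b2}")      ? "ok\n" : "no\n";
print two_words_one_script("p\x{430}ypal login")      ? "ok\n" : "no\n";
```

The point of a //S flag would be to avoid enumerating scripts like this.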

That also implies it can be locally disabled, so that
  /(?S:\w+ (?-S:\w) \w+)/
would be equivalent to
  my $letter = qr{\w};
  /\w+ $letter \w+/S;

We occasionally see bugs caused by misunderstanding of how flags act on
interpolated patterns, but consistency with other existing behaviours
seems desirable for all that.
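
The existing behaviour being referred to can be seen with /i, which is a
real flag today (S remains only a placeholder). A minimal sketch, with
variable names chosen purely for illustration: a qr// object carries its
own flags into whatever pattern it is interpolated into, so the outer /i
does not reach inside it.

```perl
use strict;
use warnings;

# Compiled WITHOUT /i; it stringifies as (?^:a), which resets the flags
# to their defaults for the enclosed subpattern.
my $letter = qr{a};

# The outer pattern is case-insensitive, but the interpolated part is not.
my $interpolated = qr/x${letter}x/i;

# The same thing written with explicit flag scopes.
my $inline = qr/(?i:x(?-i:a)x)/;

for my $input ("XaX", "XAX") {
    printf "%s: interpolated=%s inline=%s\n",
        $input,
        ($input =~ $interpolated ? "match" : "no match"),
        ($input =~ $inline       ? "match" : "no match");
}
```

Here "XaX" matches both patterns and "XAX" matches neither, since the
middle letter is still compared case-sensitively.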

That leaves interesting questions of how the following should behave:
  /(?S:\w (?S:\w) \w)/
and
  /(?S:\w (?-S:. (?S:\w+) .) \w)/

I think the first (where +S is introduced when it is already active)
should be a noop - the same script should still be required.

I think the second (where +S is introduced in a -S scope, itself within
a +S scope) should permit a new script.
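
The scoping half of these two cases can be tried out with /i as a
stand-in, as in the sketch below. It shows only how the nested scopes turn
a flag on and off; /i has no notion of "which script", so it cannot show
whether the inner (?S:...) would continue the outer script or start a new
one.

```perl
use strict;
use warnings;

# Case 1: the flag is re-introduced where it is already active.
# For /i this is a no-op: both patterns accept the same strings.
my $nested_on = qr/\A(?i:a(?i:b)c)\z/;
my $plain_on  = qr/\A(?i:abc)\z/;

# Case 2: on, then off, then on again inside the off scope.
# 'a' and 'e' are case-insensitive, 'b' and 'd' are case-sensitive,
# and 'c' is case-insensitive again.
my $on_off_on = qr/\A(?i:a(?-i:b(?i:c)d)e)\z/;

for my $input ("AbCdE", "ABCdE", "AbCDE") {
    printf "%s: %s\n", $input,
        ($input =~ $on_off_on ? "match" : "no match");
}
```

Only "AbCdE" matches the second pattern: the innermost scope restores the
flag for 'c', while 'b' and 'd' stay under the (?-i:...) scope.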

I wonder whether the next request will be a variant that overrides /./
to be equivalent to /(?S:\W|\w)/.

Should the proposal also affect uses of \w inside character classes?

Hugo
