perl.perl5.porters | Postings from July 2016

Re: RFC: seeking syntax for allowing script run pattern matching

July 7, 2016 00:14
Zefram wrote:
:>                                                we have exactly that
:>concept with the flags and /(?f:...)/ construct.
:It doesn't make sense as a flag in this way.  The existing flags affect
:the interpretation of individual regexp elements, so turning them on and
:off locally has obvious semantics.  But the script-run feature affects
:the meaning of a whole group.  /(?a:\w)(?a:\w)/ means the same thing
:as /(?a:\w\w)/, but /(?S:\w)(?S:\w)/ wouldn't mean the same thing as
:/(?S:\w\w)/.  It does need affordance to apply it to an arbitrary group,
:but flag syntax is the wrong way to get that.  I therefore favour the
:/(*sr:...)/ type of syntax.

Hmm, I mostly see your point, but I also expect the commonest desire
will be to apply a single sr to a whole pattern.
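
Zefram's composition point quoted above can be checked on a later perl. This is a minimal sketch, assuming perl 5.28 or newer, where script runs are available under the (*sr:...) spelling; the (?S:...) spelling is only this thread's hypothetical, and the helper name compare_forms is made up for illustration.

```perl
use 5.028;                    # (*sr:...) does not exist before perl 5.28
use strict;
use warnings;
no warnings 'experimental';   # script runs warned as experimental in 5.28 and 5.30

# Hypothetical helper: report whether the split and whole forms of each
# construct match $s, as (a_split, a_whole, sr_split, sr_whole).
sub compare_forms {
    my ($s) = @_;
    return map { $_ ? 1 : 0 } (
        scalar($s =~ /^(?a:\w)(?a:\w)$/),
        scalar($s =~ /^(?a:\w\w)$/),
        scalar($s =~ /^(*sr:\w)(*sr:\w)$/),
        scalar($s =~ /^(*sr:\w\w)$/),
    );
}

# "ab" is two Latin letters; "a\x{430}" is Latin a followed by Cyrillic a.
for my $s ("ab", "a\x{430}") {
    printf "%-6s a: %d/%d  sr: %d/%d\n", sprintf("%vX", $s), compare_forms($s);
}
```

For the mixed-script pair the two /a forms agree with each other (both reject the Cyrillic letter), while the two script-run forms disagree: each single character is trivially a script run, but the pair is not.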

:>That also implies it can be locally disabled:
:>  /(?S:\w+ (?-S:\w) \w+)/
:Still doesn't make sense as a flag, but there's room for another grouping
:operator that makes its content invisible to the immediately-surrounding
:script-run constraint.  /(*sr:\w+ (*srhide:\w) \w+)/.
:>I think the first (where +S is introduced when it is already active)
:>should be a noop - the same script should still be required.
:>I think the second (where +S is introduced in a -S scope, itself within
:>a +S scope) should permit a new script.
:These are the obvious semantics for nesting (*sr:...) and (*srhide:...).

No objections to that.

:>I wonder whether the next request will be a variant that overrides /./
:>to be equivalent to /(?S:\W|\w)/.
:What's that supposed to achieve?  Any single character automatically
:satisfies the script-run constraint.

I had assumed the proposal was explicitly to affect \w. If that's not
the case, I don't understand what it would affect.

I had assumed /(*sr: \w+ x \w+ )/x would require two words sharing a
common script separated by a literal x. Are you expecting rather that
it would require the literal x also to be part of the common script?

It feels unreasonably limiting (or at least unreasonably ugly) if that
would have to be expressed instead as /(*sr: \w+ (*srhide:x) \w+ )/x.
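
The question is left open at this point in the thread. As a hedged data point only, the sketch below assumes perl 5.28 or newer, where (*sr:...) applies to every character the group matches, so the literal x counts as well. The helper name words_around_x is invented for illustration, and (*srhide:...) is not used because it is not known to exist in any released perl.

```perl
use 5.028;                    # (*sr:...) does not exist before perl 5.28
use strict;
use warnings;
no warnings 'experimental';   # script runs warned as experimental in 5.28 and 5.30

# Hypothetical helper: the pattern from the message, anchored at both ends.
sub words_around_x {
    my ($s) = @_;
    return $s =~ /^(*sr: \w+ x \w+ )$/x ? 1 : 0;
}

my $mir = "\x{43C}\x{438}\x{440}";   # a Cyrillic word, U+043C U+0438 U+0440

print "Latin words, Latin x:    ", words_around_x("fooxbar"), "\n";
print "Cyrillic words, Latin x: ", words_around_x($mir . "x" . $mir), "\n";
```

Under that assumption the first call matches and the second does not, because the Latin x breaks the Cyrillic run.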

:>Should the proposal also affect uses of \w inside character classes?
:Yes.  The script-run constraint should apply to the sequence of characters
:actually matched (except for characters hidden in a nested (*srhide:...)),
:regardless of the constructs used to match them.

That feels like an answer to the above question. It wasn't how I understood
Karl's proposal.

I have only a vague idea of the circumstances in which someone would choose
to use this, but the case I imagine may be common is parsing user-supplied
text against a grammar of fixed and variable elements: the variable elements
are user-defined and required to share a script, but are independent of the
fixed elements, which are not chosen by the user.
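
One way to approximate that use case without any hiding construct is to match the fixed elements normally and apply the script-run check to the variable elements alone. This is a sketch under stated assumptions: perl 5.28 or newer for (*sr:...), an invented "set NAME to VALUE" grammar, and a hypothetical helper named parse_assignment. It is not something proposed in the thread.

```perl
use 5.028;                    # (*sr:...) does not exist before perl 5.28
use strict;
use warnings;
no warnings 'experimental';   # script runs warned as experimental in 5.28 and 5.30

# Hypothetical helper: "set" and "to" are the fixed (Latin) elements;
# NAME and VALUE are user-supplied and must share one script.
sub parse_assignment {
    my ($line) = @_;
    my ($name, $value) = $line =~ /^set (\w+) to (\w+)$/
        or return;
    # Check the variable parts on their own, so the fixed words are ignored.
    return unless "$name$value" =~ /^(*sr:\w+)$/;
    return ($name, $value);
}

my $mir = "\x{43C}\x{438}\x{440}";   # a Cyrillic word, U+043C U+0438 U+0440

my @same  = parse_assignment("set $mir to $mir");   # both parts Cyrillic
my @mixed = parse_assignment("set name to $mir");   # Latin name, Cyrillic value
print scalar(@same), " ", scalar(@mixed), "\n";
```

The cost of this design is a second match over the concatenated captures, which is roughly the inconvenience the message is worried about.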

