Re: Implementing script runs

Karl Williamson
December 18, 2017 21:26
On 11/05/2017 12:19 PM, Zefram wrote:
> Father Chrysostomos wrote:
>>     (?+extended_modifier_1,extended_modifier_2:)
>>     (?mix+script_run:...)
> I like this syntax.  I wonder how it would work with the "-" for turning
> modifiers off.
> However, as we discussed last year, this is semantically wrong
> for script runs.  The modifiers that we have so far affect the
> interpretation of each part of the affected subpattern individually,
> such that /(?foo:bar)(?foo:baz)/ is always equivalent to /(?foo:barbaz)/.
> This holds even in the /i cases that mess with character boundaries,
> such as "\xdf" =~ /(?i:s)(?i:s)/.  The script run feature is completely
> unlike these: it's about the string matching the subpattern *as a whole*,
> and the concatenation of two script-run subpatterns does not behave like
> a single script-run subpattern.
> So I think a different syntax is required for script runs.  We already
> have the "(*WORD)" syntax to identify extended regexp features by keyword,
> so I think "(*script_run:...)" is a good way to go.
> -zefram

I have implemented it as (*WORD: ...), but I think there is a better 
syntax.  The docs say this syntax is for 
backtracking verbs like PRUNE, and the existing implementation is based 
on that assumption.  I had to make an exception for this unrelated purpose.

I agree that a modifier is not the correct way to go, but we have other 

This feature is really a zero-length assertion around the enclosed 
pattern.  In action it is most like the possessive or atomic construct


And so, it could be specified using a syntax like this.  Of the few 
available characters that could be used instead of '>', I like these the 


But another option is to do


since we already have things like (?P>name)

I like this because I think it could be used to get more meaningful 
names for the other zero-length assertions.  I don't use them often 
enough to remember them, and always have to look them up in the docs. 
Each time, I say, yeah, that makes sense, but I still can't remember 
them.  At some point we could say


My guess is that script runs will usually be combined with the atomic 
construct, so we could have


to be a shortcut for the combination.
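
For readers who do not use the atomic construct often, here is a minimal sketch of what it changes, using only the long-standing (?>...) form. The sample string and variable names are made up for the example, and it does not use any of the syntaxes proposed above.

```perl
use strict;
use warnings;

# Hypothetical sample string.
my $text = "aaab";

# Ordinary group: a+ first takes "aaa", then gives one "a" back so that
# the trailing "ab" can match.
my $plain = $text =~ /(?:a+)ab/ ? 1 : 0;

# Atomic group: once a+ has taken its "a"s it cannot give any of them
# back, so the trailing "ab" never finds its "a".
my $atomic = $text =~ /(?>a+)ab/ ? 1 : 0;

print "plain: $plain, atomic: $atomic\n";
```

Combined with a script run, the same property would presumably keep the engine from retrying ever-shorter candidates after the enclosed pattern has matched something that is not a script run.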
