perl.perl5.porters | Postings from October 2021

Re: Regular expressions store capture call and undef operator

From: sasho648
Date: October 2, 2021 12:46
Subject: Re: Regular expressions store capture call and undef operator
Message ID: CACcOKpYkF2EFkbEetQAh64+rwbpASPp7CFavEm4rCQ_x9_jKCw@mail.gmail.com

I've recently found that the opposite may also be needed - i.e. a way to
declare a capture group that acts like a subroutine and discards all of its
captures upon return.

Thanks so much for your time

On Tue, Sep 21, 2021 at 10:21 AM sasho648 <sasho648@gmail.com> wrote:

> Hello,
>
> I'm implementing a C compiler with Perl regular expressions and mid-pattern
> code execution. Part of how it works relies on having the captured named
> groups available at the point of the code in (?{ }). Currently, however,
> when a subroutine call returns it destroys the capture context. For now I
> can embed the pattern directly instead of issuing a subroutine call, but it
> would be nice to avoid the duplication.
>
> The other issue is that I'm matching out of order to feed in-order
> information to the compiler backend, but for that I need to create what I
> call 'facets' - copies of the same pattern but without code calls, in order
> to fill the match when backtracking. The same motivation as above applies
> here - I want to avoid code duplication.
>
> I'm proposing non-destructive subroutine calls with the syntax
>
> (?&&sub)
>
> which will behave the same as if the subroutine body were embedded as
> text inside the pattern.
>
> So if we have:
>
> (?<sub>sometext)
>
> (?&&sub) will be an alias for the above, and so:
>
> (?&&sub)\g{sub}
>
> will work and match sometext twice.
>
> The same behavior will also apply recursively to any named group defined
> inside the subroutine, but it won't apply to destructive (aka normal)
> subroutine calls:
>
> (?<sub>some(?<text>text))
>
> (?&&sub)\g{text}
>
> will match sometexttext
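
Since (?&&sub) is only proposed syntax, the closest thing available today that I know of is textual embedding through qr// interpolation. The sketch below assumes a helper pattern held in a variable named $sub (a hypothetical name) and shows that, once embedded, the inner named group is visible to \g{text} in the enclosing pattern:

```perl
use strict;
use warnings;

# Hypothetical helper pattern; the names "sub" and "text" follow the thread.
my $sub = qr/(?<sub>some(?<text>text))/;

# Interpolating $sub embeds its body textually, so its named groups
# belong to the enclosing pattern and \g{text} can refer to them.
my $matched = 'sometexttext' =~ /^$sub\g{text}\z/;
my ($whole, $inner) = $matched ? ($+{sub}, $+{text}) : (undef, undef);

printf "sub  = %s\n", defined $whole ? $whole : '(undef)';
printf "text = %s\n", defined $inner ? $inner : '(undef)';
```

This avoids duplicating the pattern source, but a qr// object cannot interpolate itself, so it does not cover recursive rules the way a real subroutine call does.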
>
> But if we have something like:
>
> (?<sub>some(?&text))
>
> the capture 'text' won't exist in the caller (as is currently the case).
>
> For the second part of this proposal I suggest a (*UNDEF:name) verb, used
> like this:
>
> (?<sub>some(?<text>text)(?(<facet>)|(?{someperlsub($+{text})})))
> (?<facet>)(?#disable code calls)(?&sub)(*UNDEF:facet)(?#enable code calls back)(?&sub)
>
> which will invoke someperlsub only once.
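
The gating half of this already works in current Perl: a (?(<name>)yes|no) conditional can skip a code block when a marker group has matched. What is missing, and what (*UNDEF:name) is proposed to add, is a way to clear that marker mid-pattern. A minimal sketch of the existing part, assuming made-up input words "loud" and "quiet" and a hypothetical @calls array standing in for someperlsub:

```perl
use strict;
use warnings;

my @calls;   # records each time the embedded code block runs

# The empty group (?<facet>) acts as a flag. It is only set on the
# "quiet" branch, and the conditional skips the code block when it is set.
my $re = qr/^(?:loud|(?<facet>)quiet) (?<text>\w+)(?(<facet>)|(?{ push @calls, $+{text} }))\z/;

my $loud_ok  = 'loud hello'  =~ $re;   # flag unset: code block runs
my $quiet_ok = 'quiet world' =~ $re;   # flag set: code block is skipped

print "code block ran for: @calls\n";
```

Both strings match, but only the first one reaches the code block, which is the "facet" behaviour described above without a second copy of the pattern.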
>
> The benefit of this syntax is easier parsing of complex structures (like
> the C programming language) with plain regular expressions.
>
> A potential issue, at least with the first part of this proposal, is
> increased memory use, but I feel that if implemented correctly this issue
> could be avoided completely.
>
> Thanks so much in advance,
>
> Alexander Nikolov
>
