
RFC naming the 0th match in regular expressions

From: Roy via perl5-porters
Date: September 9, 2021 01:43
Subject: RFC naming the 0th match in regular expressions
Message ID: 96ba361b-8fa4-46bb-7a58-f13a3f8aba1f@devo.net.au

Hi all,

I raised this on the PCRE2 issue tracker
(<https://github.com/PhilipHazel/pcre2/issues/15>), but the author
rightly pointed out that it's something Perl doesn't do.

 1. The (minor) problem is that a regular expression cannot begin
    directly with a named group; the group must be parenthesised. As a
    result, a pattern wrapped entirely in a named group produces a
    result set that is 1 item longer than necessary and contains a
    duplicate member (groups 0 and 1). This is more pronounced with the
    "g" modifier, because iterating the matches then yields multiple
    0 groups that cannot be addressed using a semantic name.
 2. The proposed syntax is to allow a pattern like
    /?<name>.../
    rather than requiring
    /(?<name>...)/
 3. The benefits are consistent semantic access to match result members,
    and smaller result sets. I'm sure results are already
    memory-efficient, with named and numbered groups pointing to the
    same data, but they produce more output when examined or iterated.
 4. From basic testing with PCRE, expressions that begin with ? fail to
    compile, but I'm not certain that's the case for all regexp usage in
    Perl. If it is, no existing valid pattern would change meaning, so
    this looks like a backwards-compatible change.
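
To make points 1 and 4 concrete, here is a minimal sketch. The sample
string and the group name "year" are made up for illustration, and the
last part only reports what the running perl does with the proposed
form rather than claiming a result for every version.

```perl
use strict;
use warnings;

# Hypothetical input and group name, chosen only for illustration.
my $text = "2019 2020 2021";

# Point 1: the whole pattern has to be wrapped in a named group, so
# group 0 (the whole match) and group 1 carry the same text.
my ($whole, $first, $named);
if ($text =~ /(?<year>\d{4})/) {
    ($whole, $first, $named) = ($&, $1, $+{year});
}
print "group 0: $whole, group 1: $first, named: $named\n";

# With /g, each iteration has its own group 0, and the only way to
# reach that text by name is through the duplicate group 1.
my @years;
while ($text =~ /(?<year>\d{4})/g) {
    push @years, $+{year};
}
print "years: @years\n";

# Point 4: compile the proposed form at run time inside eval, so a
# compile failure is caught instead of aborting the script.
my $proposed = '?<year>\d{4}';
my $compiled = eval { qr/$proposed/ };
print defined $compiled ? "compiled\n" : "failed to compile: $@";
```

The run-time qr// inside eval is used so that the same script can be
tried on different perl versions without a compile-time abort.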

Regards,
Roy.

