
Re: RFC: Adding \p{foo=/re/}

From: Karl Williamson
Date: February 6, 2019 19:46
Subject: Re: RFC: Adding \p{foo=/re/}
Message ID: 6474e0b6-0ca8-8d58-5298-10996d9519ec@khwilliamson.com

On 2/5/19 11:27 PM, demerphq wrote:
> Fwiw, I don't like it. What happens if the pattern includes capture 
> brackets, named recursion or eval ? This seems like a way to squeeze 
> named recursion concepts into the named property functionality without 
> thinking through the ramifications.
> 
> Yves

The way it's implemented, a separate regex is compiled and executed 
during the compilation of the outer one.  Maybe you know of some way 
that could fail, but it works in my limited testing, so I'm not sure 
your stated concerns are valid.

It calls subpattern_re = re_compile(pattern, 0);
and then pregexec(subpattern_re, ...)
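
As a rough illustration of that approach (a Python emulation, not the actual core code), the sketch below compiles the subpattern as its own regex while the outer pattern is being built, runs it against each candidate property value, and folds the matching code points into a character class.  The helper names, the restriction to the Numeric_Value property, the limited code point range, and the way values are spelled ('5', '1/2') are all assumptions made for the example.

```python
import re
import unicodedata
from fractions import Fraction


def numeric_value_name(ch):
    # Spell the numeric value of ch as a string such as '5' or '1/2'.
    # Returns None for characters that have no numeric value.
    value = unicodedata.numeric(ch, None)
    if value is None:
        return None
    return str(Fraction(value).limit_denominator(1000))


def expand_wildcard_property(subpattern, max_cp=0x3000):
    # Hypothetical helper: the inner regex is compiled first ...
    sub_re = re.compile(subpattern)
    matched = []
    for cp in range(max_cp):
        name = numeric_value_name(chr(cp))
        # ... and executed against every property value name.
        if name is not None and sub_re.search(name):
            matched.append(cp)
    return matched


def compile_outer(subpattern):
    # Hypothetical helper: build the outer pattern from the inner matches.
    cps = expand_wildcard_property(subpattern)
    if not cps:
        raise ValueError("subpattern matched no property values")
    char_class = "[" + "".join(re.escape(chr(cp)) for cp in cps) + "]"
    return re.compile(char_class)


# Roughly what \p{nv=/\A[0-5]\z/} is meant to express.
low_digit = compile_outer(r"\A[0-5]\Z")
```

Because the inner regex only ever runs against property value names at compile time, nothing from it (capture groups included) is part of the outer match in this emulation; whether the same holds for every construct in the real implementation is the open question above.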

> 
> On Wed, 6 Feb 2019, 06:47 Karl Williamson <public@khwilliamson.com> wrote:
> 
>     Unicode Technical Standard #18 on regular expressions suggests that
>     Unicode properties support what I'm calling a subpattern and what
>     they call wildcard properties:
> 
>     http://www.unicode.org/reports/tr18/#Wildcard_Properties
> 
>     I am proposing to implement this in 5.30.  I already have a working
>     prototype, which you can find in
> 
>     https://perl5.git.perl.org/perl.git/shortlog/refs/heads/smoke-me/khw-core
> 
>     and play with.  Attached is a script that exercises it to create a
>     pattern that matches IPv4 addresses in any language, and rejects
>     illegal ones.  Thus the script would work for Bengali or Thai
>     numbers.  The motivation for this came from Abigail.
> 
>     Certain things aren't clear to me about how it should behave.  Should
>     the default be anchored (as currently) so that you have to begin and/or
>     end with '.*' to unanchor it?  I think most uses will want it anchored
>     as implied by the equals sign, but that's not how other patterns
>     behave, and that inconsistency probably would be too confusing.  One
>     thing that might emphasize that it isn't anchored is to make them write
> 
>     \p{foo=~/bar/}
> 
>     (requiring a tilde)
> 
>     Comments?
> 
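
To make the anchoring question concrete, here is a small Python sketch (again an emulation, not the prototype's behavior) that selects property value names either anchored or unanchored.  The function `select_values` and the short hand-picked list of Script values are assumptions for illustration; the real list of values is much longer.

```python
import re

# A small hand-picked sample of Script property value names (assumption).
SCRIPT_VALUES = ["Latin", "Greek", "Gothic", "Old_Italic", "Old_Persian"]


def select_values(subpattern, anchored):
    # Hypothetical helper: return the value names the subpattern selects.
    sub_re = re.compile(subpattern)
    # Anchored: the subpattern must match the whole value name.
    # Unanchored: a match anywhere in the name is enough.
    test = sub_re.fullmatch if anchored else sub_re.search
    return [value for value in SCRIPT_VALUES if test(value)]


anchored_plain = select_values("Old", anchored=True)       # nothing selected
anchored_wild = select_values("Old.*", anchored=True)      # needs the '.*'
unanchored_plain = select_values("Old", anchored=False)    # ordinary regex feel
```

With anchoring as the default, a bare `Old` selects nothing and the writer has to add `.*`; unanchored, the same subpattern already selects both `Old_` scripts, which is how other patterns behave.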
