develooper Front page | perl.perl5.porters | Postings from February 2019

RFC: Adding \p{foo=/re/}

From:
Karl Williamson
Date:
February 5, 2019 22:47
Subject:
RFC: Adding \p{foo=/re/}
Message ID:
c90b8c67-88b5-b1d0-5802-fb1e8c00e2e2@khwilliamson.com
Unicode Technical Standard #18, on regular expressions, suggests that
Unicode properties support what I'm calling a subpattern and what they
call wildcard properties:

http://www.unicode.org/reports/tr18/#Wildcard_Properties

I am proposing to implement this in 5.30.  I already have a working 
prototype, which you can find in

https://perl5.git.perl.org/perl.git/shortlog/refs/heads/smoke-me/khw-core

and play with.  Attached is a script that exercises it to create a
pattern that matches IPv4 addresses in any language, and rejects invalid
ones.  Thus the script would work for Bengali or Thai numbers.  The
motivation for this came from Abigail.
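
For readers without the attachment, the idea can be sketched without the
prototype branch.  What follows is a minimal illustration in Python, not
the attached script: it emulates a numeric-value subpattern by collecting
every decimal digit (general category Nd) whose value falls in a range,
then assembles the usual 0-255 octet alternation from those classes.  The
names DIGITS, nv, OCTET, is_ipv4 and in_script are hypothetical helpers
invented for this sketch, and the set of digits it knows about depends on
the Unicode tables shipped with the Python build.

```python
import re
import sys
import unicodedata

# Group every decimal digit (category Nd) by its digit value, in one pass.
DIGITS = {value: [] for value in range(10)}
for cp in range(sys.maxunicode + 1):
    ch = chr(cp)
    if unicodedata.category(ch) == "Nd":
        DIGITS[unicodedata.digit(ch)].append(ch)


def nv(lo, hi):
    """Character class of all digits, in any script, with value lo..hi.

    Hypothetical stand-in for a numeric-value subpattern; digits need no
    escaping inside a character class.
    """
    chars = "".join(c for value in range(lo, hi + 1) for c in DIGITS[value])
    return "[" + chars + "]"


OCTET = (
    f"(?:{nv(2, 2)}{nv(5, 5)}{nv(0, 5)}"   # 250-255
    f"|{nv(2, 2)}{nv(0, 4)}{nv(0, 9)}"     # 200-249
    f"|{nv(1, 1)}{nv(0, 9)}{nv(0, 9)}"     # 100-199
    f"|{nv(1, 9)}{nv(0, 9)}"               # 10-99
    f"|{nv(0, 9)})"                        # 0-9
)
IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")


def is_ipv4(text):
    """True if the whole string is a dotted quad with octets 0..255."""
    return IPV4.fullmatch(text) is not None


def in_script(text, zero):
    """Rewrite ASCII digits using the script whose digit zero is `zero`."""
    return "".join(
        chr(zero + int(c)) if "0" <= c <= "9" else c for c in text
    )


if __name__ == "__main__":
    print(is_ipv4(in_script("192.168.0.1", 0x09E6)))  # Bengali digits
    print(is_ipv4(in_script("127.0.0.1", 0x0E50)))    # Thai digits
    print(is_ipv4(in_script("256.1.1.1", 0x09E6)))    # octet out of range
```

Note that this sketch accepts an address that mixes digits from different
scripts; whether that should be allowed is a separate question that it
does not try to answer.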

Certain things aren't clear to me about how it should behave.  Should
the default be anchored (as it is currently), so that you have to begin
and/or end the pattern with '.*' to unanchor it?  I think most uses will
want it anchored, as the equals sign implies, but that's not how other
patterns behave, and the inconsistency would probably be too confusing.
One way to emphasize that it isn't anchored would be to make people write

\p{foo=~/bar/}

(requiring a tilde)
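
To make the anchoring question concrete, here is a small Python sketch of
the two candidate behaviours.  It is only an emulation under stated
assumptions: it matches a subpattern against a string form of each
character's numeric value ("1", "10", "1/2", ...), and the helpers
nv_string and select, as well as the sample characters, are hypothetical
and not part of the prototype.

```python
import re
import unicodedata
from fractions import Fraction


def nv_string(ch):
    """Numeric value of ch as a string such as '1', '10' or '1/2'.

    Returns None for characters without a numeric value.
    """
    value = unicodedata.numeric(ch, None)
    if value is None:
        return None
    return str(Fraction(value).limit_denominator(1000))


def select(chars, subpattern, anchored):
    """Characters whose numeric-value string matches the subpattern.

    anchored=True  requires the subpattern to match the whole value;
    anchored=False lets it match anywhere inside the value.
    """
    regex = re.compile(subpattern)
    matcher = regex.fullmatch if anchored else regex.search
    return [
        ch for ch in chars
        if nv_string(ch) is not None and matcher(nv_string(ch))
    ]


# DIGIT ONE, VULGAR FRACTION ONE HALF, ROMAN NUMERAL TEN,
# CIRCLED NUMBER ELEVEN, DIGIT SEVEN
SAMPLE = ["1", "\u00bd", "\u2169", "\u246a", "7"]

if __name__ == "__main__":
    print(select(SAMPLE, "1", anchored=True))       # only the value "1"
    print(select(SAMPLE, "1", anchored=False))      # also 1/2, 10 and 11
    print(select(SAMPLE, ".*1.*", anchored=True))   # same set as unanchored
```

With anchoring, the subpattern "1" selects only characters whose value is
exactly 1; without it, the same subpattern also picks up 1/2, 10 and 11,
which is the kind of surprise the tilde spelling would be meant to flag.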

Comments?
