RFC: \w{Latin|Greek}

Karl Williamson
October 18, 2017 17:00
Unescaped left brace is available to use in 5.30 (after a long 
deprecation cycle) when it appears after a "\[:alpha:]" sequence.  This 
allows the actual implementation of one of the earliest proposals for 
this capability, and is described in this email.

\w{Latin|Greek} would match only those \w characters that are in the 
Latin or Greek scripts.  It is currently already possible to do this, 
but more clunkily:

  (?[ \w & ( \p{Latin} | \p{Greek} ) ])
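
As a rough sketch of the existing workaround (not the proposed syntax, which is only an RFC here), the extended bracketed character class can be wrapped in a small helper. The sub name is_latin_or_greek_word_char is made up for illustration, and the "no warnings" line assumes a perl that still flags (?[ ]) as experimental.

```perl
use strict;
use warnings;
# Assumption: (?[ ]) may still be flagged experimental on this perl.
no warnings 'experimental::regex_sets';

# Hypothetical helper: true if $char is a single \w character that is
# also in the Latin or Greek script.
sub is_latin_or_greek_word_char {
    my ($char) = @_;
    return $char =~ /\A(?[ \w & ( \p{Latin} | \p{Greek} ) ])\z/ ? 1 : 0;
}

# "a" is Latin, U+03B1 is GREEK SMALL LETTER ALPHA,
# U+0436 is CYRILLIC SMALL LETTER ZHE, "5" and "-" are script Common.
for my $char ("a", "\x{03B1}", "\x{0436}", "5", "-") {
    printf "U+%04X => %d\n", ord($char), is_latin_or_greek_word_char($char);
}
```

The Cyrillic letter and the digit are both \w characters, so they show that the intersection really does narrow \w rather than just testing the script.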

In principle, what's in the braces need not be just a script name.  It 
could be any Unicode binary property.  For example,

  \d{nv=2}

would choose the decimal digits that are equivalent to '2'.  These 
include the familiar ASCII one, but also Bengali, Thai, ....  Saying 
just \p{nv=2} doesn't do the same thing, as it would include things like 
a superscript 2 that aren't real digits.
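
The difference between the two can be sketched with what is available today, again via (?[ ]). Both helper names below are hypothetical, and the "no warnings" line assumes a perl that still flags (?[ ]) as experimental.

```perl
use strict;
use warnings;
# Assumption: (?[ ]) may still be flagged experimental on this perl.
no warnings 'experimental::regex_sets';

# Hypothetical helper: true if $char is a decimal digit (\d) whose
# Numeric_Value is 2.
sub is_decimal_digit_two {
    my ($char) = @_;
    return $char =~ /\A(?[ \d & \p{nv=2} ])\z/ ? 1 : 0;
}

# Hypothetical helper: true for any character whose Numeric_Value is 2,
# whether or not it is a decimal digit.
sub has_numeric_value_two {
    my ($char) = @_;
    return $char =~ /\A\p{nv=2}\z/ ? 1 : 0;
}

# ASCII 2, BENGALI DIGIT TWO, THAI DIGIT TWO, SUPERSCRIPT TWO
for my $char ("2", "\x{09E8}", "\x{0E52}", "\x{00B2}") {
    printf "U+%04X digit_two=%d nv_two=%d\n",
        ord($char), is_decimal_digit_two($char), has_numeric_value_two($char);
}
```

The superscript 2 should be accepted by the plain \p{nv=2} check but rejected once \d is intersected in, which is the distinction drawn above.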