RFC: \w{Latin|Greek}
From:
Karl Williamson
Date:
October 18, 2017 17:00
Subject:
RFC: \w{Latin|Greek}
Message ID:
41f7de36-9e39-a60e-15ac-baf0cb43efd4@khwilliamson.com
An unescaped left brace is available to use in 5.30 (after a long
deprecation cycle) when it appears after a backslash followed by an
alphabetic character, i.e. a "\[[:alpha:]]" sequence. This allows the
actual implementation of one of the earliest proposals for this
capability, which is described in this email.
\w{Latin|Greek} would match only those \w characters that are in the
Latin or Greek scripts. It is already possible to do this, but more
clunkily:
(?[ \w & ( \p{Latin} | \p{Greek} ) ])
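
To show how the existing construct behaves, here is a minimal sketch. It assumes Perl 5.18 or later, where (?[ ]) is available, and silences the experimental warning that older releases emit for it. The helper name is_latin_or_greek_word_char is hypothetical and used only for illustration.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # (?[ ]) warns as experimental on older perls

# Hypothetical helper: true if $char is a single \w character that is
# also in the Latin or Greek script.
sub is_latin_or_greek_word_char {
    my ($char) = @_;
    return $char =~ /\A(?[ \w & ( \p{Latin} | \p{Greek} ) ])\z/ ? 1 : 0;
}

for my $char ("a", "\x{3b1}", "\x{416}", "_") {
    printf "U+%04X %s\n", ord($char),
        is_latin_or_greek_word_char($char) ? "matches" : "does not match";
}
```

Here "a" (Latin) and U+03B1 GREEK SMALL LETTER ALPHA are accepted, while U+0416 CYRILLIC CAPITAL LETTER ZHE is rejected, as is "_", which is a \w character but belongs to the Common script.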
In principle, what's in the braces need not be just a script name. It
could be any Unicode binary property.
\d{nv=2}
would choose the decimal digits that are equivalent to '2'. These
include the familiar ASCII one, but also Bengali, Thai, .... Saying
just \p{nv=2} doesn't do the same thing, as it would include things like
a superscript 2 that aren't real digits.
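
By analogy with the (?[ ]) form shown earlier, the effect described for \d{nv=2} can presumably be had today by intersecting \d with \p{nv=2}. The following is a sketch under that assumption, not the proposed syntax itself; it assumes Perl 5.18 or later, and the helper name is_decimal_digit_two is hypothetical.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # (?[ ]) warns as experimental on older perls

# Hypothetical helper: true if $char is a single decimal digit (\d)
# whose Unicode Numeric_Value is 2.
sub is_decimal_digit_two {
    my ($char) = @_;
    return $char =~ /\A(?[ \d & \p{nv=2} ])\z/ ? 1 : 0;
}

# ASCII 2, Bengali digit two, Thai digit two, superscript two
for my $char ("2", "\x{9e8}", "\x{e52}", "\x{b2}") {
    printf "U+%04X %s\n", ord($char),
        is_decimal_digit_two($char) ? "matches" : "does not match";
}
```

The ASCII, Bengali (U+09E8) and Thai (U+0E52) digits are accepted. SUPERSCRIPT TWO (U+00B2) matches plain \p{nv=2} but is rejected here because it is not a \d character.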