
Re: RFC: implementing script runs

D Perrett
September 25, 2014 10:38
I can easily imagine wanting to do complex things within the script
run (e.g. SCRIPT+NONSCRIPT+SCRIPT, where both SCRIPT parts must be the
same script, so a char class or quantifier wouldn't work). A grouping
construct, along the lines of (?: ), would make sense to me. Ideally
you'd be able to restrict the scripts permitted, as well.
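
To make the grouped case concrete, here is a rough sketch of the check I
have in mind, done by hand after an ordinary match. It is not proposed
syntax. is_script_run is a made-up helper name; charscript comes from
the core Unicode::UCD module. Treating Common and Inherited characters
as neutral is an assumption on my part, and is the very point that is
still open in this thread.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charscript);

# Hypothetical helper, not an existing Perl API.
# Returns true when all characters of $text come from a single script.
# Common and Inherited characters (punctuation, digits, combining marks)
# are skipped as neutral; that is an assumption, not settled behaviour.
# If @allowed is given, the single script must be one of those names.
sub is_script_run {
    my ($text, @allowed) = @_;
    my %seen;
    for my $char (split //, $text) {
        my $script = charscript(ord $char) // 'Unknown';
        next if $script eq 'Common' or $script eq 'Inherited';
        $seen{$script} = 1;
    }
    my @scripts = keys %seen;
    return 0 if @scripts > 1;                 # mixed scripts
    return 1 if !@allowed or !@scripts;       # no restriction, or only neutrals
    return scalar grep { $_ eq $scripts[0] } @allowed;
}

# SCRIPT + NONSCRIPT + SCRIPT: capture the whole group with a normal
# regex, then test the capture as one unit.
my @candidates = (
    "paypal-login",            # all Latin around the hyphen
    "p\x{430}ypal-login",      # U+0430 CYRILLIC SMALL LETTER A mixed in
);
for my $i (0 .. $#candidates) {
    my $verdict = 'no match';
    if ($candidates[$i] =~ /\A(\w+-\w+)\z/) {
        $verdict = is_script_run($1, 'Latin') ? 'single-script' : 'rejected';
    }
    print "candidate $i: $verdict\n";
}
```

The point of checking the whole capture rather than each \w+ separately
is that both sides of the hyphen have to agree on the script, which a
per-quantifier rule cannot express.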

"I'm leaning to including contiguous Common ones as well, but am less certain."
Given a string whose internal representation is ARABIC ( ENGLISH )
ARABIC, we expect it to come out CIBARA ( ENGLISH ) CIBARA. However, if
it's hard/impossible to distinguish between the Arabic proper and the
brackets, you're likely to end up with CIBARA ) ENGLISH ( CIBARA.

On 25 September 2014 10:21, demerphq <> wrote:
> On 25 September 2014 07:17, Karl Williamson <> wrote:
>> Unicode defines a "script run" to be contiguous characters from the same
>> script, like all Latin or all Greek.
>> These can be important for security.  See
>> It seems to me that Perl should offer an easy way to specify that a regex
>> pattern element should match only a script run.  I'm proposing the only
>> current illegal syntax that is easy to type that I'm aware of; other
>> suggestions welcome.
>> The idea I had is to have an extra '*' following the quantifier mean to
>> use a script run.  For example, qr/\w+*/ would match all the consecutive
>> word characters that are in the same script as the first one found.
> I like the general idea of being able to do this, but using a quantifier
> modifier to enable it doesn't seem a good fit. Do you have any other
> proposals for implementing it? Some kind of new character class syntax
> maybe? A pseudo POSIX style character class maybe?
> Yves
> --
> perl -Mre=debug -e "/just|another|perl|hacker/"
