
Re: RFC: implementing script runs

From: D Perrett
Date: September 25, 2014 10:38
Subject: Re: RFC: implementing script runs
Message ID: CABrou+kVxbk2W1hMsoVEebT6XtVe+fNf9+y2HeDhPauQFp4yxQ@mail.gmail.com

I can easily imagine wanting to do complex things within the script
run (e.g. SCRIPT+NONSCRIPT+SCRIPT, where both SCRIPT parts must be in
the same script, so a char class or quantifier wouldn't work). A
grouping construct, like (?: ), would make sense to me. Ideally you'd
be able to restrict the scripts permitted, as well.

"I'm leaning to including contiguous Common ones as well, but am less certain."
Given a string whose internal representation is ARABIC ( ENGLISH )
ARABIC, we expect it to be displayed as CIBARA ( ENGLISH ) CIBARA.
However, if it's hard or impossible to distinguish between the Arabic
proper and the brackets, you're likely to end up with
CIBARA ) ENGLISH ( CIBARA.


On 25 September 2014 10:21, demerphq <demerphq@gmail.com> wrote:
> On 25 September 2014 07:17, Karl Williamson <public@khwilliamson.com> wrote:
>>
>> Unicode defines a "script run" to be contiguous characters from the same
>> script, like all Latin or all Greek.
>>
>> These can be important for security.  See
>> http://www.unicode.org/reports/tr36/
>>
>> It seems to me that Perl should offer an easy way to specify that a regex
>> pattern element should match only a script run.  I'm proposing the only
>> current illegal syntax that is easy to type that I'm aware of; other
>> suggestions welcome.
>>
>> The idea I had is to have an extra '*' following the quantifier mean to
>> use a script run.  For example, qr/\w+*/ would match all the consecutive
>> word characters that are in the same script as the first one found.
>>
>
> I like the general idea of being able to do this, but using a quantifier
> modifier to enable it doesn't seem a good fit. Do you have any other
> proposals for implementing it? Some kind of new character class syntax
> maybe? A pseudo POSIX style character class maybe?
>
> Yves
>
>
>
> --
> perl -Mre=debug -e "/just|another|perl|hacker/"
