perl.perl5.porters

RFC: seeking syntax for allowing script run pattern matching

Karl Williamson
July 6, 2016 19:45
A script run is a sequence of characters, all from the same script, such 
as Latin or Greek.  In applications that need to care about security, 
script runs are important so that someone can't, say, substitute a 
look-alike Cyrillic letter for a Latin one.  'scope' looks pretty much 
identical in Macedonian Cyrillic as it does in English Latin.  'paypal' 
is the most famous case, as every letter but the 'l' has a Cyrillic 
look-alike.

It would be good to allow a program to specify that it wants a script 
run, so as to automatically avoid such security holes.
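
To make the idea concrete, here is a rough sketch in Python (not Perl, and not a proposal for the regex syntax) of what a script-run check does. It assumes the first word of a character's Unicode name is a good enough stand-in for its script, because Python's standard library does not expose the Script property. The helper names rough_script and is_single_script are made up for this illustration.

```python
import unicodedata


def rough_script(ch):
    """Approximate a character's script by the first word of its Unicode name.

    ASSUMPTION for illustration only: this is not the real Unicode Script
    property. It works for letters such as 'LATIN SMALL LETTER S' or
    'CYRILLIC SMALL LETTER ES', but not for digits or punctuation.
    """
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def is_single_script(text):
    """Return True if all characters map to the same rough script."""
    return len({rough_script(ch) for ch in text}) <= 1


latin_scope = "scope"
# Cyrillic DZE, ES, O, ER, IE: renders much like Latin "scope"
cyrillic_scope = "\u0455\u0441\u043e\u0440\u0435"
# Latin "paypal" with the first 'a' replaced by Cyrillic A (U+0430)
spoofed_paypal = "p\u0430ypal"

print(latin_scope == cyrillic_scope)     # the two strings are different
print(is_single_script(latin_scope))     # all Latin
print(is_single_script(cyrillic_scope))  # all Cyrillic
print(is_single_script(spoofed_paypal))  # mixed Latin and Cyrillic
```

A real implementation would use the actual Script property and would have to decide how to treat characters that are shared between scripts, such as digits and punctuation; this sketch ignores that.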

It only makes sense if the match can be multiple input characters. 
Therefore, I thought modifying the quantifier to indicate a run would be 


are currently syntax errors, and so using '*' after the modifier would 
be a candidate.  But Lukas pointed out that many times a quantifier 
could never mean a script run.  What would


mean?  If the '*' is ignored in such a case, should it warn?

We could use something like {sr} (standing for "script run") instead of *.


But this has the same problem.  Or we could have it be this:


But then \w{sr} without a quantifier doesn't mean anything.  Or Andrew 
Rodland suggested


I'm looking for some more ideas.
