On 10/18/2017 06:26 PM, Tony Cook wrote:
> On Wed, Oct 18, 2017 at 11:32:50AM -0600, Karl Williamson wrote:
>> Here is my updated proposal for this, fleshing out what was agreed to at the
>> core hackathon, with other things that have been discussed over the years.
>> See http://nntp.perl.org/group/perl.perl5.porters/220508 and its thread for
>> background.
>>
>> I propose to have
>>
>>     (?{script_run}:...)
>>
>> mean to match the subpattern indicated by "...", but impose the additional
>> constraint that all characters matched must be in the same Unicode script as
>> the first character in the matched sequence is. Certain characters like ':"
>> and "." and combining accents would be considered to be in every script.
>>
>> This prevents mixed-script attacks like the famous one of a link containing
>>
>>     paypal.com
>>
>> where the characters before the 'l' aren't Latin, but Cyrillic, and clicking
>> on that link would lead to a malicious page.
>>
>>     (?{script_run}:\d)+
>>
>> would match any sequence of decimal digits, but all would have to come from
>> the same script.
>
> This looks interesting and useful.
>
>> Abigail has the use where he wants to match any number between 0 and 255,
>> and only those, but in any script, and all from the same script. That's
>> doable in this proposal; something like:
>>
>>     qr/ ( \b
>>
>>         (?{script_run}:[ \p(nv=0} - \p{nv=9} ])
>>        |(?{script_run}:[ \p(nv=1} - \p{nv=9} ] [ \p(nv=0} - \p{nv=9} ])
>>        |(?{script_run}:[ \p(nv=1} - \p{nv=2} ] [ \p(nv=0} - \p{nv=9} ]{1,2})
>>
>>     ) \b /xx;
>
> I do wonder how these ranges would be implemented.
>
> Perhaps something like:
>
>     \p{nv=0-9}
>
> would be simpler to implement and more concise.

I realized after I sent this that this wouldn't work at all; I wasn't
really thinking it through. One would currently have to specify each nv
value separately, and not as a range.
The highest nv value in the third line should be 5, not 9, so as not to
allow values above 255.

> Your ranges already have a meaning, though not an especially useful or
> understandable one:
>
> $ ./perl -Ilib -le 'print "1" =~ /[\p{nv=0}-\p{nv=9}]/ ? "match" : "no match"'
> no match

It warns if warnings are turned on, and was a crazy thing for me to have
said.

>
> Tony
>
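
For anyone who wants to experiment with the 0 to 255 case before any script-run construct exists, here is a minimal sketch under stated assumptions. It does not use the proposed (?{script_run}:...) syntax, which is only a proposal in this thread. It builds the digit class by listing each nv value separately, and it relies on num() from the core Unicode::UCD module, which returns undef for a multi-character string whose decimal digits are not all from the same script. The helper name is_octet_any_script is hypothetical. Comparing the numeric value also avoids having to encode the 255 bound in digit classes.

```perl
use strict;
use warnings;
use Unicode::UCD qw(num);

# One \p{nv=...} per value; a "-" between two \p{} items is not a range.
my $each_nv   = join '', map { "\\p{nv=$_}" } 0 .. 9;
my $nv_0_to_9 = qr/[$each_nv]/;

# Hypothetical helper, not part of the proposal: true when $s is a
# number from 0 to 255, without leading zeros, whose decimal digits
# all come from the same script.
sub is_octet_any_script {
    my ($s) = @_;

    # One to three decimal digits; \d matches digits of any script here.
    return 0 unless $s =~ /\A\d{1,3}\z/;

    # Reject leading zeros such as "007".
    return 0 if length($s) > 1 && $s =~ /\A\p{nv=0}/;

    # num() gives undef when the digits come from more than one script.
    my $value = num($s);
    return 0 unless defined $value;

    return $value <= 255 ? 1 : 0;
}

# ASCII 255, Arabic-Indic 255, ASCII 256, and a mixed-script 255.
for my $candidate ("255", "\x{0662}\x{0665}\x{0665}", "256", "2\x{0665}5") {
    my $shown = join ' ', map { sprintf 'U+%04X', ord } split //, $candidate;
    printf "%s: %s\n", $shown,
        is_octet_any_script($candidate) ? "accepted" : "rejected";
}
```

Note that \p{nv=1} on its own also matches characters that are not decimal digits, which is why the helper filters with \d first.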