perl.perl5.porters | Postings from December 2014

regnodes for won't match unless target string is UTF-8?

From: Karl Williamson
Date: December 26, 2014 23:22
Subject: regnodes for won't match unless target string is UTF-8?
Message ID: 549DEDB3.7000006@khwilliamson.com

Consider

qr/foo...bar\x{100}/

This cannot match a target string that isn't in UTF-8, because \x{100}
is above 255 and so cannot appear in a non-UTF-8 string.  We can
determine this at pattern compilation time.  We could introduce new
EXACTish nodes that indicate that no match is possible for that node if
the target isn't UTF-8.
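
To illustrate the compile-time check, here is a minimal sketch in C.  It
is not code from the Perl core: the names node_kind, NODE_EXACT,
NODE_EXACT_UTF8_ONLY and classify_literal are all hypothetical, and the
literal is assumed to be available as an array of code points.  It
covers only the non-folding case; under case-insensitive matching some
code points above 255 fold to ones at or below 255, so a real
implementation would need more care there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node kinds for illustration; these are not the names
 * used by the Perl regex compiler. */
typedef enum {
    NODE_EXACT,           /* literal could match a target in either encoding */
    NODE_EXACT_UTF8_ONLY  /* literal can match only a UTF-8 target */
} node_kind;

/* Classify a non-folding literal, given as an array of code points.
 * A non-UTF-8 target holds only code points 0..255, so a literal that
 * contains anything above 255 can never match such a target. */
static node_kind classify_literal(const uint32_t *cps, size_t len)
{
    assert(cps != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        if (cps[i] > 0xFF)
            return NODE_EXACT_UTF8_ONLY;
    }
    return NODE_EXACT;
}
```

For the pattern above, the literal "bar\x{100}" would be classified as
NODE_EXACT_UTF8_ONLY, while "foo" would stay a plain NODE_EXACT.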

At execution time, an attempted match of such a node could be rejected
immediately if the target string isn't UTF-8, instead of having to test
each byte in the node against the corresponding byte in the target until
we reach the first mismatch (potentially having to fold them on the way).
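
The early rejection might look roughly like the following C sketch.
Again, this is not the Perl core's code: exact_node, target,
requires_utf8 and match_exact are hypothetical names.  To keep it short,
the sketch assumes that the literal and the target are in the same
encoding whenever the byte comparison is reached, which sidesteps the
transcoding and folding a real engine has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical literal node; requires_utf8 would be set at pattern
 * compilation time. */
typedef struct {
    const unsigned char *bytes;  /* the literal's bytes */
    size_t len;
    bool requires_utf8;          /* no match possible unless target is UTF-8 */
} exact_node;

/* Hypothetical target string with its encoding flag. */
typedef struct {
    const unsigned char *bytes;
    size_t len;
    bool is_utf8;
} target;

/* Return true if the node matches the target at byte offset pos. */
static bool match_exact(const exact_node *node, const target *t, size_t pos)
{
    assert(node != NULL && t != NULL);

    /* Fast path: reject without looking at a single byte. */
    if (node->requires_utf8 && !t->is_utf8)
        return false;

    /* Not enough bytes left in the target for the literal. */
    if (pos > t->len || node->len > t->len - pos)
        return false;

    /* Slow path: assumes both sides are in the same encoding. */
    return memcmp(node->bytes, t->bytes + pos, node->len) == 0;
}
```

The design point is that the flag test costs one branch per attempted
match, whereas the slow path may compare several bytes at every
candidate position before it fails.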

This would be a big win if cases like this are common in the real world.
Otherwise, it's probably not worth it.

This is a pretty trivial change, and we are not running out of available 
node-types.  I'm unsure of its potential value.
