Just as a data point, the current regex engine does a block memory
comparison for an exact string if:

* the string and pattern have the same UTF8-ness, and
* the match is case-sensitive

but does character-by-character matching otherwise; i.e.

fast:
    "X" =~ /XYZ/;
    "\x{100}" =~ /\x{100}\x{101}\x{102}/;

slow:
    "\x{100}" =~ /XYZ/;
    "X" =~ /\x{100}\x{101}\x{102}/;
    "anything" =~ /anything/i;

(Arguably a pattern should store both plain and utf8 versions of each
exact string for quicker matching.)
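
For anyone wanting to check which bucket a given match falls into, here is a rough sketch of the rule described above. It is only a model, not the engine's actual code: uses_block_compare() is a made-up helper, and it assumes the UTF8 flag on the literal text is a fair proxy for the UTF8-ness of the compiled pattern.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl): models the rule described above.
# $literal stands in for the exact string inside the pattern; its UTF8 flag
# is assumed to mirror the UTF8-ness of the compiled pattern.
sub uses_block_compare {
    my ($string, $literal, $case_insensitive) = @_;
    my $string_utf8  = utf8::is_utf8($string)  ? 1 : 0;
    my $literal_utf8 = utf8::is_utf8($literal) ? 1 : 0;
    # Block comparison only when UTF8-ness agrees and the match is case-sensitive.
    return ($string_utf8 == $literal_utf8 && !$case_insensitive) ? 1 : 0;
}

# The five cases from the post: [ string, literal in pattern, /i flag ]
my @cases = (
    [ "X",        "XYZ",                   0 ],
    [ "\x{100}",  "\x{100}\x{101}\x{102}", 0 ],
    [ "\x{100}",  "XYZ",                   0 ],
    [ "X",        "\x{100}\x{101}\x{102}", 0 ],
    [ "anything", "anything",              1 ],
);

my @results = map { uses_block_compare(@$_) } @cases;
print join(",", @results), "\n";   # prints 1,1,0,0,0
```

The first two cases come out as 1 (block comparison) and the last three as 0 (character by character), which matches the fast/slow split above.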