In thinking about [Perl #116148], "Pattern utf8ness sticks around globally", it seems to me that the regex engine should do some self-protection. The target string should be checked for well-formed UTF-8 once, upon entry. That way we don't have to worry about testing for such things in the middle of matching, where backtracking could cause the same test to be repeated gazillions of times.

I added an assert() to do this, and the test suite hangs; there are a number of assertion failures as well. I haven't debugged anything yet. But I'm thinking that this should probably not be an assertion, but something done in production code to guard the engine against reading off the end of a buffer, etc. I would think the right thing to do is to raise a warning and fail the match when malformed UTF-8 input is encountered.

However, this bug report is about a case where the pattern, not the target, isn't valid UTF-8 (it isn't UTF-8 at all; the engine just thinks it is). Hopefully we have enough control over regex compilation that we generate only valid UTF-8, but this bug indicates that can fail. So I am proposing adding a -DDEBUGGING-only check to the regex engine that, at the start of each match, goes through the pattern and checks each text node for valid UTF-8. I presume this would have caught this bug before release, and production code would not be slowed down.
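
To make the proposal concrete, here is a rough sketch of what a single up-front check could look like. It is not the engine's actual code: sketch_is_valid_utf8(), sketch_check_target() and SKETCH_ASSERT_UTF8() are hypothetical names invented for illustration, and the validator applies strict Unicode well-formedness rules (no overlongs, no surrogates, nothing above U+10FFFF). Perl's internal encoding is more permissive than that, so a real check would presumably need looser rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of Perl's API): returns true if the
 * len bytes starting at s are strictly well-formed UTF-8. It never
 * reads past s + len, even when the final sequence is truncated. */
bool sketch_is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;        /* continuation bytes this lead byte requires */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal at this length */

        if (c < 0x80) {                  /* ASCII: always fine */
            i++;
            continue;
        }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return false;               /* stray continuation or bad lead byte */

        /* Truncated sequence: decoding it would run off the buffer. */
        if (need > len - i - 1)
            return false;

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)     /* not a continuation byte */
                return false;
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min) return false;                      /* overlong form */
        if (cp > 0x10FFFF) return false;                 /* beyond Unicode */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  /* surrogate */

        i += need + 1;
    }
    return true;
}

/* Production-style guard for the target string: warn and report
 * failure so the caller can fail the match instead of proceeding. */
bool sketch_check_target(const unsigned char *s, size_t len)
{
    if (!sketch_is_valid_utf8(s, len)) {
        fprintf(stderr, "Malformed UTF-8 in match target\n");
        return false;
    }
    return true;
}

/* Debug-only guard for pattern text nodes: compiles away entirely
 * unless DEBUGGING is defined, so production builds pay nothing. */
#ifdef DEBUGGING
#  define SKETCH_ASSERT_UTF8(s, len) assert(sketch_is_valid_utf8((s), (len)))
#else
#  define SKETCH_ASSERT_UTF8(s, len) ((void)0)
#endif
```

In this sketch the target check runs once per match and costs a single linear pass, whereas the macro is meant to be applied to each text node of the compiled pattern only in -DDEBUGGING builds. A lone 0xE9 byte (Latin-1 "é" wrongly flagged as UTF-8) is rejected as a truncated three-byte sequence, which is roughly the kind of input the bug report describes.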