Unicode properties are defined only for code points 0 through 0x10FFFF, yet Perl allows the expression of code points up to UV_MAX, a much larger number. That means the Unicode behavior for the remainder is undefined. We have chosen to make that behavior be as described in the warning. I believe I was the one who came up with the warning, and that it was introduced in Perl v5.16. (I don't have the tuits/energy to research that fully right now, and I believe it is tangential to this post anyway.)

Most code will never deal with such large code points, and hence will never encounter the warning. But if it does happen, the warning will likely be displayed many times, even within a single regex match, when backtracking occurs over the large code point(s).

There are also bugs in the implementation. It was quite wrong in Perl v5.16, as Tom Christiansen discovered, and was largely fixed for v5.18. But the warning still isn't displayed if the regex node that contains the \p{} or \P{} is optimized into something other than the normal one, and it can be displayed twice for the same code point even when there is no backtracking: the first time for the regex optimizer's synthetic start class, before regular matching begins.

I have been thinking about what to do. One potential solution is to make this a once-only (per thread) message. Displaying it would set a per-interpreter variable that prevents it from ever being displayed again on the current thread.

But then it occurred to me: is this message really necessary? Perhaps we should just get rid of it altogether, and make sure the pod documentation is very clear about this possibility for the very rare program that is affected.

Thoughts?
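
To make the once-only idea concrete, here is a rough user-level sketch of the same logic using a __WARN__ handler. It is not the core change itself, which would live in the regex engine. The names make_once_only_handler and $NON_UNICODE_WARNING are made up for illustration, and the pattern used to recognize the message is an assumption about its wording, which differs between perl versions.

```perl
use strict;
use warnings;

# Assumption: the message contains "not Unicode" or "non-Unicode".
# This is a crude match; adjust it to the wording your perl emits.
my $NON_UNICODE_WARNING = qr/not Unicode|non-Unicode/;

# Hypothetical helper: build a __WARN__ handler that hands the
# non-Unicode warning to $sink only the first time it is seen.
# Every other warning is passed through unchanged.
sub make_once_only_handler {
    my ($sink) = @_;
    my $already_warned = 0;    # stands in for the per-interpreter variable
    return sub {
        my ($msg) = @_;
        if ($msg =~ $NON_UNICODE_WARNING) {
            return if $already_warned;    # shown once already: stay quiet
            $already_warned = 1;
        }
        $sink->($msg);
    };
}

{
    # Install the handler only for this block.
    local $SIG{__WARN__} = make_once_only_handler(sub { print STDERR $_[0] });

    my $str = chr(0x110000) x 3;               # three above-Unicode code points
    my $ok  = $str =~ /\P{Alpha}+\p{Alpha}/;   # the match has to backtrack
    # Whether perl warns here at all, and how many times it would have
    # without the handler, depends on the perl version and on how the
    # pattern gets optimized.
}
```

The flag lives in the closure, so each handler built this way keeps its own state, much as a per-interpreter variable would keep separate state for each thread.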