A Unicode non-character is one of 66 code points (ordinals) that Unicode has reserved so that they will never be assigned to a character. A non-Unicode character is a code point above 0x10FFFF; Unicode says it will never use those, so they are, and always will be, outside the Unicode standard. For clarity, I will call them non-Unicode code points.

A non-character is illegal for interchange, but one is free to use it internally in an application. Note that an application can be any number of cooperating processes, so these code points are usable in I/O. Unicode doesn't like anyone using a non-Unicode code point, but Perl accepts them.

The problem, it seems to me, is that Perl treats these two classes of code point differently, and I am trying to reconcile that behavior. It seems to me that they should have rough parity.

When converting from code point to utf8, there is parity: use of either class raises a warning, but there is a flag for each that turns off the corresponding warning. The difference comes when going from utf8 to ordinal. The non-Unicode code points are accepted unconditionally, without any warnings ever. The non-character code points are treated as malformed utf8 and, unless the flag is set to allow them, will cause Perl to throw up its hands.

This just seems wrong to me. Neither is more malformed than the other. Neither should be used for interchange with unsuspecting applications, but both should be usable within a set of cooperating applications. Yet Perl treats worst the ones that Unicode likes best.

I suspect that this behavior stems from early Unicode documentation, which called the non-characters "illegal characters", but that is not what they mean now. (Their provenance is based on potential big-endian vs little-endian confusion, and on trying to make it a little easier to process Unicode in 16-bit word chunks.)

I'm curious if anyone has some ideas on this.
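
For reference, here is a minimal sketch of how the two classes described above can be told apart by code point value alone. It is written in Python purely for illustration and says nothing about how Perl implements its checks; the function names are made up for this example. The ranges are the ones the Unicode Standard lists for non-characters: U+FDD0..U+FDEF, plus the last two code points of each of the 17 planes.

```python
# Illustrative sketch only: hypothetical helper names, not Perl internals.

MAX_UNICODE = 0x10FFFF  # highest code point defined by Unicode


def is_noncharacter(cp: int) -> bool:
    """True for the 66 code points Unicode reserves as non-characters."""
    if 0xFDD0 <= cp <= 0xFDEF:
        # The contiguous block of 32 non-characters in the BMP.
        return True
    # U+nFFFE and U+nFFFF for each plane n in 0..16 (34 code points).
    return 0 <= cp <= MAX_UNICODE and (cp & 0xFFFE) == 0xFFFE


def is_above_unicode(cp: int) -> bool:
    """True for what the post calls non-Unicode code points."""
    return cp > MAX_UNICODE


def classify(cp: int) -> str:
    """Label a code point with one of the two classes, or 'other'."""
    if is_above_unicode(cp):
        return "non-Unicode code point"
    if is_noncharacter(cp):
        return "non-character"
    # Everything else, including surrogates, which this sketch
    # does not try to separate out.
    return "other"


def count_noncharacters() -> int:
    """Count non-characters by brute force over the whole Unicode range."""
    return sum(1 for cp in range(MAX_UNICODE + 1) if is_noncharacter(cp))


if __name__ == "__main__":
    for cp in (0x41, 0xFDD0, 0xFFFF, 0x10FFFE, 0x110000):
        print(f"{cp:#x}: {classify(cp)}")
    print("non-characters:", count_noncharacters())
```

The point of the sketch is that the two classes are disjoint and equally cheap to detect, so nothing in the test itself forces a decoder to treat one as a malformation and the other as always acceptable.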