Based on feedback, here's a revised proposal:

No layer allows in syntactically malformed utf8.

:strict_utf8 allows in only what Unicode says is interchangeable.

:safe_utf8 (or maybe :portable_utf8) allows the above plus above-unicode
code points up to those that begin with 0xfe. It's said that 0xfe and 0xff
can start looking like utf16, although I don't fully understand the whole
thing. If we accepted 0xfe and not 0xff we still wouldn't ever accept a
misconstrued BOM, but accepting 0xfe goes beyond what a U32 can hold, and
so is non-portable. Another possibility is for this option to accept only
up to what a U32 can hold.

:unsafe_utf8 (or :non_portable_utf8) allows in surrogates, noncharacter
code points, and all above-unicode code points that don't overflow the
platform's UV.

:utf8 is aliased to :safe_utf8. I'm with zefram that the easiest thing to
do should not allow attack possibilities.

:no_surrogates prohibits surrogates.
:no_above_unicode prohibits above-unicode code points.
:no_nonchars prohibits non-character code points.

I believe this gives the orthogonality that xdg wants; better name
suggestions welcome.
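
To make the acceptance rules concrete, here is a rough sketch of how the proposed layers would treat a decoded code point. It is written in Python purely for illustration and is not Perl API. The function names, the treatment of the :no_* options as a list of modifiers, the 64-bit UV default, and the choice of the U32 variant as the :safe_utf8 ceiling are all my assumptions. Pass a larger safe_limit to model the variant that accepts sequences starting with 0xfe.

```python
UNICODE_MAX = 0x10FFFF
U32_MAX = 0xFFFFFFFF

# Hypothetical names throughout; only the layer names come from the proposal.
ALIASES = {":utf8": ":safe_utf8"}
LAYERS = {":strict_utf8", ":safe_utf8", ":unsafe_utf8"}
BLOCKED_BY = {
    "surrogate": ":no_surrogates",
    "above_unicode": ":no_above_unicode",
    "nonchar": ":no_nonchars",
}


def classify(cp):
    """Return which problematic class a code point falls in, if any."""
    if 0xD800 <= cp <= 0xDFFF:
        return "surrogate"
    if cp > UNICODE_MAX:
        return "above_unicode"
    # U+FDD0..U+FDEF plus the last two code points of every plane.
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "nonchar"
    return "ordinary"


def accepts(cp, layer, modifiers=(), safe_limit=U32_MAX, uv_max=2**64 - 1):
    """Would the given layer (plus :no_* modifiers) let this code point in?

    safe_limit is the assumed ceiling for :safe_utf8; uv_max stands in for
    the platform's UV, assumed here to be 64 bits wide.
    """
    layer = ALIASES.get(layer, layer)
    if layer not in LAYERS:
        raise ValueError("unknown layer: " + layer)
    if cp < 0 or cp > uv_max:
        return False  # would overflow the UV; no layer accepts it
    kind = classify(cp)
    if kind == "ordinary":
        return True
    if BLOCKED_BY[kind] in modifiers:
        return False
    if layer == ":strict_utf8":
        return False
    if layer == ":safe_utf8":
        return kind == "above_unicode" and cp <= safe_limit
    return True  # :unsafe_utf8
```

For example, accepts(0x110000, ":utf8") is true under this reading, while accepts(0x110000, ":utf8", modifiers=(":no_above_unicode",)) is false, which is the kind of orthogonality the :no_* options are meant to provide.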