Gisle Aas writes:
: This patch relative to 5.5.650 makes perl do the right thing for
: literals containing hibit characters.  The following behaviour will
: change if you apply this patch:
:
: - a \x{} escape will not force the UTF8 flag on, unless the value
:   is actually higher than \xFF.

Good.

: - the "\xff will produce malformed UTF-8 character; use \x{ff}"
:   warning is gone, since we now always do the right thing :-)

Good.

: - under 'use utf8', hibit chars that are illegal utf8 are encoded
:   using utf8; basically automatically turns latin1 into utf8.
:   This ensures that there will never be illegal UTF8 sequences in
:   a literal string that has the UTF8 flag set.

I know I originally put in the comment, "could cvt latin-1 to utf8 here",
but I'm currently thinking that if a file has utf8 mixed with latin-1,
it's probably already in serious trouble by the time it gets to the
latin-1, so it had probably better croak.  Especially if the filehandle
was implicitly put into utf8 mode by thinking it saw utf8 earlier, when
in fact it only saw bizarre latin-1.  The better approach is to make
them go back and insert "use charset 'latin-1'" or some such at the
beginning.

: - Octal escapes like \400 and \777 will actually do the right thing now.
:   Previously you only got the low 8-bits.

Hmm.  An argument could be made that those should be illegal, though I
don't know that I want to make it.

: But, it still looks like the \N{} support will not work as it is
: now.  It never sets the UTF8 flag on the string by itself.

Well, it should resolve to a character that's either above \xFF or not,
so it seems conceptually simple.  But I have to confess to not
understanding the \N code at all:

    print "\N{WHITE SMILING FACE}";

produces

    constant(\N{...}): %^H is not localized at - line 2, within string

Talk about obscure error messages!  I think it means that \N will need
to be taught about pulling in the Unicode names by default.
Previously, I think it assumed the Unicode names would come in with a
"use utf8", but that's going away, so we need to make it the default if
\N doesn't otherwise recognize its name, I imagine.

But thanks!  It's easy to sit on the sidelines and carp, but we need
more real code whackers like you.

Larry