On 08/15/2017 02:01 PM, Bo Lindbergh wrote:
> Quoth Karl Williamson:
>> I concede that there are encodings that do use the 80-9F range, and
>> these could be wrongly guessed. The most likely one still in common
>> use is CP 1252. I did try once to create a string that made sense in
>> both encodings, and I did succeed, but it was quite hard for me to
>> do, and was very short; much shorter than an error message.
>
> Actual, non-synthetic example:
> https://en.wikipedia.org/wiki/Muvrar%C3%A1%C5%A1%C5%A1a
>
> The name "Muvrarášša" can be encoded in Windows-1252 as the octets
> (hex) 4D 75 76 72 61 72 E1 9A 9A 61
> which is also the correct UTF-8 encoding of the string "Muvrarᚚa",
> where the next-to-last character is U+169A OGHAM LETTER PEITH.
>
>
> /Bo Lindbergh
>

I'm curious how you found this? (This particular example could be
solved by realizing that Ogham is not a script likely to be
represented in 1252.)
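For anyone who wants to check the octets quoted above, here is a minimal Python sketch (not from the thread). It decodes the same bytes as both Windows-1252 and UTF-8, then applies a crude version of the "unlikely script" idea. The helper name `non_latin_letters` is hypothetical, and testing whether a character's Unicode name starts with "LATIN" is only an assumed stand-in for a real script lookup, which Python's standard library does not provide.

```python
import unicodedata

# Octets quoted in the message above (hex).
OCTETS = bytes.fromhex("4D 75 76 72 61 72 E1 9A 9A 61")


def decode_both(octets):
    """Return the (Windows-1252, UTF-8) readings of the same octets."""
    return octets.decode("cp1252"), octets.decode("utf-8")


def non_latin_letters(text):
    """Hypothetical helper: Unicode names of non-ASCII letters whose name
    does not start with "LATIN". A rough proxy for a script check."""
    names = []
    for ch in text:
        if ord(ch) > 0x7F and ch.isalpha():
            name = unicodedata.name(ch, "UNNAMED")
            if not name.startswith("LATIN"):
                names.append(name)
    return names


as_1252, as_utf8 = decode_both(OCTETS)

# Both decodes succeed, so validity alone cannot pick the encoding.
print(non_latin_letters(as_1252))  # []
print(non_latin_letters(as_utf8))  # ['OGHAM LETTER PEITH']
```

A guesser built along these lines would prefer the 1252 reading here because the UTF-8 reading mixes a lone Ogham letter into otherwise Latin text. That is a heuristic only, and it would misfire on text that legitimately mixes scripts.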