On 2005-04-11 15:40, "gcomnz" <gcomnz@gmail.com> wrote:

> "日本語".chars would return <日　本　語>, which can probably be expressed with UTF8?

The string "日本語" is probably represented internally as UTF-8, but that should have no effect on what .chars returns, which should, indeed, be <日　本　語>, that is, an array whose elements are strings which each represent one Unicode code point, irrespective of encoding.

I think that, in general, at the level of Perl code, one "character" should be one code point, and any higher-level support for combining and splitting should be outside the core, in Unicode::Whatever.
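
A minimal sketch of the distinction drawn above, written in Python rather than Perl 6. That choice is an assumption made purely for illustration: Python strings already behave as sequences of code points, so they model the proposed .chars semantics without claiming anything about how Perl 6 implements it. The variable names are hypothetical, and unicodedata stands in loosely for the "Unicode::Whatever" layer.

```python
import unicodedata

# Illustration only: a Python str is a sequence of code points, which matches
# the "one character = one code point" view argued for above.
text = "日本語"

# Splitting by code point; this is the analogue of the proposed .chars result.
chars = list(text)

# Byte length depends on the encoding chosen for storage...
utf8_len = len(text.encode("utf-8"))
utf16_len = len(text.encode("utf-16-le"))

# ...but the code-point list is the same after a round trip through any of them.
for encoding in ("utf-8", "utf-16-le", "big5"):
    roundtrip = text.encode(encoding).decode(encoding)
    assert list(roundtrip) == chars

# At this level a combining sequence stays split: "e" + U+0301 is two code points.
combined = "e\u0301"
code_points = list(combined)

# Combining is left to a separate, higher-level library (here: NFC composition).
composed = unicodedata.normalize("NFC", combined)

print(chars, utf8_len, utf16_len, len(code_points), len(composed))
```

The core string type here only knows about code points. Anything that treats "e" plus a combining accent as a single unit is done by an explicit call into a separate module, which is the division of labour the message argues for.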