Dr.Ruud wrote on 2007-02-05 23:09 (+0100):
> perl -wle '
>   $s = substr "\x{100}\xFF", 1;
>   print length $s, ":", unpack "H*", $s;
> '

Note that normally, unpack "H*" on a Unicode string (like your $s)
violates the proper separation between characters and the bytes used to
store them. Of course, it's warranted when you really want to demonstrate
the internals, as you do here. I'm just commenting for clarity.

My preferred way to show the internals is Devel::Peek's Dump:

    use Devel::Peek;
    $s = substr "\x{100}\xFF", 1;
    Dump $s;

Output:

    SV = PV(0x8149ae8) at 0x8149624
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x816a920 "\303\277"\0 [UTF8 "\x{ff}"]
      CUR = 2
      LEN = 4

This shows both the individual bytes of the internal byte string, \303
and \277 (plus the trailing \0), and the character in the Unicode
string, \x{ff}.

Note that CUR is the length *in bytes*. After all, we're dumping the
*internal* values. For better performance, the length in characters isn't
calculated until it's needed. Once it has been calculated, it is cached:

    # after length($s)
    SV = PVMG(0x8163bb0) at 0x8149624
      REFCNT = 1
      FLAGS = (SMG,POK,pPOK,UTF8)
      IV = 0
      NV = 0
      PV = 0x816a920 "\303\277"\0 [UTF8 "\x{ff}"]
      CUR = 2
      LEN = 4
      MAGIC = 0x816a8a0
        MG_VIRTUAL = &PL_vtbl_utf8
        MG_TYPE = PERL_MAGIC_utf8(w)
        MG_LEN = 1

MG_LEN is the length in characters.
-- 
kind regards,
juerd waalboer:  perl hacker  <juerd@juerd.nl>  <http://juerd.nl/sig>
convolution:     ict solutions and consultancy <sales@convolution.nl>

I don't trust voting computers. See <http://www.wijvertrouwenstemcomputersniet.nl/>.
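
P.S. To see the difference between the byte length (CUR) and the character length (MG_LEN) without reading a Dump, a minimal sketch like the following may help. It assumes the core bytes module is available; bytes::length is used here only to peek at the internal byte count and is not something you would normally want in real code. The variable names $chars and $bytes are mine, purely for illustration.

```perl
use strict;
use warnings;
use bytes ();   # load bytes::length without enabling the pragma

# Same string as above: one character, U+00FF, carrying the UTF8 flag.
my $s = substr "\x{100}\xFF", 1;

my $chars = length $s;          # length in characters (what MG_LEN caches)
my $bytes = bytes::length($s);  # length of the internal byte string (CUR)

# utf8::is_utf8 reports whether the internal UTF8 flag is set.
print "chars=$chars bytes=$bytes utf8=", (utf8::is_utf8($s) ? 1 : 0), "\n";
# prints: chars=1 bytes=2 utf8=1
```

The point is only that length() counts characters while the internal buffer holds two bytes, which matches CUR = 2 and MG_LEN = 1 in the dumps.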