On Mon, Feb 19, 2001 at 07:14:07PM -0600, Jarkko Hietaniemi wrote:
> Protocols: if all I know is that my output is 500 Unicode characters
> long, how am I to print out Content-Length?

As I said to Abigail, I would love a concrete explanation of what you
have in mind. In particular, what is your mechanism for ensuring that
perl is representing $output as utf8?

Let me show you what I would fancy (modulo syntax, which I haven't been
following):

    $eh = new EncodingHandler 'UTF-8';
    $out = new IO::Socket { output_discipline => $eh->output_discipline, ... };
    print $out "Content-length: " . $eh->length($output);
    print $out $output;

Let me also (horror of horrors[1]) tell you what you would probably do
in Java:

    OutputStream o;
    String output;
    byte[] output_bytes = output.getBytes("UTF-8");
    String header = "Content-length: " + output_bytes.length + "\n\n";
    o.write(header.getBytes("UTF-8"));
    o.write(output_bytes);

(Note my imagined Perl interface didn't require converting the whole
string to utf8 at once.)

> If I have a scalar which according to length() is 10E7 Unicode characters,
> will it fit within my disk quota of which I have 20E7 bytes left?

Again, it depends on the output discipline you will use to get it on
disk, and thus should be part of whatever library you use for output
disciplines. Why do you think it should be otherwise?

> Any encoding which hasn't yet been encoded in Encode?

In that case, how did it ever get internally represented as utf8? I
would expect in this case that bytes of the string would end up as Perl
characters, just like with non-Unicode perl.

Andrew

[1] Contrary to what you might guess, I mean that.
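
For anyone who wants to run the Java fragment above, here is a minimal
self-contained sketch of the same idea: encode the body first, then
report the byte count rather than the character count. The class name
ContentLengthSketch, the send helper, and the sample string are made up
for illustration, and an in-memory stream stands in for a real socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper around the fragment in the message; not a real API.
final class ContentLengthSketch {

    // Encode the body once, then use the encoded byte count for the header.
    public static void send(OutputStream o, String output) throws IOException {
        byte[] outputBytes = output.getBytes(StandardCharsets.UTF_8);
        String header = "Content-length: " + outputBytes.length + "\n\n";
        o.write(header.getBytes(StandardCharsets.UTF_8));
        o.write(outputBytes);
    }

    public static void main(String[] args) throws IOException {
        // Sample text with one non-ASCII character (i with diaeresis).
        String output = "na\u00efve";
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        send(sink, output);

        // The character count and the UTF-8 byte count differ.
        System.out.println("characters: " + output.length());
        System.out.println("UTF-8 bytes: "
                + output.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

The point is only that the length which goes on the wire belongs to the
encoded form, so whatever does the encoding is the natural place to ask
for it.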