From: Zefram [mailto:zefram@fysh.org]

> >Am I correct in assuming that I can't automatically (and safely)
> >determine whether the $name parameter is a candidate for storing as UTF8?
>
> In principle anything is a candidate for storing as UTF-8. If your choice
> is between storing as Latin-1 and storing as UTF-8, a more sensible
> question is whether you *need* to store as UTF-8 (because the string
> can't be represented as Latin-1). You can test this with /[^\x00-\xff]/.

Correct - I only want to bother with encoding if it is actually needed - UTF8 support in Zip files is a relatively recent addition and it does mean extra bloat to the file created. I don't want to do it if it isn't necessary.

> You might alternatively decide to UTF-8-encode anything that's not pure
> ASCII, which you can similarly test with /[^\x00-\x7f]/.

The /[^\x00-\xff]/ suggestion sounds like it should be a sure-fire way to say that the string does contain utf8, but the opposite obviously isn't going to be true. That means I'm going to have to add an option to allow the user to explicitly flag that the filename should be encoded in utf8 before it gets written to the file.

Paul
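
A minimal sketch of the two tests quoted above, combined with the explicit user override described in the reply. The sub names (needs_utf8, is_non_ascii, should_encode_utf8) and the $force_utf8 argument are hypothetical and are not taken from any existing module. The sketch assumes $name holds a string of characters; if it holds bytes that are already UTF-8 encoded, neither regex can detect that, which is why the override is there.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name contains a character above 0xFF,
# i.e. it cannot be represented as Latin-1 and has to be UTF-8 encoded.
sub needs_utf8 {
    my ($name) = @_;
    return $name =~ /[^\x00-\xff]/ ? 1 : 0;
}

# Hypothetical helper: true if $name contains anything outside 7-bit ASCII.
sub is_non_ascii {
    my ($name) = @_;
    return $name =~ /[^\x00-\x7f]/ ? 1 : 0;
}

# Hypothetical policy: an explicit request from the user always wins;
# otherwise encode only when Latin-1 cannot hold the name.
sub should_encode_utf8 {
    my ($name, $force_utf8) = @_;
    return 1 if $force_utf8;
    return needs_utf8($name);
}
```

With these definitions, should_encode_utf8("caf\x{e9}.txt", 0) is false because U+00E9 fits in Latin-1, should_encode_utf8("\x{263A}.txt", 0) is true, and passing a true second argument forces UTF-8 for any name.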