On Mon, May 19, 2008 at 01:34:13PM -0700, Glenn Linderman <perl@NevCal.com> wrote:
> The gist of the problem here is that
>
> 1) The "automatic" conversion of 8-bit to UTF-8 "assumed" Latin1 because
> it was (a) easy numerically (b) worked well on platforms that use Latin1
> as their native encoding.

Which platform is that? I really don't know *any* such platform.

Note also that the automatic conversion in perl doesn't assume any encoding
*at all*, so this is simply not true.

> 2) Windows assumes ANSI code page for 8-bit data, but Perl on Windows,
> for quite a few releases now, has not... instead, it "assumes" Latin1
> when "automatically" converting 8-bit to UTF-8.

This is not what happens. Perl simply does not assume any encoding. If you
have an 8-bit filename encoded in latin1 then perl doesn't treat it any
different than an 8-bit filename encoded in koi8-r (another "ANSI" encoding).

upgrading and downgrading doesn't change that, or at least shouldn't change
that. where it does, it affects unix as much as any other platform.

> Retrofitting Perl on Windows to assume 8-bit data is ANSI will break all
> code that attempts to work with the constraints of 1 and 2.

This would probably be true if 1) and 2) were real, but they are not.

> somewhat lower performance than assuming Latin1. And it would possibly
> have prevented, by example of a widely-used platform, the assumption
> throughout lots of Perl code, that all 8-bit data is assumed to be
> Latin1 implicitly.

Perl doesn't do that anywhere on any platform, to my knowledge.

Make an example of a platform that expects filenames as latin1. (you can
select this under unix, yes, but you can do so under windows as well).

(the rest of the mail is either true, or depends on these critical but wrong
assumptions. It is still use that decodes encoding).

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\