* Eric Brine <ikegami@adaelis.com> [2011-09-19 03:20]:
> File names are meant to be read as text, so one can't really claim
> they're just octet sequences. So the real question is what should we
> do when readdir encounters a file name that doesn't cleanly decode
> using the encoding it's expected to be encoded with (e.g. a file name
> that's not valid UTF-8 on a box with a UTF-8 locale).

One could take a page from Python here and use its surrogate escape
error handling. There was a subthread about it a while ago:

http://www.nntp.perl.org/group/perl.perl5.porters/;msgid=A8767ACF-E6A0-498A-B402-54A12D26523B@activestate.com

What this approach effectively does is allow strings to unambiguously
represent a mixture of bytes and characters, which in a roundabout way
essentially solves the problem that Perl only has a single string type.

But do note the later message about the security implications. It will
take some thought to get this clean, but there is a lot of potential in
it.

I love the idea and it is one of my todos to add this to Encode should
no one else get there first. The core could then use this method to
provide clean and nice interfaces to any OS APIs which are textual in
intent but binary in practice – as Python does. It would be a major
step forward for Perl.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
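
For readers unfamiliar with the Python behaviour referred to above, here
is a minimal sketch of its "surrogateescape" error handler. The file
name bytes and the helper strict_encode_fails are made up for
illustration; this shows Python's handler only, not a proposed Perl or
Encode interface.

```python
# Hypothetical file name as raw bytes: ASCII "caf", then a lone Latin-1
# byte 0xE9 (not valid UTF-8 by itself), then ".txt".
raw_name = b"caf\xe9.txt"

# With errors="surrogateescape", each undecodable byte 0xNN becomes the
# lone surrogate U+DC00 + 0xNN instead of raising UnicodeDecodeError.
text_name = raw_name.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the escapes back into the
# original bytes, so the round trip is lossless.
round_tripped = text_name.encode("utf-8", errors="surrogateescape")


def strict_encode_fails(s):
    """Return True if s cannot be encoded as strict UTF-8."""
    try:
        s.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


print(repr(text_name))
print(round_tripped == raw_name)
print(strict_encode_fails(text_name))
```

The decoded string carries the odd byte as a lone surrogate, so it is
not well-formed Unicode and a strict encode rejects it. Code that passes
such strings on has to handle them deliberately.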