On Mar 30, 2007, at 2:25 PM, Juerd Waalboer wrote:

>> That so many users, including those as expert as Marc, possess a
>> "broken" understanding of Perl's Unicode model suggests a flawed
>> design.
>
> I think the design is solid, but the implementation (see regex)
> slightly broken and documentation wildly misleading.

I strongly disagree with this assessment. In particular, I think
insisting that the user be responsible for manually segregating
character and byte-oriented data without any help from Perl is totally
unreasonable. Look at how easily Marc made the "mistake" of
commingling the two types of data. It's debatable whether the fact
that Perl allowed him to do that without complaint is a flaw with the
design or the implementation, but it's one or the other and it's
serious.

Additionally, as Marc points out, there are lots of broken XS modules
out there -- including one of mine. (KinoSearch 0.15 -- Unicode
support is fixed as of 0.20_01, which breaks backwards compatibility.)
Few or none of them would be broken if Perl made it more difficult to
move between character data and byte-oriented data -- errors would be
flying right and left and the broken modules would get fixed right
away. Of course I understand why that cannot be the case, but it's
astonishing to me that you see this as a problem which can be solved
via documentation.

I hope that Perl 6 does not opt to replicate Perl 5's behavior in this
area (my understanding is that it will not, but I'm not following
development closely). I hope that project is taking into account the
lessons we have learned in the wake of very difficult compromises
about how to balance the addition of Unicode with preserving backwards
compatibility.

> Surely you must know a way in which Perl's unicode support can be
> improved, or accidents avoided, without trying to change all of Perl,
> CPAN, and a gazillion lines of code that we can't even reach. Let's
> hear it! :)

How about encouraging the use of encoding::warnings in perlunitut?
How about adding it to core and having 'use 5.10;' turn it on?

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/