-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Moin,

On Friday 30 March 2007 23:06:47 Marvin Humphrey wrote:
> On Mar 30, 2007, at 2:25 PM, Juerd Waalboer wrote:
> >> That so many users, including those as expert as Marc, possess a
> >> "broken" understanding of Perl's Unicode model suggests a flawed
> >> design.
> > I think the design is solid, but the implementation (see regex)
> > slightly broken and documentation wildly misleading.
>
> I strongly disagree with this assessment. In particular, I think
> insisting that the user be responsible for manually segregating
> character and byte-oriented data without any help from Perl is
> totally unreasonable.
>
> Look at how easily Marc made the "mistake" of commingling the two
> types of data. It's debatable whether the fact that Perl allowed him
> to do that without complaint is a flaw with the design or the
> implementation, but it's one or the other and it's serious.
>
> Additionally, as Marc points out, there are lots of broken XS modules
> out there -- including one of mine. (KinoSearch 0.15 -- Unicode
> support is fixed as of 0.20_01, which breaks backwards
> compatibility.) Few or none of them would be broken if Perl made it
> more difficult to move between character data and byte-oriented data
> -- errors would be flying right and left and the broken modules would
> get fixed right away.
>
> Of course I understand why that cannot be the case, but it's
> astonishing to me that you see this as a problem which can be solved
> via documentation.

I think documentation alone isn't enough. We already have things like
"strict", so if the current Perl model doesn't even let you detect when
you mix the wrong kinds of data, then we need a module or pragma that
catches these errors.

Of course encoding::warnings exists, but it does not seem able to
distinguish between "untagged" data and real ISO-8859-1 strings, because
Perl itself doesn't make this distinction.

> How about encouraging the use of encoding::warnings in perlunitut?
>
> How about adding it to core and having 'use 5.10;' turn it on?

If I understand correctly, that would not be enough, due to the "is this
binary or really ISO-8859-1 encoded data" problem mentioned above.

all the best,

tels

- -- 
 Signed on Sat Mar 31 01:42:47 2007 with key 0x93B84C15.
 View my photo gallery: http://bloodgate.com/photos
 PGP key on http://bloodgate.com/tels.asc or per email.

 "In 1988, Jack Thompson ran against Janet Reno for DA of Dade County:
 Thompson's unique campaign message was that Reno was unfit for the job
 because, as a closeted lesbian with a drinking problem, she was great
 candidate for blackmail by the criminal element. Jack never explained
 why this remained a threat even after he exposed her 'secret'. Reno
 cruised at the polls."

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iQEVAwUBRg29jncLPEOTuEwVAQJALAf/SsSjz5VB4l3Zcggd18SNmdTq8DpBLUtP
pxiPCs0fYrEtDny/HvDCbQss/nEaGmFwPaVpAA+kFp8jss3h3xzklW6MwAm7Aisy
+EiZO0JEcADXRWr9CChJpWfMr0qllmzsUUKHa6wc9iXagD6kPoiL49Ay5bkqPBDT
OKOfcJIRDqk12VKATpdQlBIHR3cEpnUMdh8QKhmAArkXAsV5cZGBC9EGm8l+dgeK
Uc2k7pxvLXdjCZu6YbJfPwwdiLlugL23Bci7sZrCO/JyboBOK3ch5dWYohZ8QoMw
SahL/axgJ1DeFTP2ryL6wvnM1djF+HSbzoaLD1E+d7XJqB700Qxdfg==
=eI9w
-----END PGP SIGNATURE-----