On Fri, Mar 30, 2007 at 08:00:36PM +0200, Marc Lehmann wrote:
> On Fri, Mar 30, 2007 at 01:31:22PM +0100, Nicholas Clark <nick@ccl4.org> wrote:

> However, some of the obvious fixes would be to change ExtUtils/typemap so
> that stuff such as "const char *" no longer boils down to random bytes.
> Example:
>
>    SV *compress (const char *data);
>
> the right thing here is to use SvPVbyte, at least in the majority of
> cases. The reason is that existing users either have to call downgrade
> explicitly themselves or suffer from random problems.

This seems a sane idea. However, I'm not going to change it for 5.8.9.
5.10 is a different matter, but also not my call.

> Could you tell me why almost every other 5.6 bug was fixed in 5.8, but
> gratuitous breakage of large parts of CPAN is accepted with this change?
> What's the rationale behind keeping this 5.6 bug, while fixing the rest?

No, I can't. 5.8.0 and 5.8.1 were not my releases, *and* I wasn't aware that
'C' was a problem at that time.

I *think* that the reason may have been because "it is documented in
Programming Perl" that it behaves the 5.6.0 way. *but* I went looking, and the
closest I can find to an assertion about how it works is:

  * the pack/unpack letters "c" and "C" do /not/ change, since they're often
    used for byte-orientated formats. (Again, think "char" in the C language.)
    However, there is a new "U" specifier that will convert between UTF-8
    characters and integers:

        pack("U*", 1, 20, 300, 4000) eq v1.20.300.4000

  * The chr and ord functions work on characters

        chr(1).chr(20).chr(300).chr(4000) eq v1.20.300.4000

    In other words, chr and ord are like pack("U") and unpack("U"), not like
    pack("C") and unpack("C"). In fact, the latter two are how you now emulate
    byte-orientated chr and ord if you're too lazy to use bytes.

[3rd edition, page 408]

> > I don't like anything in Perl space that lets the abstraction leak, and "C" is
> > one of them.
>
> So why not fix it?
> Nobody made such a fuss when they fixed the remaining bugs
> from 5.6. For example, PApp, one of my older modules using unicode, is full

I'm not going to change anything this late in 5.8.x.
Whether 5.10 changes is not something I have the final say on.

> And as I said, there is no pack-type that gives me the old meaning of
> "C" that every structure-decoding program relies on. That's gratuitous
> undocumented breakage. (It really is undocumented because all of the perl
> documentation tells me that the internal encoding doesn't surface, and the
> small hint in the pack description for "C" seems to reinforce this as it
> tells me it works "even in the presence of Unicode"!).
>
> In any case, please could you answer to me why you accept obvious breakage
> of old code in this case? I really wanna know.
>
> The only argument in favour I have heard so far is that the camelbook
> documents it in some obscure way. But that cannot be a reason to keep a
> bug. If the camelbook describes buggy behaviour, it needs a fix. It is
> insane to force every existing perl program that uses that feature to
> be changed in a way that contradicts the rest of the documentation, is
> unintuitive and generally useless (again, show me a useful application for
> unpack "C" with 5.8 semantics).

I agree with the "obscure" now. Reading the wording of the Camel book
carefully, and given this behaviour

$ perl5.00503 -le 'print unpack "c", chr (256+78)'
78
$ perl5.00503 -le 'print unpack "C", chr (256+78)'
78

"unchanged" actually means to me that it would produce the same output.
The only thing that seems to define the current 5.6 behaviour is the
comparison of unpack("C") with ord under use bytes in the paragraph on
chr and ord.

Nicholas Clark