2008/5/21 Juerd Waalboer <juerd@convolution.nl>:
>> Does your parenthetical remark mean that it should be done by detecting
>> that the column is SQL BINARY or VARBINARY vs SQL CHAR or VARCHAR?
>
> Yes.

And how the DB stores the data: in which encoding. Have you never had
problems with DBI returning double-encoded data?

>> Unless utf8::is_utf8() is removed (which would break some amount of
>> compatibility), that is all the knowledge required.
>
> Yes please. Almost every single use of is_utf8 is either for debugging
> (and can thus be removed without affecting semantics) or JUST PLAIN
> WRONG. Removing (or renaming) it will be immensely helpful.

Marking it as [INTERNAL], like utf8::valid, is OK. Removing it is not.

I can also think of another legitimate use for is_utf8: testing
(specifically, testing that the UTF-8 flag is correctly set on scalars
returned by some XS modules). (But of course you could do that testing
with Devel::Peek -- only more painfully.)

> It's adding one bit, and checking one bit. I assume that this can't
> possibly hurt performance.

Adding one bit can hurt if you don't have one bit left in your bytes.

>> Certainly it is more convenient to be told the exact line where the
>> stricture would be violated
>
> Line and operations. I'm thinking of a warning like "Binary-incompatible
> string added to binary string as UTF-8 data in assignment at foo.pl line
> 5".

Sounds mostly good. A bit verbose for my taste.

--
'Do you know what they called a sausage-in-a-bun in Quirm?' said Mr Pin,
as the two walked away.
'No?' said Mr Tulip.
'They called it "le sausage-in-le-bun".'
-- Terry Pratchett, The Truth