On approximately 5/21/2008 8:56 AM, came the following characters from the
keyboard of demerphq:
> 2008/5/21 Glenn Linderman <perl@nevcal.com>:
>> On approximately 5/21/2008 1:29 AM, came the following characters from the
>> keyboard of Rafael Garcia-Suarez:
>>
>>> Some way to mark PVs as "binary" and not upgradeable to SvUTF8 would be
>>> handy, though.
>>
>> What's the goal?
>>
>> If, during the lifetime of a binary string, data gets attached to it that
>> makes it get upgraded, and later that data is detached, and the storage
>> format is truly transparent, then when the string is used in a context that
>> needs bytes, it should be handled properly (if not, let's fix that bug),
>> either by downgrading, or by accessing the data and validating that the
>> values are each < 256 (which downgrading does as a side effect).
>
> So how would that work exactly? Seriously. Give a general framework
> about how it would work. Consider that if it makes things massively
> slower that its probably not going to fly.

So where are the places where perl string operations need bytes?

O (of I/O) springs to mind. Module Encode::Decode springs to mind.

O already warns if a data item contains values > 255. Not sure how that
is implemented, but since it already does it, it seems there is little
added cost.

Decode is already looking at data byte-by-byte. Changing that to
character by character, and bounds checking the values doesn't seem like
a huge added cost.

Others?

>> If the goal is to prevent the cost of upgrading and downgrading, well, just
>> fix the bug that attached the upgraded data... and the cost of doing so also
>> vanishes.
>
> I dont think its so easy. The code responsible may be very hard to identify.

Especially when the storage format is truly transparent, the responsible
code may be very hard to identify.
I don't think, though, that it would be necessary to remove
utf8::is_utf8, but switching it to diagnostic only, would allow code to
be instrumented to discover where data is upgraded... via binary search
instrumentation, breadth first tree searching, etc. This would allow it
to be tracked down when necessary.

The thing is, if your string _is_ byte-oriented, any operation that
upgrades it truly is a bug, and it should be tracked down. So if it is
cheap to add a flag to prevent upgrades, and produce errors or warnings
at the point of upgrade attempts, maybe that is OK, but correct code
wouldn't need the checks, as far as I can see. So that's why I wondered
what the goal was...

> Yves

-- 
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
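
A minimal sketch of the two mechanisms discussed in this message, using only the core utf8:: functions as they exist today. It is an illustration, not the proposed change: storage_format is a hypothetical helper name, and the explicit utf8::upgrade call stands in for whatever code accidentally upgrades a byte string in a real program. It shows utf8::is_utf8 used purely as a diagnostic at checkpoints, and utf8::downgrade with its FAIL_OK argument acting as the "every value is < 256" check.

```perl
use strict;
use warnings;

# Hypothetical diagnostic helper: reports the internal storage format
# at a checkpoint without changing program behaviour.
sub storage_format {
    my ($label, $str) = @_;
    return sprintf "%s: %s", $label,
        utf8::is_utf8($str) ? "upgraded" : "bytes";
}

# A byte-oriented string: every value is < 256.
my $bytes = "caf\xE9";
my $before = storage_format("original", $bytes);

# Stand-in for the bug: something upgrades the internal storage.
my $copy = $bytes;
utf8::upgrade($copy);
my $after_upgrade = storage_format("after upgrade", $copy);

# The storage format is transparent: the strings still compare equal.
my $same = $bytes eq $copy;

# Downgrading validates that every value is < 256 as a side effect.
# With a true second argument (FAIL_OK) it returns false instead of dying.
my $ok = utf8::downgrade($copy, 1);
my $after_downgrade = storage_format("after downgrade", $copy);

# A string holding a value > 255 cannot be downgraded.
my $wide = "\x{263A}";
my $wide_ok = utf8::downgrade($wide, 1);

print "$_\n" for $before, $after_upgrade, $after_downgrade;
print "equal despite upgrade: ", ($same ? "yes" : "no"), "\n";
print "byte string downgraded: ", ($ok ? "yes" : "no"), "\n";
print "wide string downgraded: ", ($wide_ok ? "yes" : "no"), "\n";
```

Placing calls like storage_format at successive checkpoints is one way to narrow down where an upgrade happens, in the spirit of the binary search instrumentation mentioned above.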