On Tue, Sep 27, 2011 at 05:09:33PM -0600, Karl Williamson wrote:

> My understanding is that the original reason for not doing the input
> checks was performance. Security is a far more important issue now, and
> Nicholas has demonstrated code that does the parsing with a minimal
> performance hit.

I had hoped to work on it over last Christmas, but everyone got ill and my
laptop power supply failed. So it didn't happen.

Whilst I have a feel for how to do it for UTF-8, I have no idea how to do it
for both UTF-8 and UTF-EBCDIC, or at least how to "not break EBCDIC platforms"
or avoid "making something hard to port to EBCDIC" as a side effect.

I also wasn't sure how to benchmark it properly, to be confident about the
magnitude of the performance change. I had thought that my test code should
be *more* efficient than the current code in utf8.c [it did less work], but
all the numbers I could collect showed it to be slightly slower. Hence I'm
not trusting my intuition about what's happening.

It's also blocked on a lack of feedback on bug #79960.

Nicholas Clark