On 04/30/2011 01:12 PM, Tom Christiansen wrote:
> Isn't 0xDF and "SS" *the* big problem? I don't think the others are
> troublesome, are they? What about not generating multichar folds in
> charclasses that contain nothing over 255? Or would that be resurrecting
> the Unicode Bug?
>
> --tom
>

I think you're right that all or nearly all existing code that's going to get broken will be over ß and ss.

The Unicode Bug is about utf8 vs non-utf8 encoding having different semantics, so no, this wouldn't be resurrecting it. But it is kind of like the Unicode Bug, in that adding a new character to the class would suddenly change the behavior of the class for non-obvious and not really related reasons.

I would prefer the more uniform approach I've described before, or that we just always exclude this one code point for 5.14. But I think your approach is much better than releasing 5.14 as-is.

(And BTW, in 5.16 I think it would be something like "use re folding X", where X is one of "simple", "full", "nfd", "nfkd", etc.)
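
For readers less familiar with why ß and "ss" are the troublesome pair, here is a minimal sketch of a multi-character fold. It is written in Python rather than Perl and says nothing about how the regex engine implements folding. The helper names full_fold and approx_simple_fold are hypothetical, and the "simple" variant is only an approximation that assumes a character keeps its identity whenever its full fold is longer than one character.

```python
# Illustration only: Python's str.casefold() applies Unicode full case
# folding, under which U+00DF (sharp s) folds to the two characters "ss".

def full_fold(s: str) -> str:
    """Full case folding: one character may expand to several."""
    return s.casefold()

def approx_simple_fold(s: str) -> str:
    """Rough stand-in for simple (one-to-one) folding.

    Assumption: a character whose full fold is longer than one
    character is left unchanged. This is not the exact Unicode
    simple-folding table, just enough to show the difference.
    """
    out = []
    for ch in s:
        folded = ch.casefold()
        out.append(folded if len(folded) == 1 else ch)
    return "".join(out)

sharp_s = "\u00DF"

# Under full folding, sharp s and "SS" compare equal.
print(full_fold(sharp_s) == full_fold("SS"))

# Under the one-to-one approximation they do not.
print(approx_simple_fold(sharp_s) == approx_simple_fold("SS"))
```

The first comparison is the multi-character match that existing patterns were never written to expect; the second shows the behavior a "simple" folding mode would give for this code point.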