On 04/29/2011 06:13 AM, Nicholas Clark wrote:
> and that what happens is that to [^\xDF] is processed all in one, not as a
> sequence:
>
>    a range
>    an inverted range
>    in a case insensitive match
>
>
> so it's not implemented as a human *might* think, in terms of
>
> * process the ranges inside the [^...] construction to make a list of code
>   points (in my case that's one code point, U+00DF)
> * [^...] means invert the list (in my case, that's several million code points)
> * now match the inverted list against the input string
> * oh yes, do that insensitively
>

The crux is your "oh yes, do that insensitively". The word "that" means
the previous step has to be modified. The way it currently works for cases
like this is that it creates the union of the characters not to match and
their folds, plus a flag that says complement the result at execution time,
which means the list is essentially all the non-matches. A single 's' is not
in the list of non-matches, but 'ss' is.

George is right, which wins?
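
To make the "union of the non-matches and their folds, plus a complement flag" description concrete, here is a rough toy model in Python. It is only a sketch and not how the regex engine is actually written: the function names are made up for illustration, and Python's str.casefold() is assumed to be a reasonable stand-in for the fold tables.

```python
def build_inverted_class(listed):
    """Toy model of a case-insensitive [^...] class.

    Returns (members, complement): the listed characters plus their
    folds, and a flag saying the result is complemented at match time.
    Hypothetical helper, not the engine's real data structure.
    """
    members = set()
    for ch in listed:
        members.add(ch)
        # The fold may be longer than one character, e.g. U+00DF -> 'ss'.
        members.add(ch.casefold())
    return members, True


def matches(cls, candidate):
    """Return True if the candidate string is accepted by the class."""
    members, complement = cls
    in_list = candidate in members or candidate.casefold() in members
    # With the complement flag set, the list holds the non-matches.
    return in_list != complement


cls = build_inverted_class("\xdf")
print(matches(cls, "s"))     # True: a single 's' is not in the non-match list
print(matches(cls, "ss"))    # False: 'ss' is the fold of U+00DF
print(matches(cls, "\xdf"))  # False: the listed character itself
```

The point of the model is that the list never gets expanded to the several million complemented code points; only the small set of non-matches is stored and the flag does the inversion.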