demerphq wrote 2007-04-24 11:37 (+0200):
> One would assume that unicode semantics would be obeyed when either
> the string or pattern was unicode, and that latin1 semantics (for lack
> of a better term) would be followed only when neither were unicode.

If I didn't know Perl, I would assume that it would always use Unicode
semantics, or never, because I read somewhere that Perl only has one
string type.

> The problem is that the optimiser thinks that /\xDF/i under unicode is
> really 'ss' and therefore that the minimum length string that can
> match is 2.

Ouch.

> At this point the only solution I can think of is to disable minlen
> checks when a character is encountered that folds to a multi-character
> string.

I think correctness is more important than performance, especially when
it is needed for real-world languages like German.
-- 
kind regards,

  juerd waalboer:  perl hacker  <juerd@juerd.nl>  <http://juerd.nl/sig>
  convolution:     ict solutions and consultancy <sales@convolution.nl>
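
The minimum-length problem quoted above can be modelled outside Perl. The sketch below is Python, not Perl's regex engine, and every function name in it is hypothetical. It assumes only that full Unicode case folding maps U+00DF (sharp s) to 'ss', which is what Python's str.casefold() does. It shows why a minimum length taken from the folded pattern wrongly rejects a one-character string, and what the proposed fallback of disabling the check might look like.

```python
# Illustrative model only: these helpers are hypothetical and are not
# part of Perl or of any regex library.

def naive_minlen(literal: str) -> int:
    """Minimum match length taken from the fully case-folded literal.

    This mirrors the behaviour described in the thread: U+00DF folds to
    'ss', so the computed minimum becomes 2.
    """
    return len(literal.casefold())


def cautious_minlen(literal: str) -> int:
    """Give up on the length check (return 0) as soon as any character
    folds to more than one character; otherwise use the literal length."""
    if any(len(ch.casefold()) > 1 for ch in literal):
        return 0
    return len(literal)


def caseless_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison using full case folding."""
    return a.casefold() == b.casefold()


def prefilter_allows(target: str, minlen: int) -> bool:
    """The optimiser-style shortcut: skip targets shorter than minlen."""
    return len(target) >= minlen


pattern = "\xdf"   # the literal from /\xDF/i
target = "\xdf"    # a one-character string that should match itself

# The naive bound rejects the target before any matching is attempted,
# although the two strings are equal under case folding.
naive_ok = prefilter_allows(target, naive_minlen(pattern))
cautious_ok = prefilter_allows(target, cautious_minlen(pattern))
matches = caseless_equal(pattern, target)
```

Here naive_ok is false while matches is true, which is the incorrect rejection; cautious_ok is true at the cost of losing the shortcut for such patterns. The sketch does not cover the reverse case, where the pattern is 'ss' and the target is a single sharp s.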