I have basically fixed the failing new tests. I have not uploaded the fix because it breaks other tests, and so far I haven't been able to review them all for correctness and fix either the tests or the underlying breakage.

Also, I am starting to doubt the tenability of this path. Making \d mean [0-9] seems clear to me (please speak up if anyone disagrees). Making \w have the strict [A-Za-z0-9_] behavior by default is looking less sensible than it seemed at first. (Mea culpa.) Unfortunately, having it always mean its unicode interpretation seems prima facie untenable as well, and leaving it as is leads to IMO irresolvable logical contradictions in the regex engine. \s has a similar problem.

So this means we have to do this the hard way. That is, we are going to have to introduce new regex modifiers to control the semantics, and continue to at least support the inconsistent semantics, if not also continue to default to the broken semantics :-(.

We currently have one bit to control whether regexes are compiled under locale. When that bit is off we get the current "broken" semantics. If we add another bit we get a four-way switch that can be controlled by modifiers, and we can set up a pragma to control the default semantics. Possibly make it so that use perl 5.12 changes the default.

  00 - legacy utf8/perl semantics (possibly inconsistent)
       use re 'legacy'; no locale; possible modifier: /B (for b0rked)
  01 - regex compiled under use locale
       use locale; use re 'locale'; possible modifier: /L
  10 - unicode semantics
       use re 'unicode'; possible modifier: /U
  11 - ascii/perl semantics
       use re 'ascii'; possible modifier: /A

This has a lot of knock-on consequences though. Charclass structs have to be made larger by 4 bytes in the worst case. It also needs some reorganization of the pmop->flags field and the re->extflags field, new opcodes, and more code in the regex engine.
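To make the two-bit switch concrete, here is a rough C sketch of how the four modes could be packed into a flags word. Every name in it (rx_semantics, RX_SEM_MASK, rx_set_semantics and so on) is hypothetical, and the bit position is an assumption; none of this is taken from the perl source, and the real layout would depend on how pmop->flags and re->extflags end up being reorganized.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout: these are NOT identifiers from the perl
 * source, only an illustration of the proposed four-way switch. */
enum rx_semantics {
    RX_SEM_LEGACY  = 0x0, /* 00: legacy utf8/perl semantics, proposed /B */
    RX_SEM_LOCALE  = 0x1, /* 01: compiled under use locale,  proposed /L */
    RX_SEM_UNICODE = 0x2, /* 10: unicode semantics,          proposed /U */
    RX_SEM_ASCII   = 0x3  /* 11: ascii/perl semantics,       proposed /A */
};

/* Assumed position of the two bits inside the flags word. */
#define RX_SEM_SHIFT 0u
#define RX_SEM_MASK  (0x3u << RX_SEM_SHIFT)

/* Return flags with the semantics bits replaced and all other bits kept. */
static uint32_t rx_set_semantics(uint32_t flags, enum rx_semantics mode)
{
    assert(((unsigned)mode & ~0x3u) == 0); /* mode must fit in two bits */
    return (flags & ~RX_SEM_MASK) | ((uint32_t)mode << RX_SEM_SHIFT);
}

/* Extract the semantics mode from a flags word. */
static enum rx_semantics rx_get_semantics(uint32_t flags)
{
    return (enum rx_semantics)((flags & RX_SEM_MASK) >> RX_SEM_SHIFT);
}

/* Map a mode to the modifier letter suggested for it. */
static char rx_modifier(enum rx_semantics mode)
{
    static const char letters[4] = { 'B', 'L', 'U', 'A' };
    return letters[(unsigned)mode & 0x3u];
}
```

With this layout the low bit on its own still reads as "compiled under locale" for modes 00 and 01, which matches the single bit we have today. Mode 11 also has that bit set, so any existing code that tests only the locale bit would need to compare the full two-bit value instead.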
The bad news is that doing all of the above is a fair amount of work and is probably not going to happen in time for the next release. The good news is that the backwards compatibility problems would be much smaller.

What I will do, however, is push my changes as a branch, so people can see what perl looks like with just this bug fixed, the restrictive "ascii-perl" semantics imposed on \w and \s, and a few of the tests changed in trivial ways to pass with the new semantics.

cheers,
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"