I've been working on adding the regex modifiers for unicode/locale/traditional semantics, and am almost ready. However, the way I've implemented it breaks backward compatibility with anything that relies on the current stringification of regexes. I had to fix several .t files that did this (including one in cpan). I have no idea how prevalent this reliance is.

The reason is that the code always outputs the modifier, even for the existing semantics, so the stringification is permanently different from before. That got me thinking that it should be possible to omit the modifier from the stringification unless it differs from the default. Backward compatibility would then not be broken. It is still dangerous for modules to rely on knowing all the possible modifiers: someone could pass them a regex compiled with a new modifier that they don't know how to deal with, and that applies to any new modifier, not just these.

But there must be some reason that xism are always shown in the stringification, whether or not they are in effect, and I can't think what it might be. It would have been simpler for the original design to output them only when they differ from the default, so there must be a reason why each one is always output with its plus or minus. Does that reason apply to these new modifiers?

Perhaps an example will clarify things. Currently, if you say qr/foo/, the stringification is (?-xism:foo). If you instead say qr/foo/xm, the stringification is (?mx-is:foo).

My working plan is to have modifiers Cl for locale, Cu for unicode, and Cd for dual. 'C' stands for character set; I've also toyed with 'S' for semantics. Dual is so named because it behaves sometimes like the native character set and sometimes like unicode. The new stringification of the two examples above would be (?Cd-xism:foo) and (?Cdmx-is:foo).

Is there a reason that the Cd needs to be output? If not, why do the 'xism' always have to be output?
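
To make the two options concrete, here is a toy model of the stringification, written in Python purely for illustration. It is not how perl builds the string. The function name, the `always_show_charset` switch, and the flag ordering (inferred only from the two examples in this message) are all assumptions.

```python
# Toy model of the stringification schemes discussed in this message.
# Hypothetical names throughout; this is not perl's implementation.

STD_MODS = "msix"      # order inferred from (?mx-is:foo) and (?-xism:foo)
DEFAULT_CHARSET = "d"  # "dual", the proposed default


def stringify(pattern, on="", charset=DEFAULT_CHARSET, always_show_charset=True):
    """Return a modelled stringification of qr/pattern/on.

    always_show_charset=True  models the current implementation, which
                              always emits the C<x> modifier.
    always_show_charset=False models the alternative: emit C<x> only when
                              it differs from the default.
    """
    # Flags in effect are listed in "msix" order.
    on_part = "".join(f for f in STD_MODS if f in on)
    # Flags not in effect are listed after '-', in the reverse order.
    off_part = "".join(f for f in reversed(STD_MODS) if f not in on)

    show_charset = always_show_charset or charset != DEFAULT_CHARSET
    prefix = "C" + charset if show_charset else ""
    minus = "-" + off_part if off_part else ""
    return "(?" + prefix + on_part + minus + ":" + pattern + ")"


if __name__ == "__main__":
    # Current implementation: the stringification changes for every regex.
    print(stringify("foo"))
    print(stringify("foo", on="xm"))
    # Alternative: default semantics stringify exactly as they do today.
    print(stringify("foo", always_show_charset=False))
    print(stringify("foo", charset="u", always_show_charset=False))
```

Under the second scheme, only regexes compiled with a non-default character set would stringify differently from today, which is the backward-compatibility property asked about above.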