We decided a couple of releases ago that /x would eventually treat as ignorable white-space all the characters that Unicode specifies for that purpose, namely those matching the property \p{Pattern_White_Space}. There are 11 code points in this property, and that is guaranteed never to change. (As an aside, if Unicode really wanted to change this, they would introduce a new property, something like \p{XPatWS}, and encourage people to migrate to it.)

Fortunately, the set of code points that Perl has accepted as white-space under /x is a proper subset of what Unicode suggests. The 5 missing ones are

    U+0085 NEXT LINE
    U+200E LEFT-TO-RIGHT MARK
    U+200F RIGHT-TO-LEFT MARK
    U+2028 LINE SEPARATOR
    U+2029 PARAGRAPH SEPARATOR

Two of these are for rudimentary processing of languages that are written right-to-left, but the other three are all intended to start (at least) a new line.

Releases 5.18 and 5.20 raise a default-on deprecation warning when any of these 5 characters is used as a literal in a /x pattern. That means that in 5.22 we can change to skipping them under /x.

In implementing this, I realized that the right thing to do seems to be to end a comment not just with a \n, but with any of the three that indicate a new line. But I want to give a chance for dissenting opinions.

One might argue that any of the vertical white-space controls should end a comment: FF, VT, and especially CR. All of these match \R (linebreak), so it would make sense. But it has worked the other way for a long time without apparent problem, so I think we should just leave these as-is.

There is a minor glitch: the still-experimental (?[ ]) regex sets code was added allowing all of the pattern white-space characters, but for ending a comment it uses anything that matches \R, instead of what I'm proposing here. Since this is experimental, we can change it in any way that is convenient.
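
To make the proposal concrete, here is a minimal sketch in Python of the rule as I read it. It is a model only, not Perl's regex compiler: the names (PATTERN_WHITE_SPACE, NEWLY_SKIPPED, COMMENT_END, strip_extended) are made up for illustration, a backslash is assumed to protect the following character, and bracketed character classes are deliberately not handled.

```python
# Hypothetical model of the proposed /x rules; NOT Perl's implementation.

# The 11 code points of \p{Pattern_White_Space}.
PATTERN_WHITE_SPACE = frozenset(
    "\t\n\x0b\x0c\r \x85\u200e\u200f\u2028\u2029"
)

# The 5 code points that /x did not skip before the change.
NEWLY_SKIPPED = frozenset("\x85\u200e\u200f\u2028\u2029")

# What /x already skipped: the remaining 6, obtained by subtraction.
LEGACY_X_SPACE = PATTERN_WHITE_SPACE - NEWLY_SKIPPED

# Proposed comment terminators: \n plus the three new-line characters.
# FF, VT and CR are intentionally absent, matching the "leave as-is" choice.
COMMENT_END = frozenset("\n\x85\u2028\u2029")


def strip_extended(pattern: str) -> str:
    """Drop unescaped pattern white space and #-comments from a pattern.

    Simplifying assumptions: a backslash protects the next character,
    and bracketed character classes get no special treatment.
    """
    out = []
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # Keep the escape and the character it protects.
            out.append(pattern[i:i + 2])
            i += 2
        elif ch == "#":
            # Skip up to (not past) the terminator; the terminator is itself
            # pattern white space and is dropped on the next iteration.
            while i < n and pattern[i] not in COMMENT_END:
                i += 1
        elif ch in PATTERN_WHITE_SPACE:
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Under this model a comment followed by U+2028 ends there, so strip_extended("a # c\u2028b") keeps both "a" and "b", whereas a CR does not end the comment and strip_extended("a # c\rb") keeps only "a".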