develooper Front page | perl.perl5.porters | Postings from February 2015

Re: RFC: /w pattern modifier

Tom Christiansen
February 8, 2015 15:22
Karl Williamson wrote
   on Sat, 07 Feb 2015 22:51:58 MST:

>As discussed many months ago, I am implementing \b{...} to allow more 
>boundary types than plain \b.

>The three types that will be in 5.22 are

> * \b{gcb}  grapheme cluster break.  \X is defined as .+?\b{gcb}

> * \b{sb}   sentence break.  Is true if Unicode thinks this is
>            a boundary between two sentences.  It does a decent
>            job of this, but it thinks that "Mr. Jones" is 2
>            sentences.

> * \b{wb}   word break.  Is true if Unicode thinks this is a boundary
>            between two words.
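
For readers who want to try the boundary types listed above, here is a minimal sketch, assuming a perl of 5.22 or later where \b{gcb} and \b{wb} are available. The sample strings are made up for illustration and are not from Karl's message.

```perl
use v5.22;       # \b{...} needs Perl 5.22 or later; also enables strict and say
use warnings;

# Grapheme clusters: "e" followed by U+0301 COMBINING ACUTE ACCENT forms
# one cluster, so \X and .+?\b{gcb} should both grab two code points.
my $str = "e\x{301}x";
my ($via_X)   = $str =~ /(\X)/;
my ($via_gcb) = $str =~ /(.+?\b{gcb})/;
say length($via_X), " ", length($via_gcb);    # 2 2

# Word breaks: chop the text into chunks that end at each boundary.
my $text  = "I'm not sure";
my @plain = $text =~ /(.+?\b)/g;        # plain \b splits around the apostrophe
my @wb    = $text =~ /(.+?\b{wb})/g;    # \b{wb} keeps "I'm" in one piece
say join "|", @plain;                   # I|'|m| |not| |sure
say join "|", @wb;                      # I'm| |not| |sure
```

The only lengths printed are code-point counts, which avoids "Wide character" warnings when writing to a plain STDOUT.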

[ . . . ]

> It has now occurred to me that a lot of existing \b uses really would
> work better if they were \b{wb}.  And that can be accomplished without
> having to change every occurrence, by instead having a pattern
> modifier flag, which could be in a 'use re "/w"' which says treat
> plain \b as \b{wb} in its scope.
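
To illustrate the difference such a modifier would make, here is a sketch that writes \b{wb} out by hand next to the same pattern using plain \b. It does not use /w itself, since that is only a proposal in this thread; it assumes a perl of 5.22 or later, and the word and sample strings are invented for the example.

```perl
use v5.22;       # \b{wb} needs Perl 5.22 or later
use warnings;

# The same search written two ways: with plain \b, and with \b{wb} spelled
# out, which is what plain \b would mean under the proposed modifier.
my $plain = qr/\bdon\b/;
my $wb    = qr/\b{wb}don\b{wb}/;

for my $s ("the don spoke", "don't") {
    my $p = $s =~ $plain ? "match" : "no match";
    my $w = $s =~ $wb    ? "match" : "no match";
    say "$s: plain \\b $p, \\b{wb} $w";
}
```

Plain \b sees a boundary between "n" and the apostrophe, so it finds "don" inside "don't"; \b{wb} treats "don't" as a single word and does not.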

> I don't see any real use for pretending that \b is any of the other
> break types, so I think this is the only modifier affecting \b that
> would ever make sense.

> I'm not sure how I feel about this, but I thought I should throw it
> out there to garner feedback.

That all sounds quite good to me, Karl.

