On Thu, Oct 29, 2009 at 04:24:53PM +0000, Paul LeoNerd Evans wrote:
> On Thu, 29 Oct 2009 17:04:07 +0100
> Abigail <abigail@abigail.be> wrote:
>
> > Now, don't consider this an argument in favour of having \w match non-ASCII
> > characters - but, IMO, if \w can match non-ASCII characters, so should \d.
>
> This would seem to make the most sense, and be the most predictable.
> Either all of them match Unicode, or none of them do.
>
> If none of them do, then adding Unicode variations might be a nice idea.
>
> I would suggest
>
>                      word   digit   space
>   ASCII-only         \w     \d      \s
>   Includes Unicode   \Uw    \Ud     \Us
>
> Only \U is already used. And \u.
>
> Do we have a definitive list anywhere, on a tangential note, of the
> remaining unused \x letters?

25% is still available (13 out of 52 upper and lower case ASCII characters).
perlrebackslash.pod lists all \x letters in use. It's easy to deduce
the unused ones: \F, \i, \I, \j, \J, \m, \M, \o, \O, \q, \T, \y, \Y.

\c, \g, \k, \p, \P, \x are "partially available", that is, currently
they can only be followed by a limited set of characters, so there's
some room for expansion left. \N is partially available in 5.10.x,
but taken in blead.

Abigail
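
For anyone who wants to double-check the count above, here is a minimal sketch in Python. The two letter lists are copied from this message; the variable names are only illustrative, and the script does not read perlrebackslash.pod itself.

```python
import string

# Letters listed in this message as unused after a backslash.
unused = ["F", "i", "I", "j", "J", "m", "M", "o", "O", "q", "T", "y", "Y"]

# Letters described in this message as "partially available".
partially = ["c", "g", "k", "p", "P", "x"]

# All 52 upper and lower case ASCII letters.
all_letters = set(string.ascii_letters)

# Everything not in the unused list is taken, fully or partially.
taken = sorted(all_letters - set(unused))

# Fraction of letters that are still completely free.
fraction_free = len(unused) / len(all_letters)

print(len(unused), len(all_letters), fraction_free)
```

This prints `13 52 0.25`, which matches the 25% figure. The partially available letters all fall on the "taken" side of the split, as expected.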