perl.perl5.porters | Postings from January 2001

definitions for \w, [:...:] classes

From: Jeff Pinyan
Date: January 22, 2001 08:28
Subject: definitions for \w, [:...:] classes
Message ID: Pine.GSO.4.21.0101221116030.16416-100000@crusoe.crusoe.net

Is there a function, or otherwise reliable method, for determining what \w
matches, given a locale?

I mean, I can assume the following under MY current locale:

  [:...:]	translation

  alpha		a-zA-Z
  alnum		a-zA-Z0-9
  ascii		\000-\177
  cntrl		\000-\037
  digit		0-9
  graph		alpha + digit + punct
  lower		a-z
  print		alpha + digit + punct + space
  punct		`~!@#$%^&*()-=_+[]\{}|;':",./<>?
  space		\040\n\r\t\f
  upper		A-Z
  word		a-zA-Z0-9_
  xdigit	a-fA-F0-9

But how correct and universal are my assumptions?  And which of these
are modified under different locales?  Is [:word:] (which is a Perl
extension) an alias for \w, or is it specifically [:alpha:][:digit:]_?
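
One rough way to test these assumptions is to ask perl itself: enable
"use locale", try every single-byte character against each class, and
print the members as octal ranges in the style of the table above.  This
is only a sketch.  The helpers class_members() and as_ranges() are
hypothetical names made up for the illustration, it looks at code points
0..255 only, and the output depends on the locale in effect and on the
perl version, so none is shown here.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use locale;    # regexes compiled in this file follow the current locale

# Adopt whatever locale the environment (LANG, LC_ALL, ...) names.
setlocale(LC_CTYPE, '');

# Hypothetical helper: code points in 0..255 whose character matches $re.
sub class_members {
    my ($re) = @_;
    return grep { chr($_) =~ $re } 0 .. 255;
}

# Hypothetical helper: render sorted code points as octal ranges,
# e.g. 48..57 becomes \060-\071.
sub as_ranges {
    my @codes = @_;
    my @out;
    while (@codes) {
        my $start = my $end = shift @codes;
        $end = shift @codes while @codes && $codes[0] == $end + 1;
        push @out, $start == $end
            ? sprintf('\\%03o', $start)
            : sprintf('\\%03o-\\%03o', $start, $end);
    }
    return join ' ', @out;
}

for my $name (qw(alpha alnum ascii cntrl digit graph lower
                 print punct space upper word xdigit)) {
    my $re = qr/[[:${name}:]]/;
    printf "%-7s %s\n", $name, as_ranges(class_members($re));
}

# Empirical check of the [:word:] versus \w question, for 0..255 only.
my @w    = class_members(qr/\w/);
my @word = class_members(qr/[[:word:]]/);
print "\\w and [:word:] agree on 0..255: ",
    ("@w" eq "@word" ? "yes" : "no"), "\n";
```

Running it under a few different LANG or LC_ALL settings shows which
rows of the table move with the locale and which stay fixed.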

I am asking for this because I'm trying to incorporate a "wasted
modifier" feature in my YAPE::Regex regular expression parser.  The
feature will allow a user to determine if they used the /s, /m, or /i
modifier in a regex wastefully.

The /i modifier is the kicker -- I need to catch things like /[a-zA-Z]/i
(which is a wasted use of /i), as well as /[[:lower:][:word:]]/i (which is
also a waste).
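
For a single character class, the "wasted /i" test can be sketched
empirically: compile the class with and without /i and see whether any
character is matched by one and not the other.  This is an illustration
under stated assumptions, not how YAPE::Regex does or should do it.  The
name i_is_wasted_on_class() is hypothetical, only code points 0..255 are
probed, and no locale is enabled.

```perl
use strict;
use warnings;

# Hypothetical helper: true if adding /i to the bracketed class (given
# as a string such as '[a-zA-Z]') changes nothing for code points 0..255.
sub i_is_wasted_on_class {
    my ($class) = @_;
    my $plain  = qr/\A${class}\z/;
    my $nocase = qr/\A${class}\z/i;
    for my $code (0 .. 255) {
        my $ch = chr $code;
        # A character matched by exactly one of the two means /i matters.
        return 0 if ($ch =~ $plain) xor ($ch =~ $nocase);
    }
    return 1;
}

for my $class ('[a-zA-Z]', '[[:lower:][:word:]]', '[a-z]') {
    printf "%-22s /i is %s\n", $class,
        i_is_wasted_on_class($class) ? 'wasted' : 'needed';
}
```

A parser working on the syntax tree would want to reason about the class
members directly instead of probing, but the probe is a handy cross-check
for that logic.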

The true test will be if I can catch /(abc|abC|aBc|aBC|Abc|AbC|ABc|ABC)/i.
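
For an alternation of plain literals like that one, a minimal sketch is
to check that the set of alternatives is closed under case changes: every
upper/lower variant of every alternative must already be in the list.
The names case_variants() and i_is_wasted_on_alternation() are
hypothetical, the alternatives are assumed to be literal strings with no
metacharacters, and the number of variants doubles with each letter, so
this only suits short alternatives.

```perl
use strict;
use warnings;

# Hypothetical helper: every upper/lower case spelling of a literal.
sub case_variants {
    my ($str) = @_;
    my @variants = ('');
    for my $ch (split //, $str) {
        my %seen;
        # Caseless characters (digits etc.) contribute a single form.
        my @forms = grep { !$seen{$_}++ } (lc $ch, uc $ch);
        my @next;
        for my $prefix (@variants) {
            push @next, $prefix . $_ for @forms;
        }
        @variants = @next;
    }
    return @variants;
}

# Hypothetical helper: true if /i adds nothing to an alternation of
# literals, i.e. every case variant is already listed.
sub i_is_wasted_on_alternation {
    my @alts = @_;
    my %have = map { $_ => 1 } @alts;
    for my $alt (@alts) {
        for my $variant (case_variants($alt)) {
            return 0 unless $have{$variant};
        }
    }
    return 1;
}

my @alts = qw(abc abC aBc aBC Abc AbC ABc ABC);
print "/i is ", (i_is_wasted_on_alternation(@alts) ? "wasted" : "needed"), "\n";
```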

(Yikes.)

-- 
Jeff "japhy" Pinyan     japhy@pobox.com    http://www.pobox.com/~japhy/
CPAN - #1 Perl Resource  (my id:  PINYAN)       http://search.cpan.org/
PerlMonks - An Online Perl Community          http://www.perlmonks.com/
The Perl Archive - Articles, Forums, etc.   http://www.perlarchive.com/

