
definitions for \w, [:...:] classes

Jeff Pinyan
January 22, 2001 08:28
Is there a function, or otherwise reliable method, for determining what \w
matches, given a locale?

I mean, I can assume the following under MY current locale:

  [:...:]	translation

  alpha		a-zA-Z
  alnum		a-zA-Z0-9
  ascii		\000-\177
  cntrl		\000-\037
  digit		0-9
  graph		alpha + digit + punct
  lower		a-z
  print		alpha + digit + punct + space
  punct		`~!@#$%^&*()-=_+[]\{}|;':",./<>?
  space		\040\n\r\t\f
  upper		A-Z
  word		a-zA-Z0-9_
  xdigit	a-fA-F0-9

But how correct and how universal are my assumptions?  And which
of these are modified under different locales?  Is [:word:] (which is a
Perl extension) an alias for \w, or is it specifically [:alpha:][:digit:]_?
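
One empirical way to check the table above is to probe every byte value
against each class under the active locale.  The following is only a rough
sketch, not an official interface: the helper names matched_bytes and
as_ranges are made up for illustration, it looks at single bytes 0..255
only, and it assumes the locale is taken from the environment.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use locale;

# Assumption: adopt whatever locale the environment specifies.
setlocale(LC_CTYPE, "");

# Hypothetical helper: byte values (0..255) matched by a one-character regex.
sub matched_bytes {
    my ($re) = @_;
    return grep { chr($_) =~ $re } 0 .. 255;
}

# Hypothetical helper: collapse sorted byte values into octal ranges.
sub as_ranges {
    my @bytes = @_;
    my @out;
    while (@bytes) {
        my $lo = my $hi = shift @bytes;
        $hi = shift @bytes while @bytes && $bytes[0] == $hi + 1;
        push @out, $lo == $hi
            ? sprintf('\\%03o', $lo)
            : sprintf('\\%03o-\\%03o', $lo, $hi);
    }
    return join ' ', @out;
}

# Print what each bracketed class matches under the current locale.
for my $class (qw(alpha alnum ascii cntrl digit graph lower
                  print punct space upper word xdigit)) {
    my $re = qr/\A[[:${class}:]]\z/;
    printf "%-7s %s\n", $class, as_ranges(matched_bytes($re));
}

# Compare \w with [:word:] over the same byte range.
my $w    = join ',', matched_bytes(qr/\A\w\z/);
my $word = join ',', matched_bytes(qr/\A[[:word:]]\z/);
print "\\w and [:word:] ", ($w eq $word ? "agree" : "differ"), "\n";
```

Running this under different LC_CTYPE settings shows which rows of the
table move; it says nothing about characters outside the single-byte range.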

I am asking for this because I'm trying to incorporate a "wasted
modifier" feature in my YAPE::Regex regular expression parser.  The
feature will allow a user to determine if they used the /s, /m, or /i
modifier in a regex wastefully.

The /i modifier is the kicker -- I need to catch things like /[a-zA-Z]/i
(which is a wasted use of /i), as well as /[[:lower:][:word:]]/i (which is
also a waste).
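
For a single bracketed class, a brute-force version of this check might
look like the sketch below.  It is not how YAPE::Regex does it; the
function name i_is_wasted is hypothetical, and the test only covers byte
values 0..255 under Perl's default (non-locale) rules.

```perl
use strict;
use warnings;

# Hypothetical helper: true if adding /i to the given bracketed class
# changes nothing for any single byte 0..255.
sub i_is_wasted {
    my ($class) = @_;    # class source text, e.g. '[a-zA-Z]'
    for my $byte (0 .. 255) {
        my $ch     = chr $byte;
        my $plain  = $ch =~ /\A$class\z/  ? 1 : 0;   # match without /i
        my $folded = $ch =~ /\A$class\z/i ? 1 : 0;   # match with /i
        return 0 if $plain != $folded;               # /i made a difference
    }
    return 1;
}

for my $class ('[a-zA-Z]', '[[:lower:][:word:]]', '[a-z]') {
    printf "%-22s /i is %s\n", $class,
        i_is_wasted($class) ? "wasted" : "needed";
}
```

A parser would want to reason about the class symbolically rather than
probe it, but the probe is a handy oracle for testing that reasoning.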

The true test will be if I can catch /(abc|abC|aBc|aBC|Abc|AbC|ABc|ABC)/i.
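
The alternation case reduces to asking whether the list of alternatives
already contains every upper/lower spelling of each member.  A minimal
sketch, assuming the alternatives have already been extracted as plain
literal strings (no metacharacters) and that lc/uc are an adequate notion
of case; the function name alternation_is_case_closed is invented here:

```perl
use strict;
use warnings;

# Hypothetical helper: true if every case variant of every alternative
# is itself one of the alternatives, so /i could not add a new match.
sub alternation_is_case_closed {
    my @alts = @_;
    my %seen = map { $_ => 1 } @alts;
    for my $alt (@alts) {
        # Build every case variant of $alt one character at a time.
        my @variants = ('');
        for my $ch (split //, $alt) {
            my %forms = map { $_ => 1 } (lc $ch, uc $ch);
            my @next;
            for my $v (@variants) {
                push @next, $v . $_ for sort keys %forms;
            }
            @variants = @next;
        }
        for my $v (@variants) {
            return 0 unless $seen{$v};   # a spelling only /i would match
        }
    }
    return 1;
}

my @alts = qw(abc abC aBc aBC Abc AbC ABc ABC);
print "/i is ", (alternation_is_case_closed(@alts) ? "wasted" : "needed"), "\n";
```

The number of variants doubles with each cased letter, so this is only
practical for short alternatives.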


Jeff "japhy" Pinyan
CPAN - #1 Perl Resource  (my id:  PINYAN)
PerlMonks - An Online Perl Community
The Perl Archive - Articles, Forums, etc.
