On Thursday 14 June 2001 12:01 pm, Dan Sugalski wrote:
> Fancy character classes are probably enough to handle the various casing
> issues and their analogs. They're probably not enough to handle things
> like the Arabic tatweel, or proper word breaks in most Asian languages.
> Heck, unless I'm missing something, they're insufficient for something as
> simple as \d.
>
> I'm not advocating forcing dictionaries into the regex engine, nor even
> shipping them with the core.

That's not to say that some Locale::* couldn't include one, or reference a
third-party one.

> As I see it, locales specify:
>
> * Collating order
> * Comparison/equality specification
> * Unicode codepoint interpretation

What do you mean by that?

> * Regex character classes
> * Regex character identification
> * Regex zero-width assertion rules
> * 'casing' rules
>
> It'd be nice to specify them all separately and inherit the ones you don't
> need to change from some parent locale.

Or have these individual bits and pieces be addressable through the regexen,
and have locales *defined* via that.

    module Locale::Hawaiian;
    use re 'class (\w => [aeiouâêîôûhklmnpw`])';
    ...

On a side note (and this *will* sound stupid, but there is a reason I'm
asking): why is there no logical opposite to '.'; that is, a metacharacter
that never matches any character? (Besides, of course, that it's utterly
useless from a classic regex perspective.)

-- 
Bryan C. Warnock
bwarnock@capita.com