> I just happened to notice that the perlre man page describes the
> POSIX "[:punct:]" character class as being equivalent to the unicode
> "\p{IsPunct}" character class.
>
> I haven't tried to track down the respective standards documents for
> POSIX and Unicode to see whether these classes are _supposed_ to be
> equivalent over the printable ASCII character set, but when I test them

AFAIK there are currently no standards defining those equivalences.
There has been some discussion about this on the Unicode Consortium
mailing lists, but there are in fact doubts about the wisdom of stating
anything about such equivalences (because the C standards where the
:foo: classes originate frankly have no clue about the more complex
property structure of Unicode).

The closest upcoming standard is the proposed update to TR18:
http://www.unicode.org/reports/tr18/tr18-8.html, see Annex C.

If you say :punct: on non-Unicode data, you are doing an
_operating_ _system_ _dependent_ AND _locale_ _dependent_ operation.
On Unicode data, :punct: and \p{Punct} are (supposed to be) equivalent.

> in Perl 5.8.1, they are _not_ equivalent, as the following snippet will
> demonstrate:

-- 
Jarkko Hietaniemi <jhi@iki.fi> http://www.iki.fi/jhi/
"There is this special biologist word we use for 'stable'.
It is 'dead'." -- Jack Cohen
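
For anyone wanting to check the mismatch discussed above on their own Perl, here is a minimal sketch, not the snippet from the original report. It walks the printable ASCII range and collects the characters that [[:punct:]] matches but \p{Punct} does not. The helper name posix_only_punct is made up for illustration, and the script assumes no "use locale" is in effect, so the locale-dependent behaviour mentioned above is not exercised.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the printable ASCII characters that [[:punct:]] matches
# but \p{Punct} does not. (Hypothetical helper, for illustration only.)
sub posix_only_punct {
    my @only;
    for my $cp (0x21 .. 0x7E) {          # printable ASCII, space excluded
        my $ch = chr $cp;
        push @only, $ch
            if $ch =~ /[[:punct:]]/ && $ch !~ /\p{Punct}/;
    }
    return @only;
}

my @diff = posix_only_punct();
print "matched by [[:punct:]] only: @diff\n";
```

On a recent Perl I would expect this to list the nine ASCII characters $ + < = > ^ ` | ~, which Unicode puts in the Symbol general categories rather than Punctuation. Older releases such as 5.8.1 may give a different list.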