perl.perl5.porters | Postings from March 2008

[PATCH] RE: [perl #49302] [[:print:]] v \p{Print}

From: Robin Barker
Date: March 31, 2008 13:42
Subject: [PATCH] RE: [perl #49302] [[:print:]] v \p{Print}
Message ID: 46A0F33545E63740BC7563DE59CA9C6D093ABD@exchsvr2.npl.ad.local

No progress in resolving this in code, so here is a documentation patch.

Robin

--- pod/perlre.pod.orig	2008-01-30 20:41:06.000000000 +0000
+++ pod/perlre.pod
@@ -375,8 +375,8 @@
     digit       IsDigit        \d
     graph       IsGraph
     lower       IsLower
-    print       IsPrint
-    punct       IsPunct
+    print       IsPrint		(but see 2. below)
+    punct       IsPunct		(but see 3. below)
     space       IsSpace
                 IsSpacePerl    \s
     upper       IsUpper
@@ -385,6 +385,41 @@
 
 For example C<[[:lower:]]> and C<\p{IsLower}> are equivalent.
 
+However, the equivalence between C<[[:xxxxx:]]> and C<\p{Xxxxx}> is not exact.
+
+=over 4
+
+=item 1.
+
+C<[[:xxxxx:]]> only matches characters in the range 0x00-0x7F.
+
+=item 2.
+
+C<\p{IsPrint}> matches characters 0x09-0x0d but C<[[:print:]]> does not.
+
+=item 3.
+
+C<[[:punct:]]> matches the following but C<\p{IsPunct}> does not,
+because they are classed as symbols in Unicode.
+
+=over 4
+
+=item C<$>
+
+Currency symbol
+
+=item C<+> C<< < >> C<=> C<< > >> C<|> C<~>
+
+Mathematical symbols
+
+=item C<^> C<`>
+
+Modifier symbols (accents)
+
+=back
+
+=back
+
 If the C<utf8> pragma is not used but the C<locale> pragma is, the
 classes correlate with the usual isalpha(3) interface (except for
 "word" and "blank").
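
The claim in item 3 of the patch can be cross-checked against the Unicode Character Database without running Perl. The following is only a sketch: it uses Python's unicodedata module rather than Perl's own tables, and it assumes that Python's string.punctuation (the ASCII punctuation of the C locale) is the same set that C<[[:punct:]]> covers in the ASCII range. The helper name ascii_punct_by_category is made up for this illustration. It says nothing about how any particular perl release implements either class.

```python
import string
import unicodedata


def ascii_punct_by_category():
    """Group the C-locale ASCII punctuation characters by Unicode general category.

    Assumption: string.punctuation stands in for the ASCII part of [[:punct:]].
    """
    groups = {}
    for ch in string.punctuation:
        groups.setdefault(unicodedata.category(ch), []).append(ch)
    return groups


groups = ascii_punct_by_category()

# General categories starting with "S" are symbols, not punctuation ("P*"),
# so a property defined purely as "General_Category = P*" would skip them.
symbols = {cat: "".join(chars) for cat, chars in groups.items() if cat.startswith("S")}

for cat in sorted(symbols):
    print(cat, symbols[cat])
```

Under those assumptions the script reports three symbol categories: Sc (currency) holding the dollar sign, Sm (math) holding + < = > | ~, and Sk (modifier) holding ^ and the backtick, which is the same grouping the patch lists.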
