develooper Front page | perl.perl5.porters | Postings from November 2010

RFC: Unicode::UCD::propval() return the property value of a codepoint

From:
karl williamson
Date:
November 17, 2010 19:01
Subject:
RFC: Unicode::UCD::propval() return the property value of a codepoint
Message ID:
4CE496F2.8020601@khwilliamson.com
The following is the beginning of a pod entry for this proposed function.
Some things are still sketchy.

=head2 propval

C<propval> returns the value of the given Unicode property for the given
code point.  I'm not particularly fond of this name, and suggestions for
others are welcome.

For example,

  print Unicode::UCD::propval('Gc', ord 'A'), "\n";

would print 'Lu'.  It will recognize any property name following Unicode
loose matching rules, including all the synonyms that Unicode specifies for
them.  Thus

  print Unicode::UCD::propval('general category', ord 'A'), "\n";

is the same as the earlier example.
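
As a rough illustration of what loose matching means here, the sketch below folds a property name by lowercasing it and dropping spaces, underscores and hyphens. The helper name C<loose_name> is made up for this example and is not part of Unicode::UCD, and the real Unicode rules have further special cases that this simplified version ignores.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Unicode::UCD: a simplified take on
# Unicode loose matching that ignores case, whitespace, underscores
# and hyphens.
sub loose_name {
    my ($name) = @_;
    my $folded = lc $name;        # case is not significant
    $folded =~ s/[\s_\-]+//g;     # neither are spaces, '_' or '-'
    return $folded;
}

# Both spellings fold to the same lookup key.
print loose_name('General_Category'), "\n";
print loose_name('general category'), "\n";
```

A lookup table keyed on the folded form would then let 'Gc', 'gc' and 'G C' all resolve to the same property.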

Note that there are multiple possible synonyms for many of the returned
property values.  For example, 'Uppercase Letter' is a synonym of 'Lu'.  The
value that gets returned is the one that Unicode uses in its files.
Suggestions for an API to specify wanting others are welcome.  The only one I
can think of is that in list context it returns a list of all the distinct
ones.  Mostly there are just two values, a long one and a short one.  But
there are a few properties with more than two.
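
To make the list-context idea concrete, here is a minimal sketch limited to the General_Category property. It leans on the existing C<charinfo> function for the short value. The function name C<gc_value> and the small alias table are assumptions made for this example only, not a proposed implementation.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Illustrative alias table; only two General_Category values are listed.
my %gc_aliases = (
    Lu => [ 'Lu', 'Uppercase_Letter' ],
    Ll => [ 'Ll', 'Lowercase_Letter' ],
);

# Hypothetical stand-in for the proposed function, General_Category only.
sub gc_value {
    my ($code_point) = @_;
    my $info = charinfo($code_point) or return;
    my $short = $info->{category};

    # Scalar context: just the value as it appears in the Unicode files.
    return $short unless wantarray;

    # List context: every synonym known to the table, short form first.
    return @{ $gc_aliases{$short} || [ $short ] };
}

my $one = gc_value(ord 'A');    # scalar context
my @all = gc_value(ord 'A');    # list context
print "$one\n";
print "@all\n";
```

One drawback of keying the behavior on context is that a call placed directly in an argument list, such as C<print>, is evaluated in list context, so callers wanting a single value there would need C<scalar>.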

All Unicode properties have values for all code points.  Thus the only times
this function returns C<undef> are when it is called with an unknown property
(in which case a warning is also raised), or with a non-Unicode code point,
i.e., one greater than 0x10FFFF.
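
The two failure cases could be checked along the following lines. This is only a sketch: C<check_args> and the tiny table of known names are hypothetical, and the name folding is the same simplified approximation of loose matching as before.

```perl
use strict;
use warnings;

# Hypothetical table of known property names in folded form.
my %known_property = map { $_ => 1 } qw(gc generalcategory);

# Returns 1 when a lookup could proceed, and undef otherwise.
sub check_args {
    my ($property, $code_point) = @_;

    # Simplified loose matching: ignore case, spaces, '_' and '-'.
    (my $folded = lc $property) =~ s/[\s_\-]+//g;

    unless ($known_property{$folded}) {
        warn "Unknown Unicode property '$property'\n";
        return;
    }

    # Code points above 0x10FFFF are outside Unicode: undef, no warning.
    return if $code_point > 0x10FFFF;

    return 1;
}

print defined(check_args('Gc', ord 'A'))   ? "ok" : "undef", "\n";
print defined(check_args('Gc', 0x110000)) ? "ok" : "undef", "\n";
```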


