develooper Front page | perl.perl5.porters | Postings from April 2011

charnames and CJK

From: Tom Christiansen
Date: April 26, 2011 15:49
Subject: charnames and CJK
Message ID: 6799.1303858127@chthon

Cool, we get CJK (quasi-)names now.

    % perl5.12.3 -Mcharnames=:full -E 'say charnames::viacode(0x547c) || "<missing>"'
    <missing>
    % blead -Mcharnames=:full -E 'say charnames::viacode(0x547c) || "<missing>"'
    CJK UNIFIED IDEOGRAPH-547C

    % perl -Mcharnames=:full -wE 'printf "U+%X\n", ord "\N{CJK UNIFIED IDEOGRAPH-547C}"'
    Unknown charname 'CJK UNIFIED IDEOGRAPH-547C' at /usr/local/lib/perl5/5.12.3/unicore/Name.pl line 1
    U+FFFD
    % blead -Mcharnames=:full -wE 'printf "U+%X\n", ord "\N{CJK UNIFIED IDEOGRAPH-547C}"'
    U+547C
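
The two checks above (code point to name, then name back to code point) can also be run from one small script. This is only a minimal sketch: `round_trip` is a hypothetical helper name, not something from charnames, and it uses the runtime `charnames::vianame` lookup rather than a compile-time `\N{...}`. It assumes a perl whose charnames behaves like the blead build shown above; on an older perl such as 5.12.3 the name lookup would presumably come back as `<missing>`.

```perl
use strict;
use warnings;
use charnames ':full';

# Hypothetical helper (not part of charnames): map a code point to its
# name with viacode(), then map that name back with vianame().
# Returns ('<missing>', undef) when charnames has no name for the code point.
sub round_trip {
    my ($cp) = @_;
    my $name = charnames::viacode($cp);
    return ('<missing>', undef) unless defined $name;
    my $back = charnames::vianame($name);
    return ($name, $back);
}

my ($name, $back) = round_trip(0x547C);
printf "%s -> %s\n", $name,
    defined $back ? sprintf('U+%04X', $back) : '<missing>';
```

If the two directions agree, the code point printed on the right is the one that went in.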

It's not a new code point at all:

    % blead uniprops -a 'CJK UNIFIED IDEOGRAPH-547C' 
    U+547C ‹呼› \N{CJK UNIFIED IDEOGRAPH-547C}
	\w \pL \p{L_} \p{Lo}
	All Any Alnum Alpha Alphabetic Assigned InCJK_UnifiedIdeographs CJK_Unified_Ideographs L Lo Gr_Base Grapheme_Base Graph
	   GrBase Han Hani ID_Continue IDC ID_Start IDS Ideo Ideographic Letter L_ Other_Letter Print UIdeo Unified_Ideograph
	   Word XID_Continue XIDC XID_Start XIDS X_POSIX_Alnum X_POSIX_Alpha X_POSIX_Graph X_POSIX_Print X_POSIX_Word
	Age=1.1 Bidi_Class=L Bidi_Class=Left_To_Right BC=L Block=CJK_Unified_Ideographs Canonical_Combining_Class=0
	   Canonical_Combining_Class=Not_Reordered CCC=NR Canonical_Combining_Class=NR Decomposition_Type=None DT=None
	   East_Asian_Width=W East_Asian_Width=Wide EA=W Grapheme_Cluster_Break=Other GCB=XX Grapheme_Cluster_Break=XX Script=Han
	   Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group JG=NoJoiningGroup
	   Joining_Type=Non_Joining JT=U Joining_Type=U Line_Break=ID Line_Break=Ideographic LB=ID Numeric_Type=None NT=None
	   Numeric_Value=NaN NV=NaN Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 Present_In=2.1 IN=2.1 Present_In=3.0 IN=3.0
	   Present_In=3.1 IN=3.1 Present_In=3.2 IN=3.2 Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1 Present_In=5.0 IN=5.0
	   Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2 Present_In=6.0 IN=6.0 SC=Han Script=Hani Sentence_Break=LE
	   Sentence_Break=OLetter SB=LE Word_Break=Other WB=XX Word_Break=XX _X_Begin

Since it was already here in 1.1, I figure charnames just acts
differently now.  Is that right?

--tom
