
Unicode 6.0 Beta available

karl williamson
June 3, 2010 07:24

I downloaded it, and am pleased that our tools worked properly on it,
but I haven't added the new files, nor do I think I should. Three are
provisional, and we have enough to do in supporting official Unicode 
without worrying about provisional properties; the other maps the new 
emoji symbols back to the ordinal values in Japanese telephony that they 
came from.

Also, there were minimal changes to existing data, which seems to 
confirm my contention that the standard is stabilizing.

Here's a summary from the link above:

     * adds 2,087 characters, including
           o many new symbols; chief among them are the new emoji
             symbols, especially important for mobile phones
           o 222 new CJK Unified Ideographs in common use in China and Japan
           o three new scripts: Mandaic, Batak, and Brahmi
     * adds new properties and data files
           o new data file, EmojiSources.txt, which maps the emoji
             symbols to their original Japanese telco source sets
           o two new provisional properties for support of Indic
             scripts: IndicMatraCategory and IndicSyllabicCategory
           o new provisional script extension data for use in
             segmentation, regular expressions, and spoof detection
     * amends the text of the Standard
           o many changes to the core specification, listed in
             D. Textual Changes and Character Additions
           o small clarifications of the conformance clauses in UAX #9,
             The Unicode Bidirectional Algorithm, but no significant
             changes to conformance requirements
           o major editorial revisions of UAX #44, Unicode Character
             Database, and UAX #15, Unicode Normalization Forms, but no
             significant changes to conformance requirements
     * provides format improvements, including
           o charts for CJK Compatibility Ideographs are now laid out in
             a multicolumn format showing sources, comparable to the
             structure of the charts for the CJK Unified Ideographs

There are a number of new characters, not really mentioned above. 
Attached is a diff of the complete list.
