develooper Front page | perl.perl5.porters | Postings from June 2018


Karl Williamson
June 6, 2018 16:55
Version 11.0 of the Unicode Standard is now available, both the core 
specification and data files. Version 11.0 adds 684 characters, for a 
total of 137,374 characters. These additions include seven new scripts, 
for a total of 146 scripts, as well as 145 new emoji.

The new scripts and characters in Version 11.0 add support for 
lesser-used languages and unique written requirements worldwide, including:

  * Georgian Mtavruli capital letters, newly added to support modern
    casing practices
  * Hanifi Rohingya, used to write the modern Rohingya language in
    Southeast Asia
  * Medefaidrin, used for modern liturgical purposes in Africa
  * Mazahua, a Mesoamerican language recognized by law in Mexico
  * Mayan numerals used in printed materials in Central America
  * Historic Sanskrit, Gurmukhi, and the Buryats
  * Five urgently needed CJK unified ideographs: three for chemical
    names and two for Japan's government administration
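
As a rough illustration of the Mtavruli casing support mentioned above,
here is a minimal Python sketch. It assumes the interpreter's bundled
Unicode database is at version 11.0 or later (Python 3.7 and up); on an
older runtime the mapping is absent and the letter comes back unchanged.
The helper names are made up for this example.

```python
import unicodedata

MKHEDRULI_AN = "\u10d0"  # GEORGIAN LETTER AN
MTAVRULI_AN = "\u1c90"   # GEORGIAN MTAVRULI CAPITAL LETTER AN, new in 11.0


def runtime_has_unicode_11():
    """Return True if this interpreter's Unicode data is 11.0 or newer."""
    major, minor = (int(p) for p in unicodedata.unidata_version.split(".")[:2])
    return (major, minor) >= (11, 0)


def to_mtavruli(text):
    """Uppercase text; with Unicode 11 data, Mkhedruli maps to Mtavruli."""
    return text.upper()


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print("U+%04X -> U+%04X" % (ord(MKHEDRULI_AN), ord(to_mtavruli(MKHEDRULI_AN))))
```

Checking runtime_has_unicode_11() first is one way to avoid silently
treating an unmapped letter as a successful conversion.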

Popular symbol additions:

  * Copyleft symbol
  * Half stars for rating systems
  * More astrological symbols
  * Xiangqi Chinese chess symbols
  * New emoji characters including:

🦸 	👨🏽‍🦰
🧸 	🦞
🧨 	🥳

For the full list of emoji characters, see emoji additions for Unicode 
11.0 and Emoji Counts. For a detailed description of support for emoji 
characters by the Unicode Standard, see UTS #51, Unicode Emoji. Version 
11.0 also includes other improvements for emoji handling:

  * a mechanism to request the glyph direction for emoji
  * descriptions of the four new emoji hair components
  * descriptions of gender neutral emoji
  * simplified statements of emoji-related rules for grapheme cluster
    boundaries and for word boundaries.
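
To make the grapheme cluster point concrete, here is a small Python
sketch that takes apart an emoji sequence of the kind shown in the
samples above (a man with medium skin tone and the new red hair
component). The sequence is spelled out with escapes on the assumption
that it matches the sample, and the function name is invented for this
example. It only lists code points; it does not implement the UAX #29
boundary rules.

```python
import unicodedata

# man + medium skin tone modifier + ZERO WIDTH JOINER + red hair component
RED_HAIRED_MAN = "\U0001F468\U0001F3FD\u200D\U0001F9B0"


def code_points(text):
    """Return (hex code point, character name) pairs for each code point."""
    return [
        ("U+%04X" % ord(ch), unicodedata.name(ch, "<unknown to this runtime>"))
        for ch in text
    ]


if __name__ == "__main__":
    for cp, name in code_points(RED_HAIRED_MAN):
        print(cp, name)
    # One user-perceived character, but four code points.
    print(len(RED_HAIRED_MAN))
```

A runtime whose Unicode data predates 11.0 will report the hair
component as unknown, which is why the name lookup uses a fallback.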

Three other important Unicode specifications have been updated for 
Version 11.0:

  * UTS #10, Unicode Collation Algorithm: sorting Unicode text
  * UTS #39, Unicode Security Mechanisms: reducing Unicode spoofing
  * UTS #46, Unicode IDNA Compatibility Processing: compatible
    processing of non-ASCII URLs

Some of the Unicode Standard Annexes have modifications for Unicode 
11.0, often in coordination with changes to character properties. In 
particular, there are changes to:

  * UAX #14, Unicode Line Breaking Algorithm
  * UAX #29, Unicode Text Segmentation
  * UAX #31, Unicode Identifier and Pattern Syntax

The Unicode Standard is the foundation for all modern software and 
communications around the world, including all modern operating systems, 
browsers, laptops, and smartphones, as well as the Internet and Web 
(URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated 
standards, and data form the foundation for CLDR and ICU releases.


All the new characters, including the new emoji, are now available for 
adoption to help the Unicode Consortium’s work on digitally 
disadvantaged languages.
