develooper Front page | perl.beginners | Postings from April 2008

Advice on how to approach character translation

From: R Chandrasekhar
Date: April 23, 2008 02:34
Subject: Advice on how to approach character translation
Dear Folks,

A scheme called ITRANS uses sequences of one to three printable ASCII 
characters to represent, unambiguously, characters in Indic scripts or in a 
Romanized script called IAST. Since the characters in these scripts have 
Unicode code points, it should be possible to automate the translation of 
words in the ASCII source text into the desired Unicode output text.

I am trying to write a Perl script to do this and would appreciate advice on how 
best to proceed before I start.

To give a better picture of what I am trying to do, I have given some examples 
of ASCII-to-IAST character mappings below:

--------
1. Between one and three printable ASCII characters are transliterated to one 
Unicode character.

2. Many characters are unchanged by the transliteration.

3. Some transliteration examples are shown below:

a       a   U+0061   LATIN SMALL LETTER A
aa      ā   U+0101   LATIN SMALL LETTER A WITH MACRON
A       ā   U+0101   LATIN SMALL LETTER A WITH MACRON
.a      '   U+0027   APOSTROPHE
~N      ṅ   U+1E45   LATIN SMALL LETTER N WITH DOT ABOVE
RRI     ṝ   U+1E5D   LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
R^I     ṝ   U+1E5D   LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
--------
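
One possible starting point, offered only as a rough sketch: keep the mappings 
in a hash, build a single regex alternation with the longest ASCII sequences 
first (so that "aa" is tried before "a"), and substitute from left to right. 
The hash %iast_for and the sub itrans_to_iast are hypothetical names, and the 
table holds only the mappings listed above, not the full ITRANS scheme. 
Anything not in the table passes through unchanged, which covers point 2.

```perl
use strict;
use warnings;
use utf8;    # the comments below contain literal UTF-8 characters

# Hypothetical, partial table: only the mappings quoted in the examples.
# "a" maps to itself, so it needs no entry.
my %iast_for = (
    'aa'  => "\x{0101}",    # ā
    'A'   => "\x{0101}",    # ā
    '.a'  => "'",           # apostrophe
    '~N'  => "\x{1E45}",    # ṅ
    'RRI' => "\x{1E5D}",    # ṝ
    'R^I' => "\x{1E5D}",    # ṝ
);

# Perl tries alternatives in the order written, so sorting the keys
# longest-first makes the longest sequence win at each position.
my $pattern = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) || $a cmp $b }
    keys %iast_for;
my $token_re = qr/($pattern)/;

# Replace every known ASCII sequence; all other characters are left alone.
sub itrans_to_iast {
    my ($text) = @_;
    $text =~ s/$token_re/$iast_for{$1}/g;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';    # emit the result as UTF-8
print itrans_to_iast('raama'), "\n";   # prints "rāma"
```

A greedy left-to-right match like this is an assumption about how ambiguous 
input should be split; a complete ITRANS table may need extra rules.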

Many thanks.

Chandra
