perl.perl5.porters | Postings from September 2005

Re: [perl #36953] Uppercase & Lowercase is not working on Turkish Characters

From: Dominic Dunlop
Date: September 30, 2005 11:52
Subject: Re: [perl #36953] Uppercase & Lowercase is not working on Turkish Characters
Message ID: 572693E2-A74D-4348-9A22-EFFAD4DCDBE8@mac.com

On 2005-09-30, at 14:57, Guest via RT wrote:

> Is there any update on this? Maybe someone can tell me where to look to
> fix this problem.

Following the thread of mails for this bug report shows you can get

ABĞI abğii̇

which is correct, except that you want to lower-case the final LATIN
CAPITAL LETTER I WITH DOT ABOVE in your input string to LATIN SMALL
LETTER DOTLESS I. Although I make no claim whatever to being an
expert, this mapping is problematic: see
<http://www.unicode.org/Public/UNIDATA/CaseFolding.txt>, where this
issue has a special case all of its own. In particular, the document says

#    The mappings with status T [special case for uppercase I and
#    dotted uppercase I] can be used or omitted depending on the
#    desired case-folding behavior. (The default option is to exclude
#    them.)

Perl -- directed by the locale supplied by your system -- seems to be  
excluding this case, but instead implementing the full case-folding  
specified in the document by delivering LATIN SMALL LETTER I followed  
by COMBINING DOT ABOVE.
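
The default behaviour described above can be checked directly. What
follows is only a minimal sketch, assuming a Unicode-aware perl with no
"use locale" in effect; the helper name codepoints is made up for this
illustration and is not part of perl.

```perl
use strict;
use warnings;

# Hypothetical helper (not a perl built-in): render a string as a
# space-separated list of U+XXXX code points.
sub codepoints {
    my ($str) = @_;
    return join ' ', map { sprintf 'U+%04X', ord } split //, $str;
}

# U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE.
my $dotted_capital_i = "\x{130}";

# With no Turkish tailoring, lc() applies the default full mapping.
print codepoints(lc $dotted_capital_i), "\n";
```

On a perl that behaves as described in this thread, this should print
U+0069 U+0307, that is, LATIN SMALL LETTER I followed by COMBINING DOT
ABOVE.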

I can think of two ways that you can get the case-folding that you want:

1. Find or create a locale definition that does case conversion the  
way you want it. I fear that your Linux system probably does not have  
one (but see what  locale -a | grep tr  throws up). I have not been  
able to find such a definition on the Internet, but then I was using  
English keywords for the search -- using Turkish might be more  
rewarding.

2. Have perl do full case folding, then fix up the special cases with  
regex substitutions.
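
Option 2 might look something like the sketch below. It is an
illustration only: the names tr_lc and tr_uc are invented here, and it
assumes the usual Turkish pairing (I with dotless i, dotted capital I
with i). Note that it applies the substitutions before calling lc or uc
rather than afterwards, because once lc has run, an i that came from I
can no longer be told apart from one that was lowercase all along.

```perl
use strict;
use warnings;

# Hypothetical helper: Turkish-style lowercasing.
# Map the two special capitals first, then let lc() do the rest.
sub tr_lc {
    my ($str) = @_;
    $str =~ s/\x{130}/i/g;        # I WITH DOT ABOVE -> i
    $str =~ s/I/\x{131}/g;        # I                -> DOTLESS I
    return lc $str;
}

# Hypothetical helper: Turkish-style uppercasing, the mirror image.
sub tr_uc {
    my ($str) = @_;
    $str =~ s/i/\x{130}/g;        # i         -> I WITH DOT ABOVE
    $str =~ s/\x{131}/I/g;        # DOTLESS I -> I
    return uc $str;
}

# Example input: A, B, G WITH BREVE, I, I WITH DOT ABOVE.
my $lower = tr_lc("AB\x{11E}I\x{130}");
printf "%vX\n", $lower;           # code points of the result, in hex
```

Input that already contains a decomposed I followed by COMBINING DOT
ABOVE is not handled by this sketch and would need a further
substitution.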

Better ideas, anybody?
-- 
Dominic Dunlop

