Re: How to convert from ANSEL/MARC-8 to UTF-8?

From: Galen Charlton
Date: January 7, 2009 08:47
Subject: Re: How to convert from ANSEL/MARC-8 to UTF-8?
Message ID: 4659947d0901070847h707093c1m96612ccc5a78752b@mail.gmail.com

Hi,

On Wed, Jan 7, 2009 at 11:42 AM, Michael Lackhoff
<lackhoff@fh-muenster.de> wrote:
> diacritics + base char to the combined character. So I still have two
> characters for e.g. the German umlauts. This might be correct UTF-8 but
> is not usable to present in (X)HTML.
> Is there any other option short of doing it by hand with lots of s///
> for at least the most common combinations?

You can use NFC() from Unicode::Normalize to do this (after using
MARC::Charset to do the conversion to UTF-8).
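
As a minimal sketch of the normalization step: the sample string below is made up and merely stands in for what a MARC-8 conversion would hand back (for example from marc8_to_utf8() in MARC::Charset), and the variable names are only for illustration. It assumes the converted value is a Perl character string; if you have raw UTF-8 bytes, decode them first.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# Hypothetical stand-in for converted MARC data: "u" followed by
# COMBINING DIAERESIS (U+0308), i.e. base character + diacritic.
my $decomposed = "Mu\x{0308}nster";

# NFC composes base character + combining mark into the single
# precomposed character where one exists (here U+00FC).
my $composed = NFC($decomposed);

binmode STDOUT, ':encoding(UTF-8)';
printf "%d characters before, %d after\n",
    length($decomposed), length($composed);
print "$composed\n";
```

Combinations that have no precomposed form in Unicode are left as base character plus combining mark, so NFC will not collapse every possible sequence.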

Regards,

Galen
-- 
Galen Charlton
VP, Research & Development, LibLime
galen.charlton@liblime.com
p: 1-888-564-2457 x709
skype: gmcharlt
