develooper Front page | perl.perl4lib | Postings from July 2008

Re: Biblio::Isis and character encoding

From: Dobrica Pavlinusic
Date: July 15, 2008 02:41
Subject: Re: Biblio::Isis and character encoding
On Mon, Jul 14, 2008 at 09:14:50AM +0200, Emmanuel Di Pretoro wrote:
> Hi,
> 
> Currently I'm trying to convert an ISIS database to MARC21. So I use
> Biblio::Isis and MARC::Record to do that. No problem with this conversion,
> except for some weird character encoding problems. Some bibliographic
> records are written in French, and accented characters like 'é' are
> displayed as '<82>'.
> 
> I've tried to use some Encode::* modules (Encode, Encode::Guess,
> Encode::Detect, Encode::First), but without success.
> 
> Is there anybody who has had this kind of problem? Is there a solution?

Biblio::Isis doesn't have any support for encoding. It returns
content in the original encoding from ISIS. This is intentional, because our
local encoding was really weird.

In our project WebPAC (which was the reason to write Biblio::Isis in the
first place :-) we are using Encode's from_to and/or decode to convert our local
encoding to UTF-8, which MARC::Record (2.0 and newer) handles well.
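
A minimal sketch of that kind of conversion, not taken from WebPAC itself.
It assumes the ISIS data is stored in CP850 (DOS Latin-1), which is only a
guess based on 'é' showing up as byte 0x82; replace $source_encoding with
whatever your database really uses. The sample value in $raw is hypothetical
and stands in for a field returned by Biblio::Isis.

```perl
use strict;
use warnings;
use Encode qw(decode from_to);

# Assumption: the database is CP850, where byte 0x82 is 'é'.
my $source_encoding = 'cp850';

# Hypothetical raw field value, as bytes in the original ISIS encoding.
my $raw = "caf\x82";

# decode() turns the raw bytes into a Perl character string.
my $text = decode($source_encoding, $raw);

# from_to() instead converts a byte string in place, here to UTF-8 bytes.
my $bytes = $raw;
from_to($bytes, $source_encoding, 'utf-8');

# Encode characters as UTF-8 on output.
binmode STDOUT, ':encoding(UTF-8)';
print "$text\n";
```

decode gives you a character string that you can pass on to code expecting
Perl characters, while from_to is handy when you just need UTF-8 bytes to
write out directly.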

See http://webpac.us/ for documentation or this snippet:
http://svn.rot13.org/index.cgi/webpac2/view/trunk/lib/WebPAC/Output/MARC.pm

p.s. WebPAC(2) is really a universal conversion tool (data mangler :-) but
it might be overkill for your purpose (or not). It also includes
several DSLs (domain-specific languages) based on Perl to massage data
before producing output.

HTH.

-- 
Dobrica Pavlinusic               2share!2flame            dpavlin@rot13.org
Unix addict. Internet consultant.             http://www.rot13.org/~dpavlin



