
MARC::Record, XML, Koha and utf-8

From: Paul POULAIN
Date: January 4, 2006 06:31
Subject: MARC::Record, XML, Koha and utf-8
Message ID: 43BBDC46.8000001@free.fr

Hello all,

Koha v3 will definitely be utf-8 (and will use Index Data's Zebra as the
backend for retrieving records, plus many other great features).

Most of the utf-8 work seems to be going quite well:
* templates moved to utf-8 (http-equiv)
* translation strings moved to utf-8 (iconv)
* MySQL database (4.1) moved to utf-8 too (some tricks here...)

The last question I have is: how to handle MARC records?

At the moment, in CVS, we have 2 columns in the DB: one with the raw MARC
record (iso2709, MARC::Record->as_usmarc()) and the other with the XML
version of the record (using MARC::File::XML). Some MARC data (title,
author, publisher...) is duplicated in specific columns of the database to
avoid decoding a large record when only 1 or 2 pieces of information are
needed!
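
To give a rough idea of what "decoding a record" involves, here is a
minimal sketch of the iso2709 layout: a 24-byte leader, a directory of
12-byte entries, then the field data. It is written in Python with only
the standard library and is not how MARC::Record does it; build_record
and read_field are hypothetical helper names, and the leader values are
MARC 21 style placeholders chosen for illustration.

```python
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
SF = b"\x1f"  # subfield delimiter


def build_record(fields):
    """Serialize [(tag, body_bytes), ...] into one iso2709 record."""
    directory = b""
    data = b""
    for tag, body in fields:
        chunk = body + FT
        # Directory entry: tag (3), field length (4), start offset (5).
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(chunk), len(data))
        data += chunk
    directory += FT
    base = 24 + len(directory)        # base address of the field data
    total = base + len(data) + 1      # +1 for the record terminator
    # Placeholder leader (assumption): only length and base address matter here.
    leader = b"%05dnam a22%05d   4500" % (total, base)
    return leader + directory + data + RT


def read_field(raw, wanted_tag):
    """Return the raw bytes of every field carrying the given tag."""
    base = int(raw[12:17])            # leader positions 12-16
    directory = raw[24:base - 1]      # drop the directory's terminator
    found = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        if tag == wanted_tag:
            # The stored length includes the trailing field terminator.
            found.append(raw[base + start:base + start + length - 1])
    return found


record = build_record([("001", b"12345"), ("245", b"10" + SF + b"aKoha")])
title_fields = read_field(record, "245")
```

All lengths and offsets in the directory are byte counts, so changing the
character encoding of a field means rebuilding the directory and the
leader, not just re-encoding the text.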


I see at least 2 solutions to my problem, and would be happy to get some
suggestions:
1- leave the raw record in MARC-8 and in iso2709, and encode values in
utf-8 when needed (= when being displayed). That seems a poor solution to
me for 2 reasons:
- The encoding/decoding will have to be done many, many times.
- iso2709 is a poor, old, binary format.

2- get rid of iso2709 and use only XML, which is natively utf-8. I see one
problem with this: when a record has to be exported, the iso2709 must be
in MARC-8. Exports may happen rarely or often, for example if the library
runs an open z3950 server! (in this case, the record should be sent in
MARC-8, if I'm not mistaken)

The second solution seems better to me, as we can do many more things with
XML than with iso2709!
The difficulty is making sure that MARC::Record will handle such records
correctly, as Koha makes *heavy use* of MARC::Record.
(You may think that using MARC::Record is a poor choice if the data is
stored in XML! You are right there. But:
- you can consider MARC as XML with only 3 levels
- you can move from MARC to XML and from XML to MARC, so that could be a
1st step towards going 100% XML
- did I say Koha makes heavy use of MARC::Record? Thus removing it is
really a problem!
)
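
The "3 levels" idea (record, field, subfield) and the round trip between
the two forms can be sketched as follows. This is only an illustration in
Python's standard library, assuming the MARCXML "slim" namespace; it is
not the MARC::File::XML implementation, fields_from_xml and fields_to_xml
are hypothetical names, and the leader and control fields are left out
for brevity.

```python
import xml.etree.ElementTree as ET

NS = "http://www.loc.gov/MARC21/slim"  # MARCXML "slim" namespace


def fields_from_xml(xml_text):
    """Flatten record > datafield > subfield into nested Python tuples."""
    root = ET.fromstring(xml_text)
    fields = []
    for df in root.findall("{%s}datafield" % NS):
        subs = [(sf.get("code"), sf.text or "")
                for sf in df.findall("{%s}subfield" % NS)]
        fields.append((df.get("tag"), df.get("ind1"), df.get("ind2"), subs))
    return fields


def fields_to_xml(fields):
    """Rebuild a MARCXML record element from the nested tuples."""
    root = ET.Element("{%s}record" % NS)
    for tag, ind1, ind2, subs in fields:
        attrs = {"tag": tag, "ind1": ind1, "ind2": ind2}
        df = ET.SubElement(root, "{%s}datafield" % NS, attrs)
        for code, value in subs:
            ET.SubElement(df, "{%s}subfield" % NS, {"code": code}).text = value
    return ET.tostring(root, encoding="unicode")


sample = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<datafield tag="200" ind1="1" ind2=" ">'
    '<subfield code="a">Les mis\u00e9rables</subfield>'
    '<subfield code="f">Victor Hugo</subfield>'
    '</datafield></record>'
)
fields = fields_from_xml(sample)
round_trip = fields_from_xml(fields_to_xml(fields))
```

Nothing in this structure depends on the byte encoding, which is why the
same in-memory record can be written out either as XML or as iso2709.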

Let me know what you think; I will continue to investigate.
-- 
Paul POULAIN
Consultant indépendant en logiciels libres
responsable francophone de koha (SIGB libre http://www.koha-fr.org)
