
Re: MARC::Record, XML, Koha and utf-8 - JIT iso2709

From: Enrico Silterra
Date: January 12, 2006 10:30
Subject: Re: MARC::Record, XML, Koha and utf-8 - JIT iso2709
Message ID: 5.2.1.1.2.20060112132247.03125620@postoffice10.mail.cornell.edu

Paul, I am sure other people have suggested this:
My personal inclination is to use the XML as the canonical representation,
and make sure that it is always present and correct, together with a creation
date/time.

If iso2709 is needed, then make the translation, and then cache the
translation together with its creation date/time.

That way you can always tell if the iso2709 is out of date, and it will be
created only on demand.
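
A minimal sketch of that on-demand cache, written in Python for brevity rather
than Perl. All names here (StoredRecord, get_iso2709, update_xml) are
hypothetical, and the convert callback is only a placeholder for whatever
XML-to-iso2709 translation is actually used; nothing below is Koha's schema or
the MARC::Record API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class StoredRecord:
    """Canonical XML plus an optional cached iso2709 copy (hypothetical layout)."""
    xml: str
    xml_created: datetime
    iso2709: Optional[bytes] = None
    iso2709_created: Optional[datetime] = None


def iso2709_is_stale(rec: StoredRecord) -> bool:
    """The cache is stale if it is missing or older than the canonical XML."""
    if rec.iso2709 is None or rec.iso2709_created is None:
        return True
    return rec.iso2709_created < rec.xml_created


def get_iso2709(rec: StoredRecord,
                convert: Callable[[str], bytes],
                now: Optional[datetime] = None) -> bytes:
    """Return the cached iso2709, translating from XML only when needed."""
    if iso2709_is_stale(rec):
        # convert is an assumed placeholder for the real XML -> iso2709 step.
        rec.iso2709 = convert(rec.xml)
        rec.iso2709_created = now or datetime.now(timezone.utc)
    return rec.iso2709


def update_xml(rec: StoredRecord, new_xml: str,
               now: Optional[datetime] = None) -> None:
    """Replace the canonical XML; the old cache becomes stale by timestamp."""
    rec.xml = new_xml
    rec.xml_created = now or datetime.now(timezone.utc)
```

Note that update_xml never touches the cached copy: the timestamps alone decide
whether the next export has to redo the translation.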

I guess if you expected the z39.50 interface to be the most heavily used
(is that possible?), you could do the work the other way around: keep the
iso2709 as the canonical copy and cache the XML.

Just my 2 cents,
Rick Silterra



At 03:31 PM 1/4/2006 +0100, Paul POULAIN wrote:
>Hello all,
>
>Koha v3 will definitely be utf-8 (+ use Index Data's Zebra as backend for
>retrieving records + many other great features).
>
>Most of the stuff for utf-8 seems to be working quite well :
>* templates moved to utf-8 (http-equiv)
>* translation strings moved to utf-8 (iconv)
>* MySQL database (4.1) moved to utf-8 too (some tricks here...)
>
>The last question I have is : how to handle MARC records ?
>
>For now, in CVS, we have 2 columns in the DB: one with the raw MARC record
>(iso2709, MARC::Record->as_usmarc()) and the other with the XML version
>of the record (using MARC::File::XML). Some MARC data (title, author,
>publisher...) is duplicated in specific columns of the database to avoid
>decoding a large record when only 1 or 2 fields are needed!
>
>
>I see at least 2 solutions to my problem, and would be happy to get
>some suggestions:
>1- leave the raw record in MARC-8 and in iso2709, and encode values in utf8
>when needed (= when being shown). That seems a poor solution to me for 2 reasons:
>- The encoding/decoding will have to be done many many times.
>- iso2709 format is a poor, old and binary format.
>
>2- get rid of iso2709 and use only XML, which is utf-8 natively. I see one
>problem with this: when the record has to be exported, the iso2709 must
>be in MARC-8. Exports may happen rarely, or often if the library has
>an open z3950 server! (in this case, the record should be sent in MARC-8
>if I'm not mistaken)
>
>The second solution seems better to me, as we can do many things with XML, 
>more than with iso2709 !
>The difficulty being to be sure that MARC::Record will handle them 
>correctly, as Koha makes *heavy use* of MARC::Record.
>(You may think that using MARC::Record is a poor choice if the data is stored
>in XML! You are right there. But:
>- you can consider MARC as XML with only 3 levels
>- you can move from MARC to XML and from XML to MARC, so that could be a 
>1st step to go 100% XML
>- did I say Koha makes heavy use of MARC::Record ? Thus removing it is 
>really a problem !
>)
>
>Let me know what you think; I will continue to investigate.
>--
>Paul POULAIN
>Consultant indépendant en logiciels libres
>responsable francophone de koha (SIGB libre http://www.koha-fr.org)

******************************
Enrico Silterra
Meta Data Engineer
107-E Olin Library
Cornell University
Ithaca NY 14853

Voice: 607-255-6851
Fax:     607-255-6110
E-mail: es287@cornell.edu
http://www.library.cornell.edu/cts/
******************************  

