Re: MARC::Record, XML, Koha and utf-8

Mike Rylander
January 4, 2006 10:30
On 1/4/06, Ed Summers wrote:
> I would opt for #2. There is a new version of MARC::Charset available
> which should ease the marc8 <-> utf8 charset translation. Shortly
> there will be a new MARC::File::XML that uses the latest MARC::Charset
> You might be interested in taking a look at how Evergreen is storing
> MARC data. I know that they are using a modded version of
> MARC::File::XML in some capacity. Mike Rylander is on this list, so
> maybe he'll chime in--or otherwise you could drop into
> irc:// and ask him (he's usually there).


We store all MARC as MARCXML and use the LoC MODS XSLT to extract
displayable (and hence indexable) data.  Of course that particular
stylesheet is only useful for MARC21 data, and other MARC variants
would require their own stylesheets, I believe.
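
To give a rough idea of what "extracting displayable data" from MARCXML
looks like, here is a minimal sketch. It is not what Evergreen does (we
go through the MODS stylesheet, as described above); it just pulls
subfields straight out of a record using Python's standard library. The
sample record, the helper name subfields(), and the choice of fields
100$a and 245$a are all made up for illustration; the only real
constant is the MARC21 slim namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") records.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical sample record, invented for this example.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-1</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
    <subfield code="c">Jane Doe.</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` in every datafield `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text or "" for sf in record.findall(path, MARC_NS)]


record = ET.fromstring(SAMPLE)
print("title: ", subfields(record, "245", "a"))
print("author:", subfields(record, "100", "a"))
```

A stylesheet does the same kind of selection declaratively, which is
why a different MARC variant (with different tag meanings) would need
its own mapping.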

As of today, we're using the marcdump utility from the YAZ toolkit for
transforming MARC-8 encoded iso2709 into UTF-8 encoded MARCXML, but it
has some ... issues ... that I believe the new MARC::Record and
MARC::File::XML will fix.

Our "import via Z39.50" facility is using MARC::Record and our locally
modified MARC::File::XML that uses the old MARC::Charset.  It works
OK, but I think the new MARC::Charset is going to be a real boon to

In any case, I second Ed's suggestion that you move to MARCXML as much
as possible.  XML is /so/ much more flexible (read: there are many
more tools for it) than iso2709, not to mention that it creates a much
lower barrier to entry for those wanting to hack on the guts of the
system.  Also, unless I'm mistaken, you can use Yaz Proxy to transform
MARCXML into iso2709 were you to use that as your external Z39.50

Mike Rylander
GPLS -- PINES Development
Database Developer