perl.perl4lib
http://www.nntp.perl.org/group/perl.perl4lib/

Requesting help with simple MARC::File::XML program by Anne L. Highsmith

From: Anne L. Highsmith

I have a file of usmarc records which I want to read into a program and print to a file as MARC xml. Here's my program so far:

########################################################
#!/usr/local/bin/perl
use strict;
use warnings;
use MARC::Record;
use MARC::Batch;
use MARC::File::XML;

my $infile = 'updated_dissertation_records';
my $file = MARC::File::XML->out('updated_dissertation_records.xml', 'UTF-8' );

my $batch = MARC::Batch->new( 'USMARC', $infile);
for (my $i = 0; $i < 3; $i++) {
        my $record = $batch->next();
        $file->write($record);
}
########################################################

First question -- I can't get past the my $file = MARC::File::XML->out('updated_dissertation_records.xml', 'UTF-8' );
statement.  When I run the program, that line gets the error:
usage $fh->binmode([LAYER]) at /usr/local/perl/5.8/lib/site_perl/5.8.7/MARC/File/XML.pm line 195

Two -- does the "$file->write()" statement expect a MARC::Record object, an XML stream, or a marc record string? I've assumed a MARC::Record object. If it expects an XML stream, what is the best method for what I'm trying to do to get my marc record into an appropriate XML stream?

2008-07-20T20:20:05Z
Re: Biblio::Isis and character encoding by Dobrica Pavlinusic

From: Dobrica Pavlinusic

On Mon, Jul 14, 2008 at 09:14:50AM +0200, Emmanuel Di Pretoro wrote:
> Hi,
>
> Currently I'm trying to convert an ISIS database to MARC21. So I use
> Biblio::Isis and MARC::Record to do that. No problem with this conversion,
> except for some weird character encoding problems. Some bibliographic
> records are in written in french, and accentuated characters like 'é' are
> display as '<82>'.
>
> I've tried to use some Encode::* modules (Encode, Encode::Guess,
> Encode::Detec, Encode::First), but without success.
>
> Is there anybody who have this kind of problem? Is there a solution?

Biblio::Isis doesn't have any support for encoding. It will return
content with original encoding from ISIS. This is intentional, because our
local encoding was really wired.

In our project WebPAC (which was reason to write Biblio::Isis in the
first place :-) we are using Encode's from_to and/or decode to convert our local
encoding to utf-8 which MARC::Record (2.0 and newer) handles well.

See http://webpac.us/ for documentation or this snippet:
http://svn.rot13.org/index.cgi/webpac2/view/trunk/lib/WebPAC/Output/MARC.pm

p.s. WebPAC(2) is really universal conversion tool (data mangler :-) but
it might be overkill for your purpose (or not). It also includes
several DSL [domain specific languages] based on perl to massage data
before producing output.

HTH.

--
Dobrica Pavlinusic               2share!2flame            dpavlin@rot13.org
Unix addict. Internet consultant.             http://www.rot13.org/~dpavlin

2008-07-15T02:41:26Z
RE: Biblio::Isis and character encoding by Doran, Michael D

From: Doran, Michael D

Hi Emmanuel,

> I'm trying to convert an ISIS database to MARC21

What is the character set encoding of the data in the ISIS database?

What is the desired character set encoding for the MARC21 records? I.e. MARC-8 or MARC Unicode(UTF-8)?

If they are dissimilar character encodings, is the data undergoing a character set conversion?

-- Michael

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# doran@uta.edu
# http://rocky.uta.edu/doran/


> -----Original Message-----
> From: Emmanuel Di Pretoro [mailto:edipretoro@gmail.com]
> Sent: Monday, July 14, 2008 2:15 AM
> To: perl4lib@perl.org
> Subject: Biblio::Isis and character encoding
>
> Hi,
>
> Currently I'm trying to convert an ISIS database to MARC21. So I use
> Biblio::Isis and MARC::Record to do that. No problem with this conversion,
> except for some weird character encoding problems. Some bibliographic
> records are in written in french, and accentuated characters like 'é' are
> display as '<82>'.
>
> I've tried to use some Encode::* modules (Encode, Encode::Guess,
> Encode::Detec, Encode::First), but without success.
>
> Is there anybody who have this kind of problem? Is there a solution?
>
> Thanks in advance.
>
> Regards,
>
> Emmanuel Di Pretoro

2008-07-14T12:44:07Z
Biblio::Isis and character encoding by Emmanuel Di Pretoro

From: Emmanuel Di Pretoro

Hi,

Currently I'm trying to convert an ISIS database to MARC21. So I use
Biblio::Isis and MARC::Record to do that. No problem with this conversion,
except for some weird character encoding problems. Some bibliographic
records are in written in french, and accentuated characters like 'é' are
display as '<82>'.

I've tried to use some Encode::* modules (Encode, Encode::Guess,
Encode::Detec, Encode::First), but without success.

Is there anybody who have this kind of problem? Is there a solution?

Thanks in advance.

Regards,

Emmanuel Di Pretoro

2008-07-14T00:14:57Z
Problems installing Yaz 3.0.34 prior to installing pazpar2 by Christopher Morgan

From: Christopher Morgan

Sorry - This isn't a Perl issue - I have posted it to the Yaz list instead.
- Chris

I installed Yaz 3.0.34 today and I got the following configuration:

YAZ Package:                yaz
  YAZ Version:                3.0.34
  Bugreport:                  yaz-help@indexdata.dk
  Source code location:       .
  C Preprocessor:             gcc -E
  C Preprocessor flags:
  C Compiler:                 gcc
  C Compiler flags:           -g -O2
  Linker flags:
  Linked libs:                -L/usr/lib -lxslt -lxml2 -lz -lpthread -lm
  Host System Type:           i686-pc-linux-gnu
  Install path:               /usr/local
  Automake:                   ${SHELL}
/home/webadmin/yaz-3.0.34/config/missing --run     automake-1.10
  Archiver:                   ar
  Ranlib:                     ranlib

However, I got this error message:

configure: WARNING: libEXSLT development libraries not found.

Since I want to try installing pazpar2 -- and I believe these libraries are
necessary to do that --(?), I uninstalled Yaz using "make uninstall",
removed the remaining yaz directories, and then tried installing
libxslt-devel and libxml2-devel via rpms, but was unable to. libxml2 seemed
to install correctly, for example, but libxml2-devel wouldn't install,
complaining that I hadn't installed libxml2 (!)

When I tried to reinstall Yaz, I got this error message during make:

marcdump.o(.text+0x12d): In function `marcdump_read_xml':
/home/webadmin/yaz-3.0.34/util/marcdump.c:112: undefined reference to
`xmlReaderForFile'
collect2: ld returned 1 exit status
make[1]: *** [yaz-marcdump] Error 1
make[1]: Leaving directory `/home/webadmin/yaz-3.0.34/util'
make: *** [all-recursive] Error 1

I suspect something got corrupted, so I am going to reset the VPS and try
again. Does anyone have any suggestions about which development libraries to
install before installing Yaz, if you plan to install pazpar2? Are there any
tarballs of the libraries available online rather than rpms? I'm also not
clear about which versions of the development libraries to install -- there
are many different versions available.

Thanks!

- Chris Morgan

2008-07-10T11:58:15Z
Problems installing Yaz 3.0.34 prior to installing pazpar2 by Christopher Morgan

From: Christopher Morgan

I installed Yaz 3.0.34 today and I got the following configuration:

YAZ Package:                yaz
  YAZ Version:                3.0.34
  Bugreport:                  yaz-help@indexdata.dk
  Source code location:       .
  C Preprocessor:             gcc -E
  C Preprocessor flags:
  C Compiler:                 gcc
  C Compiler flags:           -g -O2
  Linker flags:
  Linked libs:                -L/usr/lib -lxslt -lxml2 -lz -lpthread -lm
  Host System Type:           i686-pc-linux-gnu
  Install path:               /usr/local
  Automake:                   ${SHELL}
/home/webadmin/yaz-3.0.34/config/missing --run     automake-1.10
  Archiver:                   ar
  Ranlib:                     ranlib

However, I got this error message:

configure: WARNING: libEXSLT development libraries not found.

Since I want to try installing pazpar2 -- and I believe these libraries are
necessary to do that --(?), I uninstalled Yaz using "make uninstall",
removed the remaining yaz directories, and then tried installing
libxslt-devel and libxml2-devel via rpms, but was unable to. libxml2 seemed
to install correctly, for example, but libxml2-devel wouldn't install,
complaining that I hadn't installed libxml2 (!)

When I tried to reinstall Yaz, I got this error message during make:

marcdump.o(.text+0x12d): In function `marcdump_read_xml':
/home/webadmin/yaz-3.0.34/util/marcdump.c:112: undefined reference to
`xmlReaderForFile'
collect2: ld returned 1 exit status
make[1]: *** [yaz-marcdump] Error 1
make[1]: Leaving directory `/home/webadmin/yaz-3.0.34/util'
make: *** [all-recursive] Error 1

I suspect something got corrupted, so I am going to reset the VPS and try
again. Does anyone have any suggestions about which development libraries to
install before installing Yaz, if you plan to install pazpar2? Are there any
tarballs of the libraries available online rather than rpms? I'm also not
clear about which versions of the development libraries to install -- there
are many different versions available.

Thanks!

- Chris Morgan

2008-07-09T12:57:58Z
Re: Problem installing MARC::Record 2.0.0 under perl 5.8.0 by Brad Baxter

From: Brad Baxter

On Tue, Jul 8, 2008 at 4:11 PM, Christopher Morgan <morgan@acm.org> wrote:

> So there is hope! But, yes, I see the need to get to 5.8.2 asap!
>

or 5.10.0  :-)

2008-07-08T14:00:37Z
RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0 by Christopher Morgan

From: Christopher Morgan

Brian,

Thanks very much. I'll try that version.

- Chris


-----Original Message-----
From: Bryan Baldus [mailto:bryan.baldus@quality-books.com]
Sent: Tuesday, July 08, 2008 2:31 PM
To: Christopher Morgan; perl4lib@perl.org
Subject: RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0

On Tuesday, July 08, 2008 12:35 PM, Christopher Morgan wrote:
>I am in the process of rebuilding my web site after a phishing site
>break-in (yikes!). The site is fine now, and secure, but for some
>reason I can't get MARC::Record-2.0.0 to install. I get an error
>message saying that perl 5.8.2 is required, but that I only have perl
>5.8.0. (And indeed I do have perl 5.8.0) But I'm pretty sure this
>version of MARC::Record *did* install under perl 5.8.0 that last time
>I tried.<

MARC::Record 1.39_02 appears to be the latest version on CPAN that would
work on 5.8.0. MARC::Record 2.x is incompatible with pre-5.8.2 versions of
Perl due to Unicode-related changes. The change was announced in a Perl4Lib
message "MARC::Record v2.0 RC1", sent Fri 5/20/2005 2:35 PM, by Ed Summers.
[1]

[1] <http://www.nntp.perl.org/group/perl.perl4lib/2005/05/msg2070.html>

I hope this helps,

Bryan Baldus
bryan.baldus@quality-books.com
eijabb@cpan.org
http://home.inwave.com/eija

2008-07-08T13:26:21Z
RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0 by Christopher Morgan

From: Christopher Morgan

>I sure hope you meant upgrading to Perl 5.8.2 (or higher) rather than
>downgrading to MARC::Record 1.39_02.  ;-)

Michael,

I wish (:->)! Unfortunately, I'm stuck with Perl 5.8.0 because my VPS
(virtual private server) at Apollo doesn't offer 5.8.2 - yet. Oh well. I had
been succesfully using MARC::Charset with MARC::Record 1.39_2 up until my
web site implosion, and the Unicode was working fine, including the marc8 ->
utf8 conversions from Marc records and their subsequent display in browsers.
So there is hope! But, yes, I see the need to get to 5.8.2 asap!

- Chris

2008-07-08T13:14:33Z
RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0 by Christopher Morgan

From: Christopher Morgan

Bryan,

Problem solved. At your suggestion, I installed MARC::Record 1.39_02, and
it's working fine.

I was wrong about having installed version 2.0 in the past. When I looked at
my old notes, I saw that I wasn't able to install version 2.0 because of
Perl 5.8.0. - which makes sense. That's what I get for leaving a tarball of
Marc::Record 2.0 in my backup directory (:->)

Thanks again for your help!

- Chris

2008-07-08T12:52:20Z
RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0 by Doran, Michael D

From: Doran, Michael D

Hi Chris,

> I'll try that version.

I sure hope you meant upgrading to Perl 5.8.2 (or higher) rather than downgrading to MARC::Record 1.39_02. ;-)

This is just my un-asked for 2 cents, but I wouldn't stint on anything that will make the processing of Unicode-encoded text easier. Last December seemed to mark a tipping point for Unicode, both on the internet:

"Just last December [2007] there was an interesting milestone
on the web. For the first time, we found that Unicode was the
most frequent encoding found on web pages, overtaking both
ASCII and Western European encodings" [1]

...as well as for its use in MARC records:

"To facilitate the movement of records between MARC-8 and Unicode
environments, it was recommended for an initial period that the use of
Unicode be restricted to a repertoire identical in extent to the MARC-8
repertoire. [...] however, such a restriction is no longer appropriate.
The full UCS repertoire, as currently defined at the Unicode web site,
is valid for encoding MARC 21 records subject only to the constraints
described [in the current MARC 21 Specifications]." [2]

-- Michael

[1] The Official Google Blog: "Moving to Unicode 5.1"
http://googleblog.blogspot.com/2008/05/moving-to-unicode-51.html

[2] MARC 21 Specifications: Unicode Encoding Environment
(revised December 2007)
http://www.loc.gov/marc/specifications/speccharucs.html

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# doran@uta.edu
# http://rocky.uta.edu/doran/


> -----Original Message-----
> From: Christopher Morgan [mailto:morgan@acm.org]
> Sent: Tuesday, July 08, 2008 2:12 PM
> To: 'Bryan Baldus'; perl4lib@perl.org
> Subject: RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0
>
> Brian,
>
> Thanks very much. I'll try that version.
>
> - Chris
>
>
> -----Original Message-----
> From: Bryan Baldus [mailto:bryan.baldus@quality-books.com]
> Sent: Tuesday, July 08, 2008 2:31 PM
> To: Christopher Morgan; perl4lib@perl.org
> Subject: RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0
>
> On Tuesday, July 08, 2008 12:35 PM, Christopher Morgan wrote:
> >I am in the process of rebuilding my web site after a phishing site
> >break-in (yikes!). The site is fine now, and secure, but for some
> >reason I can't get MARC::Record-2.0.0 to install. I get an error
> >message saying that perl 5.8.2 is required, but that I only have perl
> >5.8.0. (And indeed I do have perl 5.8.0) But I'm pretty sure this
> >version of MARC::Record *did* install under perl 5.8.0 that last time
> >I tried.<
>
> MARC::Record 1.39_02 appears to be the latest version on CPAN that would
> work on 5.8.0. MARC::Record 2.x is incompatible with pre-5.8.2 versions of
> Perl due to Unicode-related changes. The change was announced in a
> Perl4Lib message "MARC::Record v2.0 RC1", sent Fri 5/20/2005 2:35 PM, by
> Ed Summers. [1]
>
> [1] <http://www.nntp.perl.org/group/perl.perl4lib/2005/05/msg2070.html>
>
> I hope this helps,
>
> Bryan Baldus
> bryan.baldus@quality-books.com
> eijabb@cpan.org
> http://home.inwave.com/eija

2008-07-08T12:49:48Z
RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0 by Christopher Morgan

From: Christopher Morgan

Brian,

Thanks very much. I'll try that version.

- Chris


-----Original Message-----
From: Bryan Baldus [mailto:bryan.baldus@quality-books.com]
Sent: Tuesday, July 08, 2008 2:31 PM
To: Christopher Morgan; perl4lib@perl.org
Subject: RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0

On Tuesday, July 08, 2008 12:35 PM, Christopher Morgan wrote:
>I am in the process of rebuilding my web site after a phishing site
>break-in (yikes!). The site is fine now, and secure, but for some
>reason I can't get MARC::Record-2.0.0 to install. I get an error
>message saying that perl 5.8.2 is required, but that I only have perl
>5.8.0. (And indeed I do have perl 5.8.0) But I'm pretty sure this
>version of MARC::Record *did* install under perl 5.8.0 that last time
>I tried.<

MARC::Record 1.39_02 appears to be the latest version on CPAN that would
work on 5.8.0. MARC::Record 2.x is incompatible with pre-5.8.2 versions of
Perl due to Unicode-related changes. The change was announced in a Perl4Lib
message "MARC::Record v2.0 RC1", sent Fri 5/20/2005 2:35 PM, by Ed Summers.
[1]

[1] <http://www.nntp.perl.org/group/perl.perl4lib/2005/05/msg2070.html>

I hope this helps,

Bryan Baldus
bryan.baldus@quality-books.com
eijabb@cpan.org
http://home.inwave.com/eija

2008-07-08T12:20:00Z
RE: Problem installing MARC::Record 2.0.0 under perl 5.8.0 by Bryan Baldus

From: Bryan Baldus

On Tuesday, July 08, 2008 12:35 PM, Christopher Morgan wrote:
>I am in the process of rebuilding my web site after a phishing site break-in (yikes!). The site is fine now, and secure, but for some reason I can't get MARC::Record-2.0.0 to install. I get an error message saying that perl 5.8.2 is required, but that I only have perl 5.8.0. (And indeed I do have perl 5.8.0) But I'm pretty sure this version of MARC::Record *did* install under perl 5.8.0 that last time I tried.<

MARC::Record 1.39_02 appears to be the latest version on CPAN that would work on 5.8.0. MARC::Record 2.x is incompatible with pre-5.8.2 versions of Perl due to Unicode-related changes. The change was announced in a Perl4Lib message "MARC::Record v2.0 RC1", sent Fri 5/20/2005 2:35 PM, by Ed Summers. [1]

[1] <http://www.nntp.perl.org/group/perl.perl4lib/2005/05/msg2070.html>

I hope this helps,

Bryan Baldus
bryan.baldus@quality-books.com
eijabb@cpan.org
http://home.inwave.com/eija

2008-07-08T11:35:31Z
Problem installing MARC::Record 2.0.0 under perl 5.8.0 by Christopher Morgan

From: Christopher Morgan

I am in the process of rebuilding my web site after a phishing site break-in
(yikes!). The site is fine now, and secure, but for some reason I can't get
MARC::Record-2.0.0 to install. I get an error message saying that perl 5.8.2
is required, but that I only have perl 5.8.0. (And indeed I do have perl
5.8.0.) But I'm pretty sure this version of MARC::Record *did* install under
perl 5.8.0 the last time I tried.

I cheated by changing line 2 in the Makefile.PL file to read "require
perl-5.8.0" instead of "5.8.2". It installed, but it only passed about 20%
of the tests during make test. Am I asking for trouble here? Will it work,
or should I try installing an earlier version? (If so, which earlier
version, and where should I get it?) Also, I saw a patch somewhere that you
could use if you're installing into systems that use Perl 5.00xxx or earlier
(or something to that effect).
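
The version requirement discussed here can be checked before attempting an install. The following is only a sketch: `perl_at_least` is a hypothetical helper, not part of MARC::Record or its Makefile.PL, and the 5.008002 threshold is simply the 5.8.2 minimum quoted in this thread, written in the decimal form that `$]` uses.

```perl
use strict;
use warnings;

# Hypothetical helper: does the running perl meet a minimum version?
# $] holds the interpreter version as a decimal, e.g. 5.008002 for 5.8.2.
# Stringifying it first ("$]") sidesteps floating-point surprises.
sub perl_at_least {
    my ($min) = @_;
    return "$]" >= $min ? 1 : 0;
}

my $required = 5.008002;    # 5.8.2, the minimum this thread reports for MARC::Record 2.x

if ( perl_at_least($required) ) {
    print "perl $] meets the $required minimum\n";
}
else {
    print "perl $] is older than $required; editing Makefile.PL only hides the check\n";
}
```

Lowering the number in Makefile.PL silences the check without changing what the interpreter supports, which fits the failing tests described above.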

Any thoughts from anyone on this?

Many thanks!

- Chris

2008-07-08T10:37:42Z
Cleaning MARC file by Emmanuel Di Pretoro

From: Emmanuel Di Pretoro

Hi,

Has anybody here already worked on cleaning a MARC file? By cleaning I mean:
- merging multiple records into one single record;
- or keeping one record and deleting the others.

Can you describe your methodology, as well as the algorithms you used?

Thanks in advance.

Regards,

Emmanuel Di Pretoro

2008-06-26T02:04:10Z
The Berman Catalog by md

From: md

I have the raw data files of the former Hennepin County Library
catalog and authority files.

This is the innovative, unique catalog created
by Sandy Berman, 1970s-2002.

I would like to import the data into a MySQL database. I assume
this can be done with Perl, but don't know if an existing parser
would work or if a custom program would be needed.

I have no programming skills. There must be someone here
who knows and values Berman's work and is ready,
willing and able to devote their knowledge
and skills to making it accessible once again.

Please contact me with questions on or off list.

Thank You!

Madeline Douglass
mdougla@pclink.com

http://www.sanfordberman.org


2008-06-22T17:56:19Z
RE: Practicality of using DB_File on a Perl-based book site? by Christopher Morgan

From: Christopher Morgan

Harrison,

That's useful information. Yes, I'll only be doing lookups, which simplifies
things quite a bit. Given what you said about the Movable Type software, I
assume DB_File would be a good way to keep track of website user names,
passwords, cookies, and the like?

- Chris

-----Original Message-----
From: vagrantscholar@gmail.com [mailto:vagrantscholar@gmail.com] On Behalf
Of Harrison Dekker
Sent: Friday, June 20, 2008 2:19 PM
To: Christopher Morgan
Subject: Re: Practicality of using DB_File on a Perl-based book site?

Chris,

I'm no expert, but it seems to me, that there should be less overhead using
Berkeley DB compared to a relational DB, assuming that all you're doing is
lookups. If you've got a bunch of post processing going on involving
multiple large retrieval sets then you'll probably lose that edge, but
that's only because your perl code would be doing the work that a more
optimized SQL engine could be doing. SQL doesn't give you any improvement,
however, when all you're doing is a key/value type lookup.

Movable Type blog software uses BDB, at least it did in the past, and as far
as I know it's quite reliable/scalable. I use BDB for one web servicey type
application and I do have to throttle my requests if I'm sending them in
batch, but the db isn't the bottleneck, it's apache or the php xml functions
I use.

-Harrison

On Fri, Jun 20, 2008 at 10:42 AM, Christopher Morgan <morgan@acm.org> wrote:
>
> I'm designing a web site that will display MARC authority files
> onscreen. I use a Perl hash that's tied to a (read-only) Berkeley
> DB_file, and it works nicely. How practical is this approach if
> there's going to be moderate traffic on a site?
>
> My DB_FILE is about 200MB, but of course Perl brings only small pieces
> of the database into memory at any one time. Would the site bog down
> if people were accessing records at the rate of, say, every few
> seconds? Should I consider mySQL instead? I'd prefer to stick to
> DB_FILE, since it's so easy and elegant -- and I can easily create complex
> data structures.
>
> What if one of my data files was significantly bigger (say, a GB or
> two of MARC book records)? I don't have a feel for the pros and cons
> of the various approaches to accessing large databases using Perl, but
> tied hashes are pretty fast! In any case, I know I'll have to lock the
> file during each read, via "flock" or the like. I haven't tried
> implementing the latter yet.
>
> Does anyone have any ideas about this? Are there other Perl forums I
> should investigate regarding this topic?
>
> Many thanks!
>
> - Chris Morgan
>
>



--
Harrison Dekker -- Coordinator of Data Services -- UC Berkeley Libraries
510-642-8095 :: GTalk:vagrantscholar :: AIM:hdekker :: Meebo:ucbdekker
http://sunsite.berkeley.edu/wikis/datalab/
------------------------
Q: Why is this email 5 sentences or less?
A: http://five.sentenc.es

2008-06-20T11:40:08Z
Practicality of using DB_File on a Perl-based book site? by Christopher Morgan

From: Christopher Morgan
I'm designing a web site that will display MARC authority files onscreen. I
use a Perl hash that's tied to a (read-only) Berkeley DB file via DB_File,
and it works nicely. How practical is this approach if there's going to be
moderate traffic on a site?

My DB_File database is about 200MB, but of course Perl brings only small
pieces of the database into memory at any one time. Would the site bog down
if people were accessing records at the rate of, say, every few seconds?
Should I consider MySQL instead? I'd prefer to stick to DB_File, since it's
so easy and elegant -- and I can easily create complex data structures.

What if one of my data files was significantly bigger (say, a GB or two of
MARC book records)? I don't have a feel for the pros and cons of the various
approaches to accessing large databases using Perl, but tied hashes are
pretty fast! In any case, I know I'll have to lock the file during each
read, via "flock" or the like. I haven't tried implementing the latter yet.
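
For readers who have not used this setup, a read-only tied hash with a shared lock might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the file names, the sample key and value, the `lookup` helper, and the choice to lock a separate companion file rather than the database file itself. It builds a tiny throwaway database first so that it can run on its own.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempdir);

# Throwaway locations for the sketch; a real site would point at its own files.
my $dir       = tempdir( CLEANUP => 1 );
my $db_path   = "$dir/authority.db";    # hypothetical database file
my $lock_path = "$db_path.lock";        # hypothetical companion lock file

# Build a tiny sample database that stands in for the real 200MB file.
{
    my %build;
    tie %build, 'DB_File', $db_path, O_RDWR | O_CREAT, 0644, $DB_HASH
        or die "cannot create $db_path: $!";
    $build{'sample-key'} = 'sample authority record';
    untie %build;
}

# Look up one key while holding a shared lock on the companion file.
sub lookup {
    my ($key) = @_;

    open my $lock, '>>', $lock_path or die "cannot open $lock_path: $!";
    flock( $lock, LOCK_SH ) or die "cannot lock $lock_path: $!";

    my %db;
    tie %db, 'DB_File', $db_path, O_RDONLY, 0644, $DB_HASH
        or die "cannot open $db_path: $!";
    my $value = $db{$key};    # undef when the key is absent
    untie %db;

    close $lock;              # closing the handle releases the lock
    return $value;
}

print lookup('sample-key'), "\n";
```

A shared lock lets any number of readers proceed at once; only a process that rebuilds the file would need to take LOCK_EX on the same lock file.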

Does anyone have any ideas about this? Are there other Perl forums I should
investigate regarding this topic?

Many thanks!

- Chris Morgan

2008-06-20T10:43:19Z
Re: Can't parse MARC Authority XML files with mx: prefixes in their tags by Mike Rylander

From: Mike Rylander

On Wed, Jun 18, 2008 at 1:12 PM, Christopher Morgan <morgan@acm.org> wrote:
> Mike,
>
> I tried both of your suggested fixes (changing Name to LocalName, and
> running the updated patch), but no luck. I still get no error messages in
> the error log, but the program silently fails to print a report. If I
> manually remove the "mx:" namespace strings from all the tags, I can process
> the files with no problem. (So one quick fix would be to simply run these
> records through a quick search and replace routine.)
>
> Regarding the problem name authority files. They're all available on the web
> from OCLC's experimental name authority service, at
> http://alcme.oclc.org/eprintsUK/index.html
>
> You enter an author name (I entered "Robert Benchley"). Then I clicked on
> the first link at http://errol.oclc.org/laf/n50-7168.html
>
> Finally, I clicked on the second link ("XML Record") to get this link:
> http://errol.oclc.org/laf/n50-7168.MarcXML All of these have the "mx:"
> namespace notation in their tags.

Thanks. I will see if I can fix this on my installation, but since
the LocalName (only) change did not work for you I have suspicions
about the particular XML parser that's being chosen for the SAX part
on your system. The pure-perl parser (in some versions) did not
support namespaces well, and expat can be quirky as well.

I'll let you know what I find, and thanks for testing.

--
Mike Rylander
| VP, Research and Design
| Equinox Software, Inc. / The Evergreen Experts
| phone: 1-877-OPEN-ILS (673-6457)
| email: miker@esilibrary.com
| web: http://www.esilibrary.com

2008-06-18T11:56:02Z
RE: Can't parse MARC Authority XML files with mx: prefixes in their tags by Christopher Morgan

From: Christopher Morgan

Mike,

I tried both of your suggested fixes (changing Name to LocalName, and
running the updated patch), but no luck. I still get no error messages in
the error log, but the program silently fails to print a report. If I
manually remove the "mx:" namespace strings from all the tags, I can process
the files with no problem. (So one quick fix would be to simply run these
records through a quick search and replace routine.)
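
The search-and-replace workaround described above could be sketched roughly as follows. This is only a text-level illustration: the function name strip_prefix is hypothetical, it touches element names only (the xmlns:mx declaration is left in place), and it assumes the prefix never appears inside text content in the form "<mx:".

```perl
use strict;
use warnings;

# Remove one namespace prefix (e.g. "mx") from opening and closing tags.
# strip_prefix is a hypothetical helper, not part of MARC::File::XML.
sub strip_prefix {
    my ($xml, $prefix) = @_;
    # <mx:leader> becomes <leader>, </mx:leader> becomes </leader>
    $xml =~ s{<(/?)\Q$prefix\E:}{<$1}g;
    return $xml;
}

print strip_prefix('<mx:leader>00000cz</mx:leader>', 'mx'), "\n";
```

This sidesteps the namespace handling in the parser rather than fixing it, so it is best treated as a stopgap.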

Regarding the problem name authority files. They're all available on the web
from OCLC's experimental name authority service, at
http://alcme.oclc.org/eprintsUK/index.html

You enter an author name (I entered "Robert Benchley"). Then I clicked on
the first link at http://errol.oclc.org/laf/n50-7168.html

Finally, I clicked on the second link ("XML Record") to get this link:
http://errol.oclc.org/laf/n50-7168.MarcXML All of these have the "mx:"
namespace notation in their tags.

- Chris

2008-06-18T10:13:03Z
Re: Can't parse MARC Authority XML files with mx: prefixes in their tags by Mike Rylander

From: Mike Rylander

On Tue, Jun 10, 2008 at 1:18 PM, Christopher Morgan <morgan@acm.org> wrote:
> Mike,
>
> Sorry. Since my last post, I did find out how to use the UNIX patch command,
> and applied your patch to SAX.pm. My script still doesn't work, and there
> are no error messages. My earlier script (which worked on the subject
> authority file) now does not work, so I'm wondering if something in the
> patch may be causing this. I have a backup of the SAX.pm file in any case.
>

Well, it turns out I left something out of the patch I sent before.
In the end_element sub, the second line should be

my $name = $element->{ LocalName };

instead of

my $name = $element->{ Name };

If you would, you can just edit the installed version of the patched
SAX.pm to test.
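
To illustrate the difference between Name and LocalName, here is a small self-contained sketch. It is not the patch itself: the hashes below only simulate the element nodes that a namespace-aware Perl SAX2 parser passes to start_element and end_element, and marc_element_name is a hypothetical helper. Note that a strict NamespaceURI test like this one might also reject records declared under a different URI, such as the xmlns="http://www.loc.gov/MARC21" seen in the LOC sample in this thread.

```perl
use strict;
use warnings;

use constant MARC_NS => 'http://www.loc.gov/MARC21/slim';

# Return the local element name for MARCXML elements, or nothing for
# elements in some other namespace. Elements with no namespace pass.
sub marc_element_name {
    my ($element) = @_;
    my $ns = $element->{NamespaceURI};
    return if defined $ns && length $ns && $ns ne MARC_NS;
    return $element->{LocalName};
}

# Simulated SAX2 element nodes (assumed shape: Name, LocalName,
# Prefix, NamespaceURI).
my @elements = (
    { Name => 'mx:leader', LocalName => 'leader', Prefix => 'mx', NamespaceURI => MARC_NS },
    { Name => 'leader',    LocalName => 'leader', Prefix => '',   NamespaceURI => MARC_NS },
);

for my $el (@elements) {
    my $name = marc_element_name($el);
    print "$el->{Name} is handled as $name\n";
}
```

Comparing against Name would treat "mx:leader" and "leader" as different elements, while LocalName is the same for both.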

The next thing to try would be to remove the namespace test, but leave
the LocalName changes in place. Anecdotal evidence suggests that some
of the more popular XML parsing engines, or at least the Perl bindings
for them, have problems with namespaces. I've attached a (complete,
arg!) patch that implements just the LocalName changes and would be
applied to the original version of SAX.pm.

If you don't have time to test all this, that's fine, but if not, would
you be willing to send a couple of your problem records?

Thanks Christopher,

--
Mike Rylander
| VP, Research and Design
| Equinox Software, Inc. / The Evergreen Experts
| phone: 1-877-OPEN-ILS (673-6457)
| email: miker@esilibrary.com
| web: http://www.esilibrary.com

2008-06-12T06:54:22Z
RE: Can't parse MARC Authority XML files with mx: prefixes in their tags by Christopher Morgan

From: Christopher Morgan

Mike,

Many thanks. My apologies, but I've never applied a Perl patch before, so
I'm not sure of the correct procedure. I did locate the SAX.pm file.

- Chris

-----Original Message-----
From: Mike Rylander [mailto:mrylander@gmail.com]
Sent: Tuesday, June 10, 2008 11:57 AM
To: Christopher Morgan
Cc: jtgorman@uiuc.edu; perl4lib@perl.org
Subject: Re: Can't parse MARC Authority XML files with mx: prefixes in their
tags

On Mon, Jun 9, 2008 at 5:39 PM, Christopher Morgan <morgan@acm.org> wrote:
> Jonathan,
>
> Many thanks. I get no errors on the command line or in the error log
> when I run the script. The file just executes with no output. If you
> have the time to run it, I've included the script below, and have
> attached the name authority record it tries to process:

The problem is that the SAX parser is looking for the element Name instead
of LocalName. I've attached a patch that tests both LocalName and
NamespaceURI. If you could apply this to your version of MARC/File/SAX.pm
and give it a test, and it works for you, I'll commit it to the CVS repo.

--miker

2008-06-10T10:51:40Z
RE: Can't parse MARC Authority XML files with mx: prefixes in their tags by Christopher Morgan

From: Christopher Morgan

Mike,

Sorry. Since my last post, I did find out how to use the UNIX patch command,
and applied your patch to SAX.pm. My script still doesn't work, and there
are no error messages. My earlier script (which worked on the subject
authority file) now does not work, so I'm wondering if something in the
patch may be causing this. I have a backup of the SAX.pm file in any case.

- Chris


2008-06-10T10:20:17Z
Re: Can't parse MARC Authority XML files with mx: prefixes in their tags by Mike Rylander

From: Mike Rylander

On Mon, Jun 9, 2008 at 5:39 PM, Christopher Morgan <morgan@acm.org> wrote:
> Jonathan,
>
> Many thanks. I get no errors on the command line or in the error log when I
> run the script. The file just executes with no output. If you have the time
> to run it, I've included the script below, and have attached the name
> authority record it tries to process:

The problem is that the SAX parser is looking for the element Name
instead of LocalName. I've attached a patch that tests both LocalName
and NamespaceURI. If you could apply this to your version of
MARC/File/SAX.pm and give it a test, and it works for you, I'll commit
it to the CVS repo.

--miker

>
> #! /usr/bin/perl
> use strict;
>
> use MARC::Record;
> use MARC::Batch;
> use MARC::File::XML;
> use constant MAX => 20;
>
> MARC::File::XML->default_record_format('UNIMARCAUTH');
> my $batch = MARC::Batch->new( 'XML', 'name_authority_file');
> while (my $record = $batch->next()) {
> for my $field ($record->field("100")){
> my $name= $field->subfield('a');
> print "$name", "\n";
> }
> }
>
> I think you're right about the LOC files -- they probably got the extra
> spaces by accident. That's easy enough to fix.
>
> As far as the name authorities go, if I can't get MARC::File::XML to process
> them, I can always use XML::TokeParser. Not as elegant, but it would get the
> job done.
>
> - Chris
>
> -----Original Message-----
> From: Jonathan Gorman [mailto:jtgorman@uiuc.edu]
> Sent: Monday, June 09, 2008 4:43 PM
> To: Christopher Morgan; perl4lib@perl.org
> Subject: Re: Can't parse MARC Authority XML files with mx: prefixes in their
> tags
>
>
>
>>However, I'm having trouble parsing the name authority records online
>>at http://alcme.oclc.org/eprintsUK/index.html
>
> [snipped code examples]
>>
>>There are "mx:" prefixes in all the tags. What format is this? Is there
>>any way I can get MARC::File::XML to parse these files?
>
> The prefixes are the namespace. The parser should be able to handle this,
> but I don't honestly know if it does it correctly. What also might be the
> problem is the second namespace in there. It might help us if you included
> some information about what is not working (what error are you getting etc).
> I don't have the time right now to run my own test, but actual error
> messages might provide some clue.
>
>>A related question: When I first tried to process the subject authority
>>files from the LOC (in my first example, above), the program complained
>>that the "Leader must be 24 bytes long".
>
> Right, that comes from the MARC specification, there are 24 bytes.
>
>>XML files are five years old. I wonder if the XML spec has changed
>>since
>>then?)
>
> Doubt it, again it doesn't have anything really to do with the XML spec but
> the underlying xml record. More likely it is some error in creating the
> files. Can't give any more info though, sorry.
>
> Jon Gorman
>



--
Mike Rylander
| VP, Research and Design
| Equinox Software, Inc. / The Evergreen Experts
| phone: 1-877-OPEN-ILS (673-6457)
| email: miker@esilibrary.com
| web: http://www.esilibrary.com

2008-06-10T08:57:28Z
RE: Can't parse MARC Authority XML files with mx: prefixes in their tags by Christopher Morgan

From: Christopher Morgan

Jonathan,

Many thanks. I get no errors on the command line or in the error log when I
run the script. The file just executes with no output. If you have the time
to run it, I've included the script below, and have attached the name
authority record it tries to process:

#! /usr/bin/perl
use strict;

use MARC::Record;
use MARC::Batch;
use MARC::File::XML;
use constant MAX => 20;

MARC::File::XML->default_record_format('UNIMARCAUTH');
my $batch = MARC::Batch->new( 'XML', 'name_authority_file');
while (my $record = $batch->next()) {
    for my $field ($record->field("100")){
        my $name= $field->subfield('a');
        print "$name", "\n";
    }
}

I think you're right about the LOC files -- they probably got the extra
spaces by accident. That's easy enough to fix.

As far as the name authorities go, if I can't get MARC::File::XML to process
them, I can always use XML::TokeParser. Not as elegant, but it would get the
job done.

- Chris

-----Original Message-----
From: Jonathan Gorman [mailto:jtgorman@uiuc.edu]
Sent: Monday, June 09, 2008 4:43 PM
To: Christopher Morgan; perl4lib@perl.org
Subject: Re: Can't parse MARC Authority XML files with mx: prefixes in their
tags



>However, I'm having trouble parsing the name authority records online
>at http://alcme.oclc.org/eprintsUK/index.html

[snipped code examples]
>
>There are "mx:" prefixes in all the tags. What format is this? Is there
>any way I can get MARC::File::XML to parse these files?

The prefixes are the namespace. The parser should be able to handle this,
but I don't honestly know if it does it correctly. What also might be the
problem is the second namespace in there. It might help us if you included
some information about what is not working (what error are you getting etc).
I don't have the time right now to run my own test, but actual error
messages might provide some clue.

>A related question: When I first tried to process the subject authority
>files from the LOC (in my first example, above), the program complained
>that the "Leader must be 24 bytes long".

Right, that comes from the MARC specification, there are 24 bytes.

>XML files are five years old. I wonder if the XML spec has changed
>since
>then?)

Doubt it, again it doesn't have anything really to do with the XML spec but
the underlying xml record. More likely it is some error in creating the
files. Can't give any more info though, sorry.

Jon Gorman

2008-06-09T14:39:34Z
Re: Can't parse MARC Authority XML files with mx: prefixes in their tags by Jonathan Gorman

From: Jonathan Gorman

>However, I'm having trouble parsing the name authority records online at
>http://alcme.oclc.org/eprintsUK/index.html

[snipped code examples]
>
>There are "mx:" prefixes in all the tags. What format is this? Is there any
>way I can get MARC::File::XML to parse these files?

The prefixes are the namespace. The parser should be able to handle this, but I don't honestly know if it does it correctly. What also might be the problem is the second namespace in there. It might help us if you included some information about what is not working (what error are you getting etc). I don't have the time right now to run my own test, but actual error messages might provide some clue.

>A related question: When I first tried to process the subject authority
>files from the LOC (in my first example, above), the program complained that
>the "Leader must be 24 bytes long".

Right, that comes from the MARC specification, there are 24 bytes.

>XML files are five years old. I wonder if the XML spec has changed since
>then?)

Doubt it, again it doesn't have anything really to do with the XML spec but the underlying xml record. More likely it is some error in creating the files. Can't give any more info though, sorry.

Jon Gorman

2008-06-09T13:43:35Z
Can't parse MARC Authority XML files with mx: prefixes in their tags by Christopher Morgan

From: Christopher Morgan

I have been successfully using MARC::File::XML to process MARC subject
authority files from the LOC, such as this sample record:

<?xml version="1.0" encoding="UTF-8" ?>
<collection xmlns="http://www.loc.gov/MARC21"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.loc.gov/MARC21
http://www.loc.gov/standards/marcxml/schema/MARC21.xsd">

<record type="Bibliographic">
  <leader>00495cz 2200169n 4500</leader>
  <controlfield tag="001">sh 00000014 </controlfield>
  <controlfield tag="003">DLC </controlfield>
  <controlfield tag="005">20000508151507.0 </controlfield>
  <controlfield tag="008">000321i| anannbabn |a ana </controlfield>
  <datafield tag="010" ind1="" ind2="">
    <subfield code="a">sh 00000014 </subfield>
  </datafield>
  <datafield tag="040" ind1="" ind2="">
    <subfield code="a">DLC</subfield>
    <subfield code="b">eng</subfield>
    <subfield code="c">DLC </subfield>
  </datafield>
  <datafield tag="150" ind1="" ind2="">
    <subfield code="a">Tacos </subfield>
  </datafield>
</record>

The following script prints subfield "a" of tag 150:

MARC::File::XML->default_record_format('UNIMARCAUTH');
my $batch = MARC::Batch->new( 'XML', '../filename');
while (my $record = $batch->next()) {
    for my $field ($record->field("150")){
        my $name= $field->subfield('a');
        print "$name", "\n";
    }
}



However, I'm having trouble parsing the name authority records online at
http://alcme.oclc.org/eprintsUK/index.html

Here is part of one of these records (from
http://errol.oclc.org/laf/n50-7168.MarcXML):

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<mx:record xmlns:mx="http://www.loc.gov/MARC21/slim"
xmlns="http://www.w3.org/TR/xhtml1/strict"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<mx:leader>00000cz 2200000n 0000</mx:leader>
<mx:controlfield tag="001">oca00042708</mx:controlfield>
. . . . .
. . . . .
etc.

There are "mx:" prefixes in all the tags. What format is this? Is there any
way I can get MARC::File::XML to parse these files?



A related question: When I first tried to process the subject authority
files from the LOC (in my first example, above), the program complained that
the "Leader must be 24 bytes long". All the leader tags in the authority
files I got from the LOC have five trailing blank spaces at the end. I
manually removed the spaces to get the test files to work. I can always
preprocess the files to take out the trailing spaces, but I wonder if
there's a way around this with MARC::File::XML. (These LOC subject authority
XML files are five years old. I wonder if the XML spec has changed since
then?)
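
The preprocessing step mentioned above (taking out the trailing spaces) could be sketched like this. It is only an illustration under stated assumptions: trim_leaders is a hypothetical helper, it works on the XML as plain text, and it only shortens a leader when everything past position 24 is whitespace.

```perl
use strict;
use warnings;

# Cut each <leader> back to 24 characters when the excess is whitespace;
# leaders of any other shape are left untouched.
# trim_leaders is a hypothetical helper, not part of MARC::File::XML.
sub trim_leaders {
    my ($xml) = @_;
    $xml =~ s{(<leader>)(.*?)(</leader>)}{
        my ($open, $ldr, $close) = ($1, $2, $3);
        $ldr = substr($ldr, 0, 24)
            if length($ldr) > 24 && substr($ldr, 24) =~ /\A\s+\z/;
        $open . $ldr . $close;
    }gse;
    return $xml;
}
```

Running the file contents through trim_leaders before handing them to the parser avoids editing the leaders by hand.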



Many thanks for any help!



- Chris Morgan


2008-06-09T13:33:18Z
Job Posting: Integrated Library Systems Librarian VCU Libraries by Jimmy Ghaphery

From: Jimmy Ghaphery

Greetings,

Apologies in advance for duplicate cross-postings.

POSITION: Integrated Library Systems Librarian
REPORTS TO: Head, Library Information Systems

SUMMARY: The VCU Libraries invites applications and nominations for the
position of Integrated Library Systems Librarian responsible for
providing exemplary discovery of our collections and increasing
efficiencies for all aspects of technical and public services
processing. In addition to being the lead agent for the integrated
library system (Aleph 500), this position will manage related systems
for statistics (ARC) and electronic resource management (Verde). The
position offers maximum opportunity for professional growth and impact
for a talented and creative individual. The successful candidate will
join a culturally and academically diverse faculty of the highest caliber.

RESPONSIBILITIES: Manages major enterprise library applications,
including Aleph 500, ARC, and Verde. Ensures availability of the
Integrated Library System for two locations (VCU Richmond and VCU Qatar)
and takes a lead role investigating next generation interfaces. Liaisons
with the University Computer Center’s Database and Systems
Administration Team. Works with other units of the library on system
development and troubleshooting. Plans ongoing customization,
maintenance, and periodic upgrades for Aleph 500. Continues development
of ARC and serves as a lead in fielding Verde. Anticipates the need for
design and programming enhancements to fulfill the information and
research needs of the University community. Meets with Library
Information Systems Managers on a regular basis to set priorities for
the department. The Integrated Library Systems Librarian is expected to
be active professionally and to contribute to developments in the field.
Faculty with the VCU Libraries are evaluated, and promoted, on the basis
of job performance, scholarship, and professional development and service.

QUALIFICATIONS:
Required: ALA-accredited graduate degree or accredited graduate degree
in Information Systems or other related disciplines.
Preferred: Experience implementing and/or supporting enterprise
applications. Working knowledge of software applications needed to
manage the Aleph environment, including UNIX command structure and
system administration utilities, Apache, Perl, Oracle SQL, SFTP, XML,
etc. Experience in a research library with knowledge of public and
technical services operations as well as library standards (MARC,
Z39.50, OAI-PMH). Proven ability to manage multiple projects and
assignments concurrently and effectively. Strong analytical,
troubleshooting, and problem solving skills. Excellent oral and written
communication skills and the ability to interact professionally with a
diverse group of clients and staff. Ability to work successfully with
external vendor support and documentation. Availability and willingness
to work a flexible schedule, including some nights, weekends, and
holidays. Experience in training, customer support, and/or writing
technical documentation. A passion and talent for creating easily
understood systems that offer transparent yet complex functionality.
Experience working in a culturally diverse environment highly preferred.

SALARY: Commensurate with qualifications, but not less than $48,000
annually. This is a full-time, non-tenured faculty position. Normal
faculty benefits apply, including 24 vacation days annually and choice
of retirement and annuity plans. For more information about benefits,
see http://www.hr.vcu.edu/benefits/.

A complete job posting is available at
http://www.library.vcu.edu/admin/jobs/

For more information about the VCU Libraries, please visit our home page
at http://www.library.vcu.edu/. Review of applications will begin July
15, 2008, and will continue until the position is filled. Submit cover
letter, resume, and the names, addresses, and telephone numbers of three
references to:
Kathleen McColgan
Personnel Administrator
VCU Libraries, Virginia Commonwealth University
901 Park Avenue
PO Box 842033
Richmond, VA 23284-2033
804-828-2730
804-828-0151 (fax)
mccolgankh@vcu.edu

Virginia Commonwealth University is an Equal Opportunity/Affirmative
Action employer. Women, minorities, and persons with disabilities are
encouraged to apply.

--
Jimmy Ghaphery
Head, Library Information Systems
VCU Libraries
http://www.library.vcu.edu
--

2008-06-09T06:51:12Z
Re: Ignoring deleted records in Biblio::ISIS by Saiful Amin

From: Saiful Amin

#!/usr/bin/perl
#
# Name: ccf2marc.pl
# Author: Saiful Amin <saiful@edutech.com>
# Date: May 2008
# Version: 0.4
# Description: Takes the CDS/ISIS as input and gives valid MARC21 data as output.
#

use strict;
use warnings;
#use diagnostics;
use Biblio::Isis;
use MARC::Record;

# Usage Instructions
die "\nUSAGE: $0 Output_file\n" unless defined $ARGV[0];

# Open the ISIS Database
my $isis = new Biblio::Isis (
isisdb => 'C:/WINISIS/DATA/sample/',
#include_deleted => 0,
debug => 0
);

open (OUTFILE, '>', $ARGV[0]) or die "Cannot open $ARGV[0] for writing: $!\n";

my $num = 0;

###################################################

for (my $mfn = 1; $mfn <= $isis->count; $mfn++) {
my $marc = MARC::Record->new();
my $record = $isis->to_hash($mfn);

my $first_author_a = $$record{300}[0]{a};
my $first_author_b = $$record{300}[0]{b};
my $first_author_e = $$record{300}[0]{f};

my $corp_author_a = $$record{310}[0]{a};
my $corp_author_b = $$record{310}[0]{b};
my $corp_author_c = $$record{310}[0]{d};
my $corp_author_cc = $$record{310}[0]{e};

my $field_245a = $$record{200}[0]{a};
my $field_245c = $$record{200}[0]{b};

my $conf_author_a = $$record{320}[0]{a};
my $conf_author_cc = $$record{320}[0]{e};
my $conf_author_c = $$record{320}[0]{g};
my $conf_author_d = $$record{320}[0]{h};
my $conf_author_n = $$record{320}[0]{j};

$num++;

###################################################
# Create the Leader and map with relevant codes
###################################################

# Prepare the Leader
my $leader = '00054nam#a22002891a 4500';
$marc->leader($leader);

###################################################
# Create the fixed field tags (007/008)
###################################################

my $data_008 = '080528| r| ||111|eng||||';
my $tag_008 = MARC::Field->new('008', $data_008);
$marc->append_fields($tag_008);

#######################################################
# Create the first author (Main Entry/Added Entry)
#######################################################

# First author in tag_100
if ($first_author_a) {
my $first_author = "$first_author_a";
$first_author .= ", $first_author_b" if $first_author_b;
my $main_author = '';

if ($corp_author_a || $conf_author_a) {
$main_author = MARC::Field->new('700', 1,'', 'a' => '');
} else {
$main_author = MARC::Field->new('100', 1,'', 'a' => '');
}
$main_author->update('a' => $first_author);
$main_author->update('e' => $first_author_e) if $first_author_e;
$marc->append_fields($main_author);
}

#######################
## The Title Section ##
#######################

my $title = $field_245a;
my $state_of_resp = '';

if ($field_245c) {
$state_of_resp = $field_245c;
}

#print "$mfn\n" if !defined $title;
$title = "No title found" if !defined $title;

# Create Title field
my $tag_245 = MARC::Field->new('245',1,0,
'a' => "$title"
);
$tag_245->update('c' => $state_of_resp) if $state_of_resp;
$marc->append_fields($tag_245);

###################################################
# Write output to OUTFILE
###################################################

print OUTFILE $marc->as_usmarc();
print STDOUT "Printed record number $mfn\n";
}
close (OUTFILE);
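
One possible explanation for deleted records still showing up in the output of a script like the one above, offered as a guess rather than a diagnosis: if the record lookup returns undef for a skipped MFN (an assumption about Biblio::Isis, not verified here), then an expression such as $$record{300}[0]{a} does not fail. Perl autovivifies the structure, every field comes back undefined, and the loop still writes a MARC record with the "No title found" placeholder. The sketch below uses a stand-in function, fake_to_hash, instead of the real module so that it runs on its own.

```perl
use strict;
use warnings;

# Stand-in data: MFN 18 and 19 are treated as logically deleted and are
# simply absent, so the lookup returns undef for them (ASSUMPTION).
my %db = (
    17 => { 200 => [ { a => 'Title 17' } ] },
    20 => { 200 => [ { a => 'Title 20' } ] },
);

# Hypothetical replacement for $isis->to_hash($mfn).
sub fake_to_hash {
    my ($mfn) = @_;
    return $db{$mfn};
}

my @converted;
for my $mfn (17 .. 20) {
    my $record = fake_to_hash($mfn);
    next unless defined $record;    # skip instead of autovivifying
    push @converted, $mfn;
    print "MFN $mfn: $record->{200}[0]{a}\n";
}
```

In the real script the equivalent guard would sit directly after the to_hash call, before any field is read.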

2008-05-28T10:43:35Z
Re: Ignoring deleted records in Biblio::ISIS by Dobrica Pavlinusic

From: Dobrica Pavlinusic

On Tue, May 27, 2008 at 01:03:22PM +0530, Saiful Amin wrote:
> Hi Dobrica,
>
> Thanks for quick reply.
>
> > Could you please send me small sample of CDS/ISIS deleted records to take a
> > look?
>
>
> You can download the sample database of 50 records, in which MFN 18 and 19
> are logically deleted, from the following link:
> http://122.166.0.252/sample.zip

I can't reproduce your problem. When I try to dump your records using
dump_isisdb.pl included in the Biblio::Isis distribution (with options to start
at record 17, and dump just 4 records) I get:

$ ./scripts/dump_isisdb.pl -o 17 -l 4 data/sample/BOOKS. | grep ^0
0 17
0 20

which means that by default it dumped just records 17 and 20, skipping 18
and 19. If I add option -v, which turns include_deleted on, I get:

/Biblio-Isis$ ./scripts/dump_isisdb.pl -o 17 -l 4 -v data/sample/BOOKS. | grep ^0
0 17
0 18
0 19
0 20

as I would expect. Adding -d also shows that Biblio::Isis correctly finds
that MFN 18 and 19 are logically deleted.

I would love to help you with this, but I'm puzzled.

--
Dobrica Pavlinusic 2share!2flame dpavlin@rot13.org
Unix addict. Internet consultant. http://www.rot13.org/~dpavlin

2008-05-28T08:02:47Z
Re: Ignoring deleted records in Biblio::ISIS by Saiful Amin

From: Saiful Amin

Hi Dobrica,

Thanks for quick reply.

> Could you please send me small sample of CDS/ISIS deleted records to take a
> look?


You can download the sample database of 50 records, in which MFN 18 and 19
are logically deleted, from the following link:
http://122.166.0.252/sample.zip

Currently, I'm using the clumsy methods to purge these records (as suggested
by a CDS/ISIS user): export the records, re-initialize the database, and
import the records back. It would be nice if we can just ignore the
logically deleted records.

Thanks again.

Regards,
Saiful

2008-05-27T00:33:29Z
Re: Ignoring deleted records in Biblio::ISIS by Dobrica Pavlinusic

From: Dobrica Pavlinusic

On Tue, May 27, 2008 at 11:10:40AM +0530, Saiful Amin wrote:
> Hi,
>
> I'm doing a crosswalk of CCF records (stored in CDS/ISIS) into MARC21 to
> import them into a modern ILS. I'm using Biblio::ISIS and MARC::Record for
> this purpose.
>
> If I understand correctly, CDS/ISIS only logically deletes a record and
> doesn't delete it permanently. Biblio::ISIS is not ignoring those logically
> deleted records. I've tried setting the 'include_deleted' ("Don't skip
> logically deleted records in ISIS") to 0, but it doesn't work.
>
> Any ideas?

Could you please send me small sample of CDS/ISIS deleted records to take a
look?

> I want to take this opportunity to thank authors of both the modules
> (Dobrica Pavlinusic and Andy Lester) for writing such amazing modules. I've
> been using them with great results for few years now.

You are welcome. While we are at it, I must say that you are one of the few
users of Biblio::Isis that I know of :-)

--
Dobrica Pavlinusic 2share!2flame dpavlin@rot13.org
Unix addict. Internet consultant. http://www.rot13.org/~dpavlin

2008-05-26T23:51:49Z
Ignoring deleted records in Biblio::ISIS by Saiful Amin

From: Saiful Amin

Hi,

I'm doing a crosswalk of CCF records (stored in CDS/ISIS) into MARC21 to
import them into a modern ILS. I'm using Biblio::ISIS and MARC::Record for
this purpose.

If I understand correctly, CDS/ISIS only logically deletes a record and
doesn't delete it permanently. Biblio::ISIS is not ignoring those logically
deleted records. I've tried setting the 'include_deleted' ("Don't skip
logically deleted records in ISIS") to 0, but it doesn't work.

Any ideas?

I want to take this opportunity to thank authors of both the modules
(Dobrica Pavlinusic and Andy Lester) for writing such amazing modules. I've
been using them with great results for few years now.

Best regards,
Saiful

--
Saiful Amin
Project Lead

Edutech India Pvt Ltd
Bangalore, India.
+91 9343826438

2008-05-26T22:40:47Z
MARC Errorchecks and Lint Module updates by Bryan Baldus

From: Bryan Baldus

I have updated MARC::Errorchecks in CPAN, releasing version 1.14, and
have updated MARC::Lint in CVS on SourceForge. Changes for each are
listed below.

MARC::Errorchecks changes:

Version 1.14: Updated Oct. 21, 2007, Jan. 21, 2008, May 20, 2008.
Released May 25, 2008.

-Updated %ldrbytes with leader/19 per Update no. 8, Oct. 2007. Check
for validity of leader/19 not yet implemented.
-Updated _check_book_bytes with code '2' ('Offprints') for
008/24-27, per Update no. 8, Oct. 2007.
-Updated check_245ind1vs1xx($record) with TODO item and comments
-Updated check_bk008_vs_300($record) to allow "leaves of plates" (as
opposed to "leaves", when no p. or v. is present), "leaf", and
"column"(s).
-Updated test in Errorchecks.t to remove check for LCCN starting
with year greater than the current year. This was at 2008, which is
no longer later. A test may be implemented in the future that will be
less likely to break with the passage of time.


MARC::Lint changes:

- Updated _check_article with the exception 'A to '
- Updated Lint::DATA section with Update No. 8 (Oct. 2007)


############

Please let me know of any problems, suggestions, etc.

Thank you,

Bryan Baldus
bryan.baldus@quality-books.com
eijabb@cpan.org
http://home.inwave.com/eija

2008-05-25T13:26:50Z
identifying non-Latin scripts using MARC::Record by Shieh, Jackie

From: Shieh, Jackie

I was just wondering whether anyone has needed to identify non-Latin scripts in MARC utf-8 records? How that was done using MARC::Record module? Some sample scripts below in the file. Thank you.

1) Korean
100 1 _aChin, Chae-hyŏk
400 1 _aChin, Peter J.
400 1 _a진 재혁

2) Hebrew
100 1 _aBialik, Hayyim Nahman,
 _d1873-1934
400 1 _aביאליק, חיים נחמן,
 _d1873-1934

3) Japanese
110 2 _aDifensu Risāchi Sentā
410 2 _aディフェンスリサーチセンター
410 2 _aDefense Research Center
410 2 _aDRC
410 2 _aJapan Defense Research Center

4) Chinese simplified
100 1 _aShen, Congwen,
 _d1902-
400 1 _a沈从文,
 _d1902-
400 1 _wnne
 _aShen, Tsʻung-wen,
 _d1902-
400 1 _aShen, Zengwen,
 _d1902-

5) Arabic & Cyrillic

100 1 _aArnaud, F.T.M. de B. d'
 _q(François Thomas Marie de Baculard d'),
 _d1718-1805
400 1 _a<U+202A>ارند، ف.ت.م. د ب. د'
 _q(فرنس ثمس مر د بلرد د')،
 _d1718-1805
400 1 _aАрно, Ф.Т.М. де Б. д'
 _q(Франсуа Тома Мари де Бакюлар д'),


Regards,

--Jackie

|Jackie Shieh
|Data Loads & Development
|Harlan Hatcher Graduate Library
|University of Michigan
|920 North University
|Ann Arbor, MI 48109-1205
|Phone: 734.763.6070 FAX: 734.615.9788
|E-mail: JShieh [AT] umich [DOT] edu

2008-05-15T09:11:18Z
Re: [PATCH] Support III "extended" characters in MARC::Charset by Mike Rylander
PHA+RnJvbTogTWlrZSBSeWxhbmRlcgoKT24gTW9uLCBNYXkgNSwgMjAwOCBhdCAxMDowMCBQTSwgTWlrZSBSeWxhbmRlciAmbHQ7bXJ5bGFuZGVyQGdtYWlsLmNvbSZndDsgd3JvdGU6PGJyLz4mZ3Q7IE9uIFdlZCwgQXByIDE2LCAyMDA4IGF0IDQ6MjggUE0sIEdhbGVuIENoYXJsdG9uPGJyLz4mZ3Q7ICAmbHQ7Z2FsZW4uY2hhcmx0b25AbGlibGltZS5jb20mZ3Q7IHdyb3RlOjxici8+Jmd0OyAgJmd0OyAgSGksPGJyLz4mZ3Q7ICAmZ3Q7PGJyLz4mZ3Q7ICAmZ3Q7ICBUaGUgYXR0YWNoZWQgcGF0Y2ggYWRkcyBtYXBwaW5ncyBmb3IgdmFyaW91cyBub24tc3RhbmRhcmQgKHBlciBNQVJDLTgpPGJyLz4mZ3Q7ICAmZ3Q7ICBjaGFyYWN0ZXJzIHRoYXQgSUlJIE1pbGxlbm5pdW0gaGFzIGJlZW4gb2JzZXJ2ZWQgdG8gc3RpY2sgaW50byB0aGU8YnIvPiZndDsgICZndDsgIEVBQ0MgcmFuZ2UuPGJyLz4mZ3Q7PGJyLz4mZ3Q7ICBUaGFua3MgR2FsZW4uICBJIGp1c3QgaGFwcGVuZWQgdG8gcG9wIG92ZXIgaGVyZSBvbiBhIHdoaW0gYW5kIG5vdGljZWQ8YnIvPiZndDsgIHlvdXIgcGF0Y2guICBJJiMzOTtsbCBpbmNvcnBvcmF0ZSBpdCBhcyBzb29uIGFzIEkgaGF2ZSBhIHNwYXJlIG1vbWVudC48YnIvPjxici8+SSBmb3VuZCBhIG1vbWVudCBqdXN0IHNpdHRpbmcgdGhlcmUgb24gdGhlIGdyb3VuZCwgc28gSSBndWVzcyBpdCB3YXM8YnIvPnNwYXJlIC4uLiBQYXRjaCBhcHBsaWVkIHdpdGggdGhlIGNvbW11bml0eSYjMzk7cyB0aGFua3MhPGJyLz48YnIvPi0tbWlrZXI8YnIvPjxici8+Jmd0Ozxici8+Jmd0OyAgLS1taWtlcjxici8+Jmd0Ozxici8+Jmd0Ozxici8+Jmd0Ozxici8+Jmd0OyAgJmd0Ozxici8+Jmd0OyAgJmd0OyAgVGhlIGZvbGxvd2luZyBjaGFyYWN0ZXJzIGFyZSBub3cgaGFuZGxlZCAtIGxpc3RlZCBhcmUgJnF1b3Q7RUFDQyZxdW90Ozxici8+Jmd0OyAgJmd0OyAgY29kZXBvaW50LCBjaGFyYWN0ZXIgbmFtZSwgYW5kIHRoZSBVQ1MgbWFwcGluZy48YnIvPiZndDsgICZndDs8YnIvPiZndDsgICZndDsgIDIxMjAzZCAtIEhPUklaT05UQUwgRUxMSVBTSVMgLSBVKzIwMjY8YnIvPiZndDsgICZndDsgIDIxMjA0MCAtIExFRlQgRE9VQkxFIFFVT1RBVElPTiBNQVJLIC0gVSsyMDFDPGJyLz4mZ3Q7ICAmZ3Q7ICA3ZjIwMTQgLSBFTSBEQVNIIC0gVSsyMDE0PGJyLz4mZ3Q7ICAmZ3Q7ICA3ZjIwMTkgLSBSSUdIVCBTSU5HTEUgUVVPVEFUSU9OIE1BUksgLSBVKzIwMTk8YnIvPiZndDsgICZndDsgIDdmMjAyMCAtIFJJR0hUIERPVUJMRSBRVU9UQVRJT04gTUFSSyAtIFUrMjAxRDxici8+Jmd0OyAgJmd0OyAgN2YyMTIyIC0gVFJBREUgTUFSSyBTSUdOIC0gVSsyMTIyPGJyLz4mZ3Q7ICAmZ3Q7PGJyLz4mZ3Q7ICAmZ3Q7ICBJIHN1c3BlY3QgdGhlcmUgYXJlIG1vcmUgb2YgdGhlc2U7IGlmIGFueWJvZHkgaGFzIGFueSBhZGRpdGlvbmFsPGJyLz4mZ3Q7ICAmZ3Q7ICBjaGFyYWN0ZXIgbWFwcGluZ3MgdG8gc2hhcmUgb3IgY2FuIHBvaW50IG1l
IHRvIGEgY29tcGxldGUgcHVibGljIGxpc3Q8YnIvPiZndDsgICZndDsgIG9mIHRoZW0sIEkgd291bGQgZ3JlYXRseSBhcHByZWNpYXRlIGl0Ljxici8+Jmd0OyAgJmd0Ozxici8+Jmd0OyAgJmd0OyAgUmVnYXJkcyw8YnIvPiZndDsgICZndDs8YnIvPiZndDsgICZndDsgIEdhbGVuPGJyLz4mZ3Q7ICAmZ3Q7ICAtLTxici8+Jmd0OyAgJmd0OyAgR2FsZW4gQ2hhcmx0b248YnIvPiZndDsgICZndDsgIEtvaGEgQXBwbGljYXRpb24gRGV2ZWxvcGVyPGJyLz4mZ3Q7ICAmZ3Q7ICBMaWJMaW1lPGJyLz4mZ3Q7ICAmZ3Q7ICBnYWxlbi5jaGFybHRvbkBsaWJsaW1lLmNvbTxici8+Jmd0OyAgJmd0OyAgcDogMS04ODgtNTY0LTI0NTcgeDcwOTxici8+Jmd0OyAgJmd0Ozxici8+Jmd0Ozxici8+Jmd0Ozxici8+Jmd0Ozxici8+Jmd0OyAgLS08YnIvPiZndDsgIE1pa2UgUnlsYW5kZXI8YnIvPiZndDsgICB8IFZQLCBSZXNlYXJjaCBhbmQgRGVzaWduPGJyLz4mZ3Q7ICAgfCBFcXVpbm94IFNvZnR3YXJlLCBJbmMuIC8gVGhlIEV2ZXJncmVlbiBFeHBlcnRzPGJyLz4mZ3Q7ICAgfCBwaG9uZTogMS04NzctT1BFTi1JTFMgKDY3My02NDU3KTxici8+Jmd0OyAgIHwgZW1haWw6IG1pa2VyQGVzaWxpYnJhcnkuY29tPGJyLz4mZ3Q7ICAgfCB3ZWI6IGh0dHA6Ly93d3cuZXNpbGlicmFyeS5jb208YnIvPiZndDs8YnIvPjxici8+PGJyLz48YnIvPi0tIDxici8+TWlrZSBSeWxhbmRlcjxici8+IHwgVlAsIFJlc2VhcmNoIGFuZCBEZXNpZ248YnIvPiB8IEVxdWlub3ggU29mdHdhcmUsIEluYy4gLyBUaGUgRXZlcmdyZWVuIEV4cGVydHM8YnIvPiB8IHBob25lOiAxLTg3Ny1PUEVOLUlMUyAoNjczLTY0NTcpPGJyLz4gfCBlbWFpbDogbWlrZXJAZXNpbGlicmFyeS5jb208YnIvPiB8IHdlYjogaHR0cDovL3d3dy5lc2lsaWJyYXJ5LmNvbTxici8+PC9wPg== 2008-05-13T20:18:14Z Re: Stripping out Unicode combining characters (diacritics) - by Brad Baxter 
PHA+RnJvbTogQnJhZCBCYXh0ZXIKCkp1c3QgdG8gdGhyb3cgdGhpcyBvdXQgdGhlcmU6IHlvdSBtYXkgYmUgaW50ZXJlc3RlZCBpbiBUZXh0OjpVbmlkZWNvZGU8YnIvPihodHRwOi8vc2VhcmNoLmNwYW4ub3JnL35zYnVya2UvVGV4dC1VbmlkZWNvZGUtMC4wNC8pIGlmIHlvdXIgdWx0aW1hdGU8YnIvPmdvYWwgaXMgdG8gdHJ5IHRvIHJlcHJlc2VudCBhIHVuaWNvZGUgY2hhcmFjdGVyIHdpdGggaXRzIGNsb3Nlc3QgYXNjaWk8YnIvPihvciBwZXJoYXBzIEkgc2hvdWxkIHNheSwgJnF1b3Q7cm9tYW5pemVkJnF1b3Q7KSBlcXVpdmFsZW50Ljxici8+PGJyLz4tLSBCcmFkPGJyLz48YnIvPk9uIFdlZCwgTWF5IDcsIDIwMDggYXQgOTo1MSBBTSwgRG9yYW4sIE1pY2hhZWwgRCAmbHQ7ZG9yYW5AdXRhLmVkdSZndDsgd3JvdGU6PGJyLz48YnIvPiZndDsgSSByZWNlaXZlZCBhIG51bWJlciBvZiBoZWxwZnVsIHN1Z2dlc3Rpb25zIGFuZCBzb2x1dGlvbnMuICBUaGUgYXBwcm9hY2ggSTxici8+Jmd0OyBkZWNpZGVkIHRvIGFkb3B0IGluIG15IGxhcmdlciBzY3JpcHQgaXMgdG8gJiMzOTtkZWNvZGUmIzM5OyBhbGwgdGhlIGluY29taW5nIGZvcm08YnIvPiZndDsgaW5wdXQgYXMgVVRGLTggYXMgd2VsbCBhcyB0aGUgaW5wdXQgZnJvbSB0aGUgZGF0YWJhc2UgdGhhdCBJJiMzOTtsbCBiZSBtYXRjaGluZzxici8+Jmd0OyB0aGUgZm9ybSBpbnB1dCBhZ2FpbnN0LiAgVGhpcyBzZWVtcyB0byBhbGxvdyB0aGUgJiMzOTtccHtNfSYjMzk7IHN5bnRheCB0byB3b3JrIGFzPGJyLz4mZ3Q7IGV4cGVjdGVkIGluIGEgUGVybCByZWdleHAuICBJbiBteSB0ZXN0LmNnaSBzY3JpcHQgZm9yIGZvcm0gaW5wdXQgaXQgd291bGQ8YnIvPiZndDsgbGlrZSBsaWtlIHRoaXM6PGJyLz4mZ3Q7PGJyLz4mZ3Q7ICMhL3Vzci9sb2NhbC9iaW4vcGVybDxici8+Jmd0OyB1c2Ugc3RyaWN0Ozxici8+Jmd0OyB1c2UgQ0dJOzxici8+Jmd0OyB1c2UgRW5jb2RlOzxici8+Jmd0OyBteSAkcXVlcnkgPSBDR0k6Om5ldygpOzxici8+Jmd0OyBteSAkc2VhcmNoX3Rlcm0gPSBkZWNvZGUoJiMzOTtVVEYtOCYjMzk7LCRxdWVyeS0mZ3Q7cGFyYW0oJiMzOTt0ZXh0JiMzOTspKTs8YnIvPiZndDsgbXkgJHNhbnNfZGlhY3JpdGljcyAgPSAkc2VhcmNoX3Rlcm07PGJyLz4mZ3Q7ICRzYW5zX2RpYWNyaXRpY3MgPX4gcy9ccE0qLy9nOzxici8+Jmd0OyBwcmludCBxcShDb250ZW50LXR5cGU6IHRleHQvcGxhaW47IGNoYXJzZXQ9dXRmLTg8YnIvPiZndDs8YnIvPiZndDsgc2VhcmNoX3Rlcm0gICAgIGlzICRzZWFyY2hfdGVybTxici8+Jmd0OyBzYW5zX2RpYWNyaXRpY3MgaXMgJHNhbnNfZGlhY3JpdGljczxici8+Jmd0OyApOzxici8+Jmd0OyBleGl0KDApOzxici8+Jmd0Ozxici8+Jmd0OyBJJiMzOTttIHNsb3dseSBmaWd1cmluZyBvdXQgaG93IHRvIHdvcmsgd2l0aCBVbmljb2RlIGluIG15IHdlYiBzY3JpcHRzLCBidXQ8YnIvPiZndDsgc3RpbGwgaGF2ZSBhIGxvdCB0byBsZWFybi4gIFRo
YW5rcyBmb3IgYWxsIHRoZSBoZWxwLiA6LSk8YnIvPiZndDs8YnIvPiZndDsgLS0gTWljaGFlbDxici8+Jmd0Ozxici8+Jmd0OyAjIE1pY2hhZWwgRG9yYW4sIFN5c3RlbXMgTGlicmFyaWFuPGJyLz4mZ3Q7ICMgVW5pdmVyc2l0eSBvZiBUZXhhcyBhdCBBcmxpbmd0b248YnIvPiZndDsgIyA4MTctMjcyLTUzMjYgb2ZmaWNlPGJyLz4mZ3Q7ICMgODE3LTY4OC0xOTI2IG1vYmlsZTxici8+Jmd0OyAjIGRvcmFuQHV0YS5lZHU8YnIvPiZndDsgIyBodHRwOi8vcm9ja3kudXRhLmVkdS9kb3Jhbi88YnIvPiZndDs8YnIvPiZndDs8YnIvPiZndDsgJmd0OyAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLTxici8+Jmd0OyAmZ3Q7IEZyb206IERvcmFuLCBNaWNoYWVsIEQgW21haWx0bzpkb3JhbkB1dGEuZWR1XTxici8+Jmd0OyAmZ3Q7IFNlbnQ6IE1vbmRheSwgTWF5IDA1LCAyMDA4IDc6MjcgUE08YnIvPiZndDsgJmd0OyBUbzogcGVybC1pMThuQHBlcmwub3JnPGJyLz4mZ3Q7ICZndDsgQ2M6IFBlcmw0bGliPGJyLz4mZ3Q7ICZndDsgU3ViamVjdDogU3RyaXBwaW5nIG91dCBVbmljb2RlIGNvbWJpbmluZyBjaGFyYWN0ZXJzIChkaWFjcml0aWNzKTxici8+Jmd0OyAmZ3Q7PGJyLz4mZ3Q7ICZndDsgSSYjMzk7bSB0cnlpbmcgdG8gc3RyaXAgb3V0IGNvbWJpbmluZyBkaWFjcml0aWNzIGZyb20gc29tZSBmb3JtPGJyLz4mZ3Q7ICZndDsgaW5wdXQgdXNpbmcgdGhpcyBjb2RlOjxici8+Jmd0OyAmZ3Q7PGJyLz4mZ3Q7ICZndDsgJmx0O2hlYWQmZ3Q7PGJyLz4mZ3Q7ICZndDsgICAgICZsdDtNRVRBIGh0dHAtZXF1aXY9JnF1b3Q7Q29udGVudC1UeXBlJnF1b3Q7IGNvbnRlbnQ9JnF1b3Q7dGV4dC9odG1sOzxici8+Jmd0OyAmZ3Q7IGNoYXJzZXQ9VVRGLTgmcXVvdDsmZ3Q7ICZsdDsvaGVhZCZndDsgJmx0O2JvZHkmZ3Q7PGJyLz4mZ3Q7ICZndDsgICAmbHQ7Zm9ybSBhY3Rpb249JnF1b3Q7dGVzdC5jZ2kmcXVvdDsgYWNjZXB0LWNoYXJzZXQ9JnF1b3Q7VVRGLTgmcXVvdDsgbWV0aG9kPSZxdW90O2dldCZxdW90OyZndDs8YnIvPiZndDsgJmd0OyAgICAgJmx0O2lucHV0IHR5cGU9JnF1b3Q7dGV4dCZxdW90OyBuYW1lPSZxdW90O3RleHQmcXVvdDsgdmFsdWU9JnF1b3Q7JnF1b3Q7IHNpemU9JnF1b3Q7MTAmcXVvdDsmZ3Q7PGJyLz4mZ3Q7ICZndDsgICAgICZsdDtpbnB1dCB0eXBlPSZxdW90O3N1Ym1pdCZxdW90OyB2YWx1ZT0mcXVvdDtzdWJtaXQmcXVvdDsmZ3Q7PGJyLz4mZ3Q7ICZndDsgICAmbHQ7L2Zvcm0mZ3Q7PGJyLz4mZ3Q7ICZndDsgJmx0Oy9ib2R5Jmd0Ozxici8+Jmd0OyAmZ3Q7ICZsdDsvaHRtbCZndDs8YnIvPiZndDsgJmd0Ozxici8+Jmd0OyAmZ3Q7ICMhL3Vzci9sb2NhbC9iaW4vcGVybDxici8+Jmd0OyAmZ3Q7IHVzZSBDR0k7PGJyLz4mZ3Q7ICZndDsgJHF1ZXJ5ID0gQ0dJOjpuZXcoKTs8YnIvPiZndDsgJmd0OyAkc2VhcmNoX3Rlcm0gPSAkcXVlcnktJmd0O3BhcmFtKCYjMzk7dGV4dCYjMzk7KTs8YnIvPiZndDsgJmd0OyAkc2Fuc19k
aWFjcml0aWNzICA9ICRzZWFyY2hfdGVybTs8YnIvPiZndDsgJmd0OyAkc2Fuc19kaWFjcml0aWNzICA9fiBzL1xwe019Ki8vZzs8YnIvPiZndDsgJmd0OyAjJHNhbnNfZGlhY3JpdGljcyAgPX4gcy9vLy9nOzxici8+Jmd0OyAmZ3Q7IHByaW50IHFxKENvbnRlbnQtdHlwZTogdGV4dC9wbGFpbjsgY2hhcnNldD11dGYtODxici8+Jmd0OyAmZ3Q7PGJyLz4mZ3Q7ICZndDsgJHNhbnNfZGlhY3JpdGljczxici8+Jmd0OyAmZ3Q7ICk7PGJyLz4mZ3Q7ICZndDsgZXhpdCgwKTs8YnIvPiZndDsgJmd0Ozxici8+Jmd0OyAmZ3Q7PGJyLz4mZ3Q7ICZndDsgSW4gdGhlIGZvcm0sIEkmIzM5O20gaW5wdXR0aW5nIHRoZSBzdHJpbmcgJnF1b3Q7QmFydG8mI3gzMDE7ayZxdW90OyB3aXRoIHRoZTxici8+Jmd0OyAmZ3Q7IGFjY2VudGVkIGNoYXJhY3RlciBiZWluZyBhIGJhc2UgY2hhcmFjdGVyIChzbWFsbCBMYXRpbiBsZXR0ZXI8YnIvPiZndDsgJmd0OyAmcXVvdDtvJnF1b3Q7KSBmb2xsb3dlZCBieSBhIGNvbWJpbmluZyBhY3V0ZSBhY2NlbnQuICBIb3dldmVyLCB3aGVuIEk8YnIvPiZndDsgJmd0OyBwcmludCAodG8gdGhlIHdlYikgJHNhbnNfZGlhY3JpdGljcywgSSBnZXQgbXkgaW5wdXQgd2l0aCBubzxici8+Jmd0OyAmZ3Q7IGNoYW5nZSAtLSB0aGUgY29tYmluaW5nIGRpYWNyaXRpYyBpcyBzdGlsbCB0aGVyZS4gIEkga25vdzxici8+Jmd0OyAmZ3Q7IHRoYXQgbXkgaW5wdXQgaXMgbm90IGEgcHJlY29tcG9zZWQgYWNjZW50ZWQgY2hhcmFjdGVyLDxici8+Jmd0OyAmZ3Q7IGJlY2F1c2UgSSBjYW4gc3RyaXAgb3V0IHRoZSBiYXNlICZxdW90O28mcXVvdDsgYW5kIHRoZSBjb21iaW5pbmcgYWNjZW50PGJyLz4mZ3Q7ICZndDsgZWl0aGVyIHN0YW5kcyBhbG9uZSBvciBqdW1wcyB0byBhbm90aGVyIGNoYXJhY3RlciBbMl0uPGJyLz4mZ3Q7ICZndDs8YnIvPiZndDsgJmd0OyBUaGUgJnF1b3Q7XHB7TX0mcXVvdDsgaXMgYSBVbmljb2RlIGNsYXNzIG5hbWUgZm9yIHRoZSBjaGFyYWN0ZXIgY2xhc3M8YnIvPiZndDsgJmd0OyBvZiBVbmljb2RlICYjMzk7bWFya3MmIzM5OywgZm9yIGV4YW1wbGUgYWNjZW50IG1hcmtzIFsxXS4gIEkmIzM5O3ZlIHRyaWVkPGJyLz4mZ3Q7ICZndDsgdGhlc2UgdmFyaWF0aW9ucyAoYW5kIG1hbnkgb3RoZXJzKSBhbmQgbm9uZSBzZWVtIHRvIGJlIGRvaW5nPGJyLz4mZ3Q7ICZndDsgd2hhdCBJIHdhbnQ6PGJyLz4mZ3Q7ICZndDs8YnIvPiZndDsgJmd0OyAgICAgICAgJHNhbnNfZGlhY3JpdGljcyA9fiBzI1tccHtNYXJrfV0qIyNnOzxici8+Jmd0OyAmZ3Q7ICAgICAgICAkc2Fuc19kaWFjcml0aWNzID1+IHRyI1tccHtJbkNvbWJpbmluZ0RpYWNyaXRpY2FsTWFya3N9XSMjOzxici8+Jmd0OyAmZ3Q7ICAgICAgICAkc2Fuc19kaWFjcml0aWNzID1+IHRyI1tccHtNfV0jIzs8YnIvPiZndDsgJmd0OyAgICAgICAgJHNhbnNfZGlhY3JpdGljcyA9fiBzL1xwe019Ki8vZzs8YnIvPiZndDsgJmd0OyAgICAgICAgJHNhbnNfZGlhY3Jp
dGljcyA9fiBzI1tccHtNfV0jI2c7PGJyLz4mZ3Q7ICZndDsgICAgICAgICRzYW5zX2RpYWNyaXRpY3MgPX4gcyNceHswMzAxfSMjZzs8YnIvPiZndDsgJmd0OyAgICAgICAgJHNhbnNfZGlhY3JpdGljcyA9fiBzI1x4ezAwNkZ9XHh7MDMwMX0jI2c7PGJyLz4mZ3Q7ICZndDsgICAgICAgICRzYW5zX2RpYWNyaXRpY3MgPX4gcyNbXHh7MDMwMH0tXHh7MDM2Rn1dKiMjZzs8YnIvPiZndDsgJmd0Ozxici8+Jmd0OyAmZ3Q7IEkmIzM5O20gcHVsbGluZyBteSBoYWlyIG91dCBvbiB0aGlzLi4uIHNvIGFueSBoZWxwIHdvdWxkIGJlPGJyLz4mZ3Q7ICZndDsgYXBwcmVjaWF0ZWQuICBJZiB0aGVyZSYjMzk7cyBhbnkgb3RoZXIgaW5mbyBJIGNhbiBwcm92aWRlLCBsZXQgbWUga25vdy48YnIvPiZndDsgJmd0Ozxici8+Jmd0OyAmZ3Q7IE15IFBlcmwgdmVyc2lvbiBpcyA1LjguOCBhbmQgdGhlIHNjcmlwdCBpcyBydW5uaW5nIG9uIGE8YnIvPiZndDsgJmd0OyBzZXJ2ZXIgcnVubmluZyBTb2xhcmlzIDkuPGJyLz4mZ3Q7ICZndDs8YnIvPiZndDsgJmd0OyAtLSBNaWNoYWVsPGJyLz4mZ3Q7ICZndDs8YnIvPiZndDsgJmd0OyBbMV0gcGVyIGh0dHA6Ly9wZXJsZG9jLnBlcmwub3JnL3BlcmxyZXR1dC5odG1sIGFuZCBvdGhlciBkb2N1bWVudGF0aW9uPGJyLz4mZ3Q7ICZndDs8YnIvPiZndDsgJmd0OyBbMl0gdXNpbmcgJHNhbnNfZGlhY3JpdGljcyAgPX4gcy9vLy9nOzxici8+Jmd0OyAmZ3Q7PGJyLz4mZ3Q7ICZndDsgIyBNaWNoYWVsIERvcmFuLCBTeXN0ZW1zIExpYnJhcmlhbjxici8+Jmd0OyAmZ3Q7ICMgVW5pdmVyc2l0eSBvZiBUZXhhcyBhdCBBcmxpbmd0b248YnIvPiZndDsgJmd0OyAjIDgxNy0yNzItNTMyNiBvZmZpY2U8YnIvPiZndDsgJmd0OyAjIDgxNy02ODgtMTkyNiBtb2JpbGU8YnIvPiZndDsgJmd0OyAjIGRvcmFuQHV0YS5lZHU8YnIvPiZndDsgJmd0OyAjIGh0dHA6Ly9yb2NreS51dGEuZWR1L2RvcmFuLzxici8+Jmd0OyAmZ3Q7PGJyLz4mZ3Q7PGJyLz48L3A+ 2008-05-07T12:20:19Z RE: Stripping out Unicode combining characters (diacritics) - by Doran, Michael D <p>From: Doran, Michael D I received a number of helpful suggestions and solutions. The approach I decided to adopt in my larger script is to &#39;decode&#39; all the incoming form input as UTF-8 as well as the input from the database that I&#39;ll be matching the form input against. This seems to allow the &#39;\p{M}&#39; syntax to work as expected in a Perl regexp. 
In my test.cgi script for form input it would look like this: <br/> <br/>#!/usr/local/bin/perl <br/>use strict; <br/>use CGI; <br/>use Encode; <br/>my $query = CGI::new(); <br/>my $search_term = decode(&#39;UTF-8&#39;,$query-&gt;param(&#39;text&#39;)); <br/>my $sans_diacritics = $search_term; <br/>$sans_diacritics =~ s/\pM*//g; <br/>print qq(Content-type: text/plain; charset=utf-8 <br/> <br/>search_term is $search_term <br/>sans_diacritics is $sans_diacritics <br/>); <br/>exit(0); <br/> <br/>I&#39;m slowly figuring out how to work with Unicode in my web scripts, but still have a lot to learn. Thanks for all the help. :-) <br/> <br/>-- Michael <br/> <br/># Michael Doran, Systems Librarian <br/># University of Texas at Arlington <br/># 817-272-5326 office <br/># 817-688-1926 mobile <br/># doran@uta.edu <br/># http://rocky.uta.edu/doran/ <br/> <br/> <br/>&gt; -----Original Message----- <br/>&gt; From: Doran, Michael D [mailto:doran@uta.edu] <br/>&gt; Sent: Monday, May 05, 2008 7:27 PM <br/>&gt; To: perl-i18n@perl.org <br/>&gt; Cc: Perl4lib <br/>&gt; Subject: Stripping out Unicode combining characters (diacritics) <br/>&gt; <br/>&gt; I&#39;m trying to strip out combining diacritics from some form <br/>&gt; input using this code: <br/>&gt; <br/>&gt; &lt;head&gt; <br/>&gt; &lt;META http-equiv=&quot;Content-Type&quot; content=&quot;text/html; <br/>&gt; charset=UTF-8&quot;&gt; &lt;/head&gt; &lt;body&gt; <br/>&gt; &lt;form action=&quot;test.cgi&quot; accept-charset=&quot;UTF-8&quot; method=&quot;get&quot;&gt; <br/>&gt; &lt;input type=&quot;text&quot; name=&quot;text&quot; value=&quot;&quot; size=&quot;10&quot;&gt; <br/>&gt; &lt;input type=&quot;submit&quot; value=&quot;submit&quot;&gt; <br/>&gt; &lt;/form&gt; <br/>&gt; &lt;/body&gt; <br/>&gt; &lt;/html&gt; <br/>&gt; <br/>&gt; #!/usr/local/bin/perl <br/>&gt; use CGI; <br/>&gt; $query = CGI::new(); <br/>&gt; $search_term = $query-&gt;param(&#39;text&#39;); <br/>&gt; $sans_diacritics = $search_term; <br/>&gt; $sans_diacritics 
=~ s/\p{M}*//g; <br/>&gt; #$sans_diacritics =~ s/o//g; <br/>&gt; print qq(Content-type: text/plain; charset=utf-8 <br/>&gt; <br/>&gt; $sans_diacritics <br/>&gt; ); <br/>&gt; exit(0); <br/>&gt; <br/>&gt; <br/>&gt; In the form, I&#39;m inputting the string &quot;Barto&#x301;k&quot; with the <br/>&gt; accented character being a base character (small Latin letter <br/>&gt; &quot;o&quot;) followed by a combining acute accent. However, when I <br/>&gt; print (to the web) $sans_diacritics, I get my input with no <br/>&gt; change -- the combining diacritic is still there. I know <br/>&gt; that my input is not a precomposed accented character, <br/>&gt; because I can strip out the base &quot;o&quot; and the combining accent <br/>&gt; either stands alone or jumps to another character [2]. <br/>&gt; <br/>&gt; The &quot;\p{M}&quot; is a Unicode class name for the character class <br/>&gt; of Unicode &#39;marks&#39;, for example accent marks [1]. I&#39;ve tried <br/>&gt; these variations (and many others) and none seem to be doing <br/>&gt; what I want: <br/>&gt; <br/>&gt; $sans_diacritics =~ s#[\p{Mark}]*##g; <br/>&gt; $sans_diacritics =~ tr#[\p{InCombiningDiacriticalMarks}]##; <br/>&gt; $sans_diacritics =~ tr#[\p{M}]##; <br/>&gt; $sans_diacritics =~ s/\p{M}*//g; <br/>&gt; $sans_diacritics =~ s#[\p{M}]##g; <br/>&gt; $sans_diacritics =~ s#\x{0301}##g; <br/>&gt; $sans_diacritics =~ s#\x{006F}\x{0301}##g; <br/>&gt; $sans_diacritics =~ s#[\x{0300}-\x{036F}]*##g; <br/>&gt; <br/>&gt; I&#39;m pulling my hair out on this... so any help would be <br/>&gt; appreciated. If there&#39;s any other info I can provide, let me know. <br/>&gt; <br/>&gt; My Perl version is 5.8.8 and the script is running on a <br/>&gt; server running Solaris 9. 
<br/>&gt; <br/>&gt; -- Michael <br/>&gt; <br/>&gt; [1] per http://perldoc.perl.org/perlretut.html and other documentation <br/>&gt; <br/>&gt; [2] using $sans_diacritics =~ s/o//g; <br/>&gt; <br/>&gt; # Michael Doran, Systems Librarian <br/>&gt; # University of Texas at Arlington <br/>&gt; # 817-272-5326 office <br/>&gt; # 817-688-1926 mobile <br/>&gt; # doran@uta.edu <br/>&gt; # http://rocky.uta.edu/doran/ <br/>&gt; <br/></p> 2008-05-07T06:51:13Z Re: Stripping out Unicode combining characters (diacritics) by David Kaufman

From: David Kaufman

Hi Michael,

"Doran, Michael D" <doran@uta.edu> wrote:

> I'm trying to strip out combining diacritics from some form input using
> this code:
> [...]
> $sans_diacritics =~ s/\p{M}*//g;

I do it like this:

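use Encode;
use Unicode::Normalize qw(normalize);

# $utf8 must already be a decoded character string. NFKD splits accented
# letters into base + combining marks; the callback drops whatever ASCII
# cannot represent, marks included.
my $ascii = encode('ascii', normalize('KD', $utf8), sub { '' });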
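
The suggestions in this thread (decode the input first, normalize to NFD, then strip \p{M}) can be combined into one routine. What follows is only a minimal sketch, not code posted by anyone in the thread: the helper name strip_marks is made up for illustration, and it assumes the input arrives as raw UTF-8 bytes, as a CGI parameter typically would.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Hypothetical helper (name invented for this sketch): takes raw UTF-8
# bytes and returns a character string with all combining marks removed.
sub strip_marks {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);  # bytes -> Perl character string
    $chars = NFD($chars);                 # split precomposed letters into base + marks
    $chars =~ s/\p{M}+//g;                # \p{M} only matches on decoded characters
    return $chars;
}

binmode STDOUT, ':encoding(UTF-8)';
print strip_marks("Barto\xCC\x81k"), "\n";  # "o" + U+0301 as UTF-8 bytes
print strip_marks("Bart\xC3\xB3k"), "\n";   # precomposed U+00F3 as UTF-8 bytes
```

Both calls should print "Bartok". Without the decode step the regex sees the two bytes 0xCC 0x81 as separate Latin-1 characters, neither of which is a mark, which matches the symptom described in the original question. The NFD step only matters for precomposed input, but it is harmless on text that is already decomposed.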



2008-05-07T03:33:53Z
RE: Stripping out Unicode combining characters (diacritics) by Doran, Michael D <p>From: Doran, Michael D Hi Leif,<br/><br/>&gt; This is what I do. You can try that.<br/>&gt; See if it helps:<br/>&gt; <br/>&gt; Encode::_utf8_on($str); # &lt;&lt;&lt;<br/>&gt; $str =~ s/\pM*//g;<br/><br/>That works! I will gladly buy the beers Leif, should we ever meet in person.<br/><br/>&gt; I mean - have you for instance tried running your cgi scripts <br/>&gt; in tainted mode (-T)?<br/><br/>No, I do not run my CGI scripts in tainted mode (although I realize that I probably should). <br/><br/>Thanks (once again) for your help.<br/><br/>-- Michael<br/><br/># Michael Doran, Systems Librarian<br/># University of Texas at Arlington<br/># 817-272-5326 office<br/># 817-688-1926 mobile<br/># doran@uta.edu<br/># http://rocky.uta.edu/doran/<br/> <br/><br/>&gt; -----Original Message-----<br/>&gt; From: Leif Andersson [mailto:Leif.Andersson@sub.su.se] <br/>&gt; Sent: Tuesday, May 06, 2008 3:33 AM<br/>&gt; To: Doran, Michael D<br/>&gt; Subject: Re: Stripping out Unicode combining characters (diacritics)<br/>&gt; <br/>&gt; Oh, now I see your REAL question.<br/>&gt; <br/>&gt; This is what I do. You can try that.<br/>&gt; See if it helps:<br/>&gt; <br/>&gt; Encode::_utf8_on($str); # &lt;&lt;&lt;<br/>&gt; $str =~ s/\pM*//g;<br/>&gt; <br/>&gt; You are not the only one having problems with Unicode.<br/>&gt; Esp. in web programming it can be very confusing.<br/>&gt; <br/>&gt; I am quite surprised that there are not more discussions of this kind.<br/>&gt; Not even in the &quot;official&quot; channels.<br/>&gt; <br/>&gt; I mean - have you for instance tried running your cgi scripts <br/>&gt; in tainted mode (-T)?<br/>&gt; <br/>&gt; I had all my scripts set up that way. Before Unicode.<br/>&gt; But basic Unicode stuff became broken with -T enabled.<br/>&gt; Have they fixed that now?<br/>&gt; I have at least seen no mentioning of it.<br/>&gt; <br/>&gt; And screen scraping. 
If you want to mess around with <br/>&gt; javascript embedded in an HTML page, you may find that the <br/>&gt; content encoding is mixed. And Perl gets very confused <br/>&gt; getting mixed character encodings.<br/>&gt; And so do I.<br/>&gt; <br/>&gt; You may also have to deal with mixed encodings doing SQL <br/>&gt; against the Voyager database.<br/>&gt; <br/>&gt; What would we do if we could not fall back on &quot;use bytes&quot;<br/>&gt; every now and then! ;-)<br/>&gt; <br/>&gt; Leif<br/>&gt; <br/>&gt; ======================================<br/>&gt; Leif Andersson, Systems Librarian<br/>&gt; Stockholm University Library<br/>&gt; SE-106 91 Stockholm<br/>&gt; SWEDEN<br/>&gt; Phone : +46 8 162769<br/>&gt; Mobile: +46 70 6904281<br/>&gt; <br/>&gt; <br/>&gt; -----Original Message-----<br/>&gt; From: Doran, Michael D [mailto:doran@uta.edu]<br/>&gt; Sent: 6 May 2008 04:13<br/>&gt; To: Mike Rylander<br/>&gt; Cc: perl-i18n@perl.org; Perl4lib<br/>&gt; Subject: RE: Stripping out Unicode combining characters (diacritics)<br/>&gt; <br/>&gt; Hi Mike,<br/>&gt; <br/>&gt; I appreciate the quick reply. I am familiar with the <br/>&gt; Unicode::Normalize module (and will also be using that), but <br/>&gt; I left it out of this question because it&#39;s not relevant to <br/>&gt; the problem I&#39;m currently trying to solve. 
The text I&#39;m <br/>&gt; trying to strip diacritics out of does not have precomposed <br/>&gt; accented characters.<br/>&gt; <br/>&gt; -- Michael<br/>&gt; <br/>&gt; # Michael Doran, Systems Librarian<br/>&gt; # University of Texas at Arlington<br/>&gt; # 817-272-5326 office<br/>&gt; # 817-688-1926 cell<br/>&gt; # doran@uta.edu<br/>&gt; # http://rocky.uta.edu/doran/<br/>&gt; <br/>&gt; <br/>&gt; <br/>&gt; -----Original Message-----<br/>&gt; From: Mike Rylander [mailto:mrylander@gmail.com]<br/>&gt; Sent: Mon 5/5/2008 8:52 PM<br/>&gt; To: Doran, Michael D<br/>&gt; Cc: perl-i18n@perl.org; Perl4lib<br/>&gt; Subject: Re: Stripping out Unicode combining characters (diacritics)<br/>&gt; <br/>&gt; On Mon, May 5, 2008 at 8:26 PM, Doran, Michael D <br/>&gt; &lt;doran@uta.edu&gt; wrote:<br/>&gt; [snip]<br/>&gt; &gt;<br/>&gt; &gt; I&#39;m pulling my hair out on this... so any help would be <br/>&gt; appreciated. If there&#39;s any other info I can provide, let me know.<br/>&gt; &gt;<br/>&gt; <br/>&gt; You&#39;ll want to transform the text to NFD format (nominally, <br/>&gt; base characters plus combining marks) instead of NFC (precombined<br/>&gt; characters) using Unicode::Normalize:<br/>&gt; <br/>&gt; use Unicode::Normalize;<br/>&gt; <br/>&gt; my $text = NFD($original);<br/>&gt; $text =~ s/\pM+//go;<br/>&gt; <br/>&gt; Hope that helps.<br/>&gt; <br/>&gt; --<br/>&gt; Mike Rylander<br/>&gt; | VP, Research and Design<br/>&gt; | Equinox Software, Inc. / The Evergreen Experts | phone: <br/>&gt; 1-877-OPEN-ILS (673-6457) | email: miker@esilibrary.com | <br/>&gt; web: http://www.esilibrary.com<br/>&gt; <br/>&gt; <br/></p> 2008-05-06T07:26:50Z