Archon: Importing MaRC Records
June 4, 2007
As an experiment, I tried importing a file of MaRC records from our online catalog into Archon. The process was straightforward, but the results were not as good as I had hoped.
The test file contained 233 records, describing collections at a variety of levels ranging from a single-page letter to a 3000-box record group. Both official governmental records and personal papers were included, but no printed material or maps.
Pluses:
- Easy-to-follow instructions made it a quick process
- Fast way to add at least minimal records to the system
Minuses:
- Only certain subfields were imported, leading to truncated entries
- Duplicated titles and/or classifications prevented many records from being imported
- Records without field 245|a were not imported
- Not all of the notes and subject headings transferred
- Error messages did not clearly specify the problems and the culprit records
Subfields skipped
The import script only looks at certain subfields. For example, in the 245 (title) field any subfield b (subtitle) is silently dropped; only |a transfers to the Archon title field.
This truncation is particularly noticeable on subject headings. For example, one record had the heading “610 10 New Mexico.|bGovernor.” What made it into Archon was “610 0 New Mexico.”
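To picture the truncation, here is a minimal Python sketch (not Archon's actual PHP code). It splits a field on the `|` delimiter used in the examples above and compares keeping only subfield a with keeping every subfield. The function names `parse_subfields` and `join_subfields` are hypothetical, and the sketch assumes that text before the first delimiter is an implicit subfield a, as our catalog display shows it.

```python
def parse_subfields(data, delimiter="|"):
    """Split field data into (code, value) pairs.

    Assumption: text before the first delimiter is an implicit subfield a,
    as in the catalog display quoted in this post.
    """
    first, *rest = data.split(delimiter)
    pairs = [("a", first)] if first else []
    # Each remaining chunk starts with its one-character subfield code.
    pairs += [(chunk[0], chunk[1:]) for chunk in rest if chunk]
    return pairs


def join_subfields(pairs, keep=None):
    """Join subfield values, optionally keeping only the codes in `keep`."""
    return " ".join(value for code, value in pairs if keep is None or code in keep)


heading = parse_subfields("New Mexico.|bGovernor.")
only_a = join_subfields(heading, keep={"a"})  # subfield a alone
full = join_subfields(heading)                # every subfield in the field
```

Here `only_a` holds just the first part of the heading, while `full` retains the subordinate unit from subfield b.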
Duplicate titles and classifications
Archon refused to import more than one record with the same title, and it used 245|a to determine uniqueness. The test file contained six records entitled “Letter”; each had a subtitle identifying the author, recipient, date, etc. Archon imported the first record, and for the remainder declared “Could not store Collection: A Collection with the same title, sorttitle and repository code already exists in the database.”
The McEntire-Brooke family papers and the Richard McEntire papers share the same classification number, being physically housed together although described separately. Archon imported the first and rejected the second with “Could not store Collection: A Collection with the same classification and collection identifier already exists in the database.”
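Collisions like these can be found before running the import. The following is a rough Python sketch of such a pre-flight check, assuming the records have already been reduced to a title (245|a) and a classification; the function name `find_collisions`, the dictionary keys, and the sample data are all made up for illustration.

```python
from collections import defaultdict


def find_collisions(records, key):
    """Group 1-based record positions by `key`; return only groups with repeats."""
    groups = defaultdict(list)
    for position, record in enumerate(records, start=1):
        groups[record[key]].append(position)
    return {value: positions for value, positions in groups.items() if len(positions) > 1}


# Hypothetical stand-ins for records in an export file.
records = [
    {"title_a": "Letter", "classification": "Ms. Coll. 1"},
    {"title_a": "Letter", "classification": "Ms. Coll. 2"},
    {"title_a": "Smith family papers", "classification": "Ms. Coll. 3"},
    {"title_a": "John Smith papers", "classification": "Ms. Coll. 3"},
]

duplicate_titles = find_collisions(records, "title_a")
duplicate_classes = find_collisions(records, "classification")
```

Each result maps the repeated value to the positions of the records that share it, which is enough to decide which records need editing before the import.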
Records without 245 |a
Records whose title field lacked a subfield a, which is permitted under OCLC rules for archival material, could not be imported. We have a number of records that have only a form heading in subfield k (e.g., “|k Diary, |f 1868-1869”), and since the import script only looks at subfield a it could not find a valid title.
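One possible workaround, sketched in Python rather than in the importer's PHP, is to fall back on other 245 subfields when |a is missing. The function name `build_title` and the fallback order (k, then f, then g) are my own assumptions, not anything Archon does.

```python
def build_title(subfields, fallback_order=("k", "f", "g")):
    """Return a title from a list of (code, value) pairs for field 245."""
    values = {}
    for code, value in subfields:
        values.setdefault(code, value.strip())  # keep the first occurrence of each code
    if values.get("a"):
        return values["a"]
    # No subfield a: assemble a title from the fallback subfields, in order.
    parts = [values[code] for code in fallback_order if values.get(code)]
    return " ".join(parts) if parts else None


diary = build_title([("k", "Diary,"), ("f", "1868-1869")])
papers = build_title([("a", "Margaret B. Franklin Papers,"), ("f", "1883-1992.")])
```

A record with a subfield a behaves as before, and a record with neither |a nor any fallback subfield still yields no title, so it can be reported rather than guessed at.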
Subject headings and note fields
Many subject headings were silently dropped. One record had 12 subject headings (6xx) in the MaRC record; 3 were imported into Archon, and I have no idea why the others failed.
[UPDATE: Yes, I do have an idea. The import script can handle only one of each MaRC tag: one 650 topical subject heading, one 651 geographic heading, one 700 added personal author, one 541 acquisitions note, etc. For repeatable fields, the extras are lost: if a record contains two or more 650 fields, only one is imported. (See import-marc.inc.php, line 125, and the PHP manual for alternatives to array_merge.)]
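The general pattern behind this kind of loss can be shown in a few lines of Python. This is only an illustration of one-value-per-tag storage versus a list per tag, using headings from the sample record in the appendix; it does not reproduce the importer's actual logic, and the variable names are made up.

```python
from collections import defaultdict

# (tag, value) pairs taken from the sample record in the appendix.
fields = [
    ("650", "Chautauquas"),
    ("650", "Education"),
    ("650", "Lectures and lecturing"),
    ("651", "Chautauqua Lake (N.Y.)"),
]

# One slot per tag: each repeated tag replaces the value stored before it.
one_per_tag = {}
for tag, value in fields:
    one_per_tag[tag] = value

# A list per tag: repeated tags are appended, so nothing is lost.
all_per_tag = defaultdict(list)
for tag, value in fields:
    all_per_tag[tag].append(value)
```

In this sketch the last 650 is the survivor; which occurrence survives in practice depends on how the real script merges its arrays.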
Only certain note fields and subfields (listed below) were imported; the remainder simply vanished without warning. These six field and subfield combinations are the only 5xx mappings to an Archon database field. It looks like additional mappings could be added fairly easily to the importer script (import-marc.inc.php), but I did not try this.
Note fields imported:
- 506 |a [Access Restrictions]
- 520 |a [Scope]
- 541 |a [Acquisition Source]
- 541 |c [Acquisition Method]
- 541 |d [Acquisition Date Year]
- 561 |a [Custodial History]
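The list above amounts to a lookup table. As a hedged Python sketch (the real mappings live in the PHP importer, which I have not reproduced here), the table and a helper that reports what falls outside it might look like this; `NOTE_MAP` and `split_notes` are hypothetical names.

```python
# (tag, subfield) -> Archon field label, copied from the list above.
NOTE_MAP = {
    ("506", "a"): "Access Restrictions",
    ("520", "a"): "Scope",
    ("541", "a"): "Acquisition Source",
    ("541", "c"): "Acquisition Method",
    ("541", "d"): "Acquisition Date Year",
    ("561", "a"): "Custodial History",
}


def split_notes(subfields, note_map=NOTE_MAP):
    """Separate (tag, code, value) triples into mapped values and unmapped leftovers."""
    mapped, skipped = {}, []
    for tag, code, value in subfields:
        label = note_map.get((tag, code))
        if label is None:
            skipped.append((tag, code))  # report it instead of dropping it silently
        else:
            mapped.setdefault(label, []).append(value)
    return mapped, skipped


mapped, skipped = split_notes([
    ("541", "c", "Gift:"),
    ("541", "a", "Mrs. Margaret B. Franklin;"),
    ("541", "d", "1995"),
    ("555", "a", "Finding aid available in repository"),
])
```

Returning the skipped pairs alongside the mapped values makes the gaps visible, and adding a mapping becomes a one-line change to the table.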
Error messages
When I ran the import, the message screen listed the title of every record successfully imported, and one or more errors for those that failed. Unfortunately, it did not list the title or any other identifying information for those that failed, just the error generated. A sample from the log:
Imported Adair collection
Imported McEntire-Brooke family papers
Could not store Collection: A Collection with the same classification and collection identifier already exists in the database.
Imported Credit journal
Could not add Collection: Unable to insert into the database table
Only by having a list of the titles in the file, in record order, could I determine that the first error message above referred to the Richard McEntire papers, and the second to the William Henry Avery congressional papers. Moreover, the message is misleading: the Avery papers were added, but some (as yet unidentified) problem prevented adding any notes or subject headings.
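What I would have liked to see is a log that names the record on failure as well as on success. A small Python sketch of that idea follows; `run_import` and `make_store` are hypothetical, the store function is a stand-in for the database layer, and the sample titles are invented.

```python
def run_import(records, store):
    """Try to store each record; return log lines that identify every record."""
    log = []
    for position, record in enumerate(records, start=1):
        label = record.get("title") or record.get("classification") or "(untitled)"
        try:
            store(record)
        except ValueError as error:
            log.append(f"Record {position} ({label}): could not store: {error}")
        else:
            log.append(f"Record {position} ({label}): imported")
    return log


def make_store():
    """Hypothetical stand-in for the database layer: rejects repeated classifications."""
    seen = set()

    def store(record):
        if record["classification"] in seen:
            raise ValueError("classification already exists")
        seen.add(record["classification"])

    return store


log = run_import(
    [
        {"title": "First family papers", "classification": "Ms. Coll. 1"},
        {"title": "Second personal papers", "classification": "Ms. Coll. 1"},
    ],
    make_store(),
)
```

With the position and title on every line, there is no need to keep a separate list of titles in record order to work out which record an error belongs to.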
Conclusions
Of the 233 records in the test file, Archon imported 184. None, however, were imported in their entirety.
The ability to import MaRC records is a terrific feature for Archon. Unfortunately, it’s not quite ready for prime time, at least in our situation. Many, perhaps most, of the problems identified above could be fixed fairly readily by someone familiar with the PHP scripting language. (Some, in fact, may not be “problems” so much as differing cataloging practices and expectations.) In the meantime, the MaRC import serves only to bring skeletal records into Archon, not to import full MaRC records.
Appendix: Sample before-and-after record
MaRC record from the test file:
099 Ms.|aColl. 757
100 1 Franklin, Margaret Barnum,|d1905-1997
245 10 Margaret B. Franklin Papers,|f1883-1992.|g(bulk 1900-1935)
300 11|fdocument cases (5 cubic feet)
351 Organized into four series: I. Biographical Notes; II. Origins: Chautauqua, N.Y.; III. The Rural cultural
Movement; IV. Afterward
520 The Franklin Papers, donated by Mrs. Margaret Franklin who was actively involved in the chautauqua movement, focus on
the famed circuit chautauquas almost exclusively. This collection is divided into four series -- the first
pertains to Mrs. Franklin personally; Series II focuses on the Chautauqua Institute of Chautauqua Lake, New York;
Series III centers on the travelling circuit chautauquas; and Series IV covers this previously famous element of
Americana from a reminiscent view-point. The collection includes administrative records, financial records, talent
advertisement, chautauqua programs, articles and a miscellany of other material. Mrs. Franklin's donation
provides a wealth of information and material on the chautauqua movement
541 |cGift:|aMrs. Margaret B. Franklin;|d1995
545 0 Margaret Lavona (Barnum) Franklin was born in Caldwell, Kan. in 1905. In her adulthood, she was a school teacher
in a variety of places in Nebraska and Iowa. For several years, she either performed in or worked for the
chautauqua systems. She performed in at least two singing groups: the Marine Maids and Uncle Sam's Nieces. She
worked for the chautauqua systems as both a junior supervisor and as advance girl. In 1940, she married
Charles Benjamin Franklin, the president of the Associated Chautauquas of America, and with him, had two children:
Margaret Lee Franklin and Benjamin Barnum Franklin. The Franklins lived in a variety of places in the Midwest and
the east, but eventually settled in Topeka, Kan
545 Originally, Chautauqua was the name of a lake in western New York. This became the location for a Sunday school
teachers' training seminar, held in an out-door setting and conducted interdenominationally by renown Bible
scholars. Over the years, the schedule of events included more secular speakers, performers and entertainers.
However, the Chautauqua Institute never strayed from its educational and cultural focus. This idea became
immensely popular in other parts of the United States, especially in the recently settled rural, western states.
Within a few years, "chautauquas," like the first one at Lake Chautauqua, New York, could be found in a number of
locations around the country. By the end of the nineteenth century, chautuaqua companies organized to
contract talented individuals and groups and to take them "on the circuit" to towns and rural, park-like settings,
performing inside huge tents. For literally millions of people in the sparcely settled west and midwest, this was
their only form of cultural enlightenment and entertainment. However, by the early 1930s, the once
popular chautauqua movement was declining. As technological advances provided newer forms of
entertainment (the radio and sound movies) and as the once sparcely populated states gained more people --
enabling previously small towns to support their own theatres, dance halls, and libraries, the chautauquas
eventually found it impossible to compete with other diversions. By 1934, the last circuit chautauqua closed
down
555 8 Finding aid available in repository
600 10 Franklin, Charles Benjamin,|d1891-1983
650 0 Chautauquas
650 0 Education
650 0 Lectures and lecturing
651 0 Chautauqua Lake (N.Y.)
691 Topeka (Kan.)
MaRC record produced by Archon:
099 _aColl. 757
100 1 _aFranklin, Margaret Barnum
_d1905-1997
245 00 _aMargaret B. Franklin Papers
_f1883-1992
_g(bulk 1900-1935)
300 _a11.00
351 _aOrganized into four series: I. Biographical Notes; II. Origins: Chautauqua, N.Y.; III.
The Rural cultural Movement; IV. Afterward
520 2 _aThe Franklin Papers, donated by Mrs. Margaret Franklin who was actively involved in the chautauqua movement, focus on
the famed circuit chautauquas almost exclusively. This collection is divided into four series -- the first
pertains to Mrs. Franklin personally; Series II focuses on the Chautauqua Institute of Chautauqua Lake, New York;
Series III centers on the travelling circuit chautauquas; and Series IV covers this previously famous element of
Americana from a reminiscent view-point. The collection includes administrative records, financial records, talent
advertisement, chautauqua programs, articles and a miscellany of other material. Mrs. Franklin's donation
provides a wealth of information and material on the chautauqua movement
541 _aMrs. Margaret B. Franklin
_cGift:
_d1995
600 0 _aFranklin, Charles Benjamin
650 0 _aChautauquas
651 0 _aChautauqua Lake (N.Y.)

June 11, 2007 at 2:43 pm
Thanks for posting this informative result of the test. I am not all that surprised that the records did not fully import. The MARC importer we wrote was very much a bare-bones operation, but it can easily be tweaked to import records correctly in line with local practices. The file you need to tweak is import-marc.inc.php in the admin/databases folder. This is an area where the community of implementors could really help us out by improving the script so that it will deal with more MARC records. I wish we had enough time to work on this issue ourselves, but since we don’t have any MARC records at UIUC to import, it is not a high priority for us. However, if anyone else would like to take a shot at improving the script, we’ll be happy to include it in the next release.