Sunday, April 18, 2010

The Metajoy of Metadata

This week the Library of Congress, Twitter, and Google made digital archiving history, as described by Wired's Ryan Singel in the Epicenter blog post "Library of Congress Archives Twitter History, While Google Searches It." It makes me wonder: how will the LOC tag and catalog Tweets? Each party has given its own perspective: the LOC's tweeter Matt Raymond made the announcement in the blog post "How Tweet It Is!...", and the Twitter Blog followed with "Tweet Preservation." Google's blog describes its new Twitter search capabilities in Google Replay. The Library's Matt Raymond explains:
We also operate the National Digital Information Infrastructure and Preservation Program www.digitalpreservation.gov, which is pursuing a national strategy to collect, preserve and make available significant digital content, especially information that is created in digital form only, for current and future generations.
This particular cataloging problem -- elements of the public Twitter timeline as digital content -- didn't exist five years ago when Allison Kaplan and Ann Riedling released the 2nd ed. of Catalog It! through Linworth Publishing.
Although I've been writing much more this week on internal discussion boards with classmates than I have been publishing here, I welcome the chance to explore a cataloging conversation with a wider community of librarians, cybrarians, folksonomists and anyone else interested. Catalog It! has been fascinating reading, and its exercises are illustrative. I appreciate that the authors presciently point readers to FRBR, and articulate (p. 11) that "The future of cataloging is focused on the organization of metadata." As I've alluded to in internal discussions, it's not clear that I can add record data for "the dog books on Mrs. Smith's reading list" (p. 13) because of the nature of our shared network catalog. Kaplan and Riedling make clear (pp. 140f) that the MARC 590 Local Notes tag won't help in this case because it won't be indexed in the system and so won't be searchable by my students, but I may be able to use Tag 526. I'll update this post when I learn from network cataloging staff whether Tag 526 is indexed by our SirsiDynix system.
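To make the indexing point concrete for myself, here is a minimal Python sketch of why a note in an unindexed tag never turns up in a search. Everything in it is an assumption for illustration: the record IDs, the sample titles, the `build_index` and `search` helpers, and above all the `INDEXED_TAGS` set. I do not yet know whether our SirsiDynix system indexes Tag 526, and this is not how that system works internally.

```python
# Hypothetical set of indexed tags. Whether 526 is really indexed in our
# catalog is exactly the open question; 590 is left out on purpose.
INDEXED_TAGS = {"245", "526"}

def build_index(records, indexed_tags):
    """Map each lowercased word to the IDs of records containing it,
    looking only at fields whose tag is in indexed_tags."""
    index = {}
    for record_id, fields in records.items():
        for tag, value in fields:
            if tag not in indexed_tags:
                continue  # unindexed fields are invisible to search
            for word in value.lower().split():
                index.setdefault(word, set()).add(record_id)
    return index

def search(index, term):
    """Return the sorted record IDs whose indexed fields contain term."""
    return sorted(index.get(term.lower(), set()))

# Two made-up records carrying the same note in different tags.
records = {
    "rec1": [("245", "Shiloh"), ("590", "Mrs. Smith's reading list")],
    "rec2": [("245", "Because of Winn-Dixie"), ("526", "Mrs. Smith's reading list")],
}

index = build_index(records, INDEXED_TAGS)
print(search(index, "reading"))  # only the record with the note in 526
print(search(index, "shiloh"))   # titles remain searchable either way
```

Under these assumptions the note is findable only on the record that carries it in the indexed tag, which is why the choice of tag matters more than the wording of the note.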

I'm still coming to understand MARC records in the broader context of standardized metadata. In describing its "Metadata for Digital Content" group, which is working to meet the challenge of remediating metadata, the LOC site explains:
The MDC group members include catalogers, programmers and digital project managers, and represent different service units of the Library concerned with digital content. All are united by the common need for more effective descriptive metadata, which is of increasing importance for the burgeoning amounts of new digital material added to the Library’s website every day. In studying the question of "what are users looking for, and can they find it?," the group determined that the overall quality of the online bibliographic records plays a big part in success or failure. So, how can the records be structured to help users discover relevant resources when they search?  ...
The group has made considerable progress through the creation of a master list of standardized metadata elements used to map existing digital collection records to a single XML metadata scheme. The XML metadata uses the Metadata Object Description Schema.
The official MODS website further explains:
As an XML schema it is intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation of original resource description records. It includes a subset of MARC fields and uses language-based tags rather than numeric ones, in some cases regrouping elements from the MARC 21 bibliographic format.
MODS does have limitations, including some that seem significant to me:
MODS includes a subset of data from the MARC 21 Format for Bibliographic Data. As an element set that allows for the representation of data already in MARC-based systems, it is intended to allow for the conversion of core fields while some specific data may be dropped. As an element set for original resource description, it allows for a simple record to be created in some cases using more general tags than those available in the MARC record.

However, the schema does not target round-tripability with MARC 21. In other words, an original MARC 21 record converted to MODS may not convert back to MARC 21 in its entirety without some loss of specificity in tagging or loss of data. In some cases if reconverted into MARC 21, the data may not be placed in exactly the same field that it started in because a MARC field may have been mapped to a more general one in MODS. However the data itself will not be lost, only the detailed identification of the type of element it represents. In other cases the element in MARC may not have an equivalent element in MODS and then the specific data could be lost when converting to MODS.
This discussion is not as hypothetical as it may sound, as we are working in our library this year to add records of our streaming media to our catalog to make them easier for teachers and students to find. Many of our ebooks are also included in our online catalog.
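The round-trip loss that the MODS documentation describes can be sketched in a few lines of Python. This is a toy illustration only: the two mapping tables below are simplified assumptions of mine and are not the Library of Congress's MARC-to-MODS crosswalk, tag 999 stands in for any field with no equivalent, and the helper names `marc_to_mods` and `mods_to_marc` are hypothetical.

```python
# Toy crosswalk, NOT the official mapping: two specific MARC note tags
# collapse into one general MODS-style element.
MARC_TO_MODS = {
    "245": "titleInfo",
    "500": "note",
    "504": "note",
}

# For the return trip, each general element can name only one MARC tag.
MODS_TO_MARC = {"titleInfo": "245", "note": "500"}

def marc_to_mods(record):
    """Convert (tag, value) pairs; return (converted, dropped)."""
    converted, dropped = [], []
    for tag, value in record:
        if tag in MARC_TO_MODS:
            converted.append((MARC_TO_MODS[tag], value))
        else:
            dropped.append((tag, value))  # no equivalent element: data lost
    return converted, dropped

def mods_to_marc(mods_record):
    """Convert (element, value) pairs back to (tag, value) pairs."""
    return [(MODS_TO_MARC[element], value) for element, value in mods_record]

original = [
    ("245", "Catalog It!"),
    ("504", "Includes bibliographical references."),
    ("999", "Made-up local field with no equivalent in this toy mapping"),
]

mods, dropped = marc_to_mods(original)
round_tripped = mods_to_marc(mods)

print(round_tripped)  # the 504 note returns, but under the general 500 tag
print(dropped)        # the 999 field did not survive the conversion
```

Both kinds of loss from the quoted passage show up here: the bibliography note keeps its text but loses its specific tag, and the field with no equivalent disappears entirely.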

025.431 : The Dewey Blog is one of my favorite blogs, and I learn something from every post, even while it reminds me that I'm not a professional cataloger. I've found that OCLC's experimental Classify service has significantly increased my confidence in assigning Dewey numbers, especially when it reinforces my hunches or suggests another level of precision that makes sense to me. [Melvil Dewey photo (in his younger, happier years?) retrieved from http://www.fbi.fh-koeln.de/institut/projekte/ddc/DDCen/index.html 4/18/2010]

1 comment:

  1. I got a little dizzy reading this! The whole world is out there waiting to be cataloged. I had to put my head between my knees and take some deep breaths and remember that I am not the one that needs to do it. Fascinating work!
