
Dublin Core Metadata Harvested through OAI-PMH




Amy S. Jackson, Myung-Ja Han, Kurt Groetsch, Megan Mustafoff, Timothy W. Cole. (2008). Dublin Core Metadata Harvested through OAI-PMH. In Journal of Library Metadata, Vol. 8, no. 1. Preprint.

Abstract

The introduction in 2001 of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) increased interest in and awareness of metadata quality issues relevant to digital library interoperability and the use of harvested metadata to build union catalogs of digital information resources. Practitioners have offered wide-ranging advice to metadata authors and have suggested metrics useful for measuring the quality of shareable metadata. Is there evidence of changes in metadata practice in response to such advice and/or as a result of an increased awareness of the importance of metadata interoperability? This paper examines metadata records harvested over a six-year period by the University of Illinois at Urbana-Champaign and reports on quantitative and qualitative analyses of changes observed over time in shareable metadata quality.

Introduction

The importance of descriptive practice is not a new theme in the library domain; however, the widespread adoption of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and the Dublin Core Metadata scheme has led digital library practitioners to examine the characteristics of shareable, non-MARC descriptive metadata records. The IMLS/NISO Framework of Guidance for Building Good Digital Collections, first published in 2001, emphasizes the importance of disseminating descriptive metadata that supports interoperability. Following the publication of this document, additional concrete advice on how to create metadata well suited for sharing has been offered in several venues (Digital Library Forum and the National Science Digital Library, 2005; Elings & Waibel, 2007; Hutt & Riley, 2005; Shreeves, Riley, & Milewicz, 2006a; Zeng & Chang, 2006; Dushay & Hillman, 2003). Less frequently discussed, however, is how institutions are implementing Dublin Core in practice (Ward, 2004).

The following article discusses quantitative and qualitative observations of Dublin Core metadata records harvested by two cultural heritage service providers at the University of Illinois at Urbana-Champaign (UIUC). The examination focuses on changes in metadata practices over time, as well as observations of inaccurate and inconsistent mappings to Dublin Core. Researchers originally hoped to find indications of metadata becoming more shareable as digital projects mature, but findings did not support this hypothesis.

UIUC Metadata Portals

UIUC provides access to descriptive metadata harvested with OAI-PMH through several portals, including the Institute of Museum and Library Services Digital Collections and Content Project (IMLS DCC), located at http://imlsdcc.grainger.uiuc.edu, and the Committee on Institutional Cooperation (CIC) Metadata Portal, located at http://cicharvest.grainger.uiuc.edu.

The IMLS DCC portal harvests metadata from cultural heritage projects funded by the Institute of Museum and Library Services (IMLS). Eighty-five percent of the records in this portal represent images, and fourteen percent represent texts. The IMLS DCC project staff interacted with several data providers regarding technical specifications and administrative information but gave relatively little feedback to individual metadata providers regarding metadata quality. The project allowed for general presentations and publications stressing the importance of shareable metadata quality, including presentations at IMLS WebWise Conferences (Cole & Shreeves, 2004) and publications in other venues (Shreeves, Riley, & Milewicz, 2006a). Conversations with data providers regarding mapping best practices were not within the scope of the project.

The CIC Metadata Portal aggregates metadata describing resources held at participating CIC institutions. Most of these objects are cultural heritage resources. Construction of the CIC Metadata Portal allowed for substantial interaction between the service provider and data providers, including exchange of shareable metadata and mapping best practices, and feedback was given on a repository-by-repository basis. Table 1 provides information regarding the size of the IMLS DCC and CIC Metadata portals.

                                                            IMLS DCC Portal    CIC Metadata Portal
Number of records                                           300,000            630,441
Number of contributing repositories                         35                 28
Range of records harvested from contributing repositories   35 to 65,000       13 to 300,000
Average number of records harvested per repository          7,425              25,000
Median number of records harvested by repository            1,281              6,973

Table 1: IMLS DCC and CIC Metadata Portals

This study analyzed metadata records in the IMLS DCC portal in depth, and observations from the CIC Metadata Portal confirmed IMLS DCC findings. All records in this study were created between January 1, 2001 and December 31, 2006 and were stored and accessed on a Microsoft SQL Server. SQL queries were used for the quantitative analysis, and the qualitative analysis was performed by examining individual XML files as originally harvested.

DC and OAI-PMH

The decision by the OAI-PMH technical committee to require Dublin Core was controversial when first made and continues to be seen as a negative in some settings (Cole & Foulonneau, 2007; Lagoze, 2004; Van de Sompel, Young, & Hickey, 2003; Chavez et al., 2006). Many in the library community are concerned about its lack of richness and specificity (Lagoze, 2001). However, one of the strengths of the schema is its ability to act as a lowest common denominator among various richer schemas, and findings indicate that use of the schema is increasing in IMLS National Leadership Grant (NLG) digitization projects (Palmer, Zavalina, & Mustafoff, 2007).

The Dublin Core Metadata Element Set (DCMES) has fifteen elements, all of which are optional and repeatable. These elements are Contributor, Coverage, Creator, Date, Description, Format, Identifier, Language, Publisher, Relation, Rights, Source, Subject, Title, and Type. A previous study (Shreeves et al., 2005) and best practices published by an IMLS DCC collection identify eight of these elements as significant to the completeness of a metadata record and most helpful for search and discovery. These elements are title, creator, subject, description, date, format, identifier, and rights.

Analysis of the IMLS DCC records
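As an illustration only, the eight elements named above lend themselves to a simple presence check on a single harvested record. The sketch below is not the procedure used in the study, which relied on SQL queries against Microsoft SQL Server; it assumes the record is an oai_dc XML fragment, and the sample record and the function names are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core elements carried inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The eight elements the text lists as most helpful for search and discovery.
CORE_ELEMENTS = ("title", "creator", "subject", "description",
                 "date", "format", "identifier", "rights")

# Hypothetical sample record, invented for illustration (not from the study).
SAMPLE_RECORD = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Main Street, looking north</dc:title>
  <dc:subject>Streets</dc:subject>
  <dc:subject>Photographs</dc:subject>
  <dc:date>1910</dc:date>
  <dc:identifier>http://example.org/item/1</dc:identifier>
  <dc:rights>   </dc:rights>
</oai_dc:dc>
"""


def core_element_counts(record_xml):
    """Count non-empty occurrences of each core element in one record."""
    root = ET.fromstring(record_xml)
    counts = dict.fromkeys(CORE_ELEMENTS, 0)
    prefix = "{" + DC_NS + "}"
    for child in root.iter():
        # ElementTree spells qualified names as "{namespace}localname".
        if child.tag.startswith(prefix):
            local = child.tag[len(prefix):]
            # Elements are repeatable, so count each one; skip blank values.
            if local in counts and (child.text or "").strip():
                counts[local] += 1
    return counts


def missing_core_elements(record_xml):
    """List the core elements that have no non-empty value in the record."""
    counts = core_element_counts(record_xml)
    return [name for name in CORE_ELEMENTS if counts[name] == 0]


if __name__ == "__main__":
    print(core_element_counts(SAMPLE_RECORD))
    print(missing_core_elements(SAMPLE_RECORD))
```

Treating a whitespace-only value as absent is a design choice of this sketch, made because an empty element contributes nothing to search and discovery; a stricter or looser rule may suit other collections.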

