U of M CSCI 8715 - RDF Storage and Query System for Enterprise Resource Management

RStar: An RDF Storage and Query System for Enterprise Resource Management

Li Ma, Zhong Su, Yue Pan, Li Zhang, Tao Liu
IBM China Research Laboratory, No. 7, 5th Street, ShangDi, Beijing, 100085, P.R. China
Emails: {malli, suzhong, panyue, lizhang, liutao}@cn.ibm.com

ABSTRACT
Modern corporations operate in an extremely complex environment and depend strongly on all kinds of information resources across the enterprise. Unfortunately, as an enterprise grows, its information resources become not only heterogeneous but also distributed across physically different systems and databases. How to exploit information across the enterprise effectively is becoming a critical but hard problem. In recent years, metadata, the detailed description of data, has been used to exploit information resources on the web efficiently. The World Wide Web Consortium (W3C) recommends the Resource Description Framework (RDF) as a standard for the definition and use of metadata descriptions of resources on the web. In this paper, we present an RDF storage and query system called RStar for enterprise resource management. RStar uses a relational database as the persistent data store and defines the RStar Query Language (RSQL) for resource retrieval. Most existing RDF storage and query systems have been evaluated only on small data sets, and no detailed performance analysis has been given for them. Therefore, we conduct extensive experiments on a large-scale data set to investigate the performance problem in RDF storage. Such analysis will be helpful for designing RDF storage and query systems as well as for understanding the issues that remain unsolved in RDF-based enterprise resource management. In addition, experiences and lessons learned in our implementation are presented for further research and development.

Categories and Subject Descriptors
H.3.4 [Information Systems]: Systems and Software -- Performance evaluation.
H.3.2 [Information Systems]: Information Storage.
H.3.3 [Information Systems]: Information Search and Retrieval -- Query formulation.

General Terms
Design, Performance, Experimentation, Management

Keywords
Ontology, metadata, resource management, RDF storage, RDF query language.

1. INTRODUCTION
Modern corporations operate in an extremely complex environment and depend strongly on all kinds of information resources across the enterprise. Unfortunately, as an enterprise grows, its various information resources, such as customer information and archived documents, become not only heterogeneous but also distributed across different systems and databases. How to exploit resources across the enterprise effectively is becoming a critical but hard problem.

In recent years, metadata, the detailed description of data, has been used to exploit information resources on the web efficiently. The explicit use of metadata lets machines process and understand information more easily. Metadata management has gained increasing attention with the development of the semantic web since the late 1990s [1,2]. Based on extensive discussions among many dedicated researchers and engineers, the World Wide Web Consortium (W3C) recommends the Resource Description Framework (RDF) as a standard for the definition and use of metadata descriptions of resources on the web [3-5]. The objective of RDF is to support the interoperability of metadata across different resource description communities. RDF-based metadata management gives the enterprise a unified and powerful approach to locate, interpret and transform enterprise resources distributed across physically different systems.

To illustrate briefly how metadata represented in RDF can facilitate enterprise resource management, we take as an example the Market Intelligence Portal (MIP), a project currently under way at IBM Research. In modern enterprises, information about documents is abundant but scattered across different databases and lacks integration.
The effective use of such information can optimize business processes and thus yield considerable productivity gains for individuals and the enterprise. Market Intelligence Portal aims to provide a federated, digital representation of documents and to enable virtual document management in the enterprise. The first generation of the MIP focuses on collecting and classifying documents from both the internet and the enterprise [23]. Currently, we are attempting to exploit ontology to characterize documents and their relationships for more efficient management in the next generation of the MIP. More precisely, we first construct an ontology to characterize the digital attributes of a document, such as authors, date, named entities, interested users and so on. The digital attributes are in essence the metadata of a document. The RDF representation of all documents based on the defined ontology is a huge graph, which maintains the relationships among documents and provides information on how to use and manage related resources. The RDF graph can be regarded as a proxy and can serve as a control point for the capture, management and use of document information. This will facilitate many operations in the enterprise, such as personalized e-learning.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CIKM’04, November 8-13, 2004, Washington D.C., U.S.A. Copyright 2004 ACM 1-58113-874-1/04/0011...$5.00.

Based on our experiences in the Market Intelligence Portal project, we suggest a 3-step method for enterprise resource management using the RDF representation.

• Build a powerful ontology to characterize enterprise resources. The ontology not only defines the attributes of each resource class but also describes the relationships among resource classes.
• According to the defined ontology, extract the metadata of resources and build an RDF graph to represent them.
• Store the RDF graph describing enterprise resources and provide methods to access it. Build high-level applications based on the RDF graph.

The first two issues correspond to the problems of ontology building and mapping instance data to an ontology, respectively. More details
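
The second and third steps of the method above (representing document metadata as an RDF graph, then storing it in a relational database and accessing it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `http://example.org/mip/` namespace, the property names `author`, `date` and `interestedUser`, the single three-column `triples` table, and the helper functions are hypothetical and are not the schema or API of RStar. Plain SQL stands in for a query language here; it is not RSQL.

```python
import sqlite3

# Hypothetical namespace and property names, chosen only for illustration.
EX = "http://example.org/mip/"


def build_store(triples):
    """Load (subject, predicate, object) triples into an in-memory SQLite table.

    A single three-column table is the simplest possible relational layout
    for RDF; it is an assumption of this sketch, not the RStar schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def documents_by_author(conn, author):
    """Return the subjects of all triples whose author property is `author`."""
    rows = conn.execute(
        "SELECT subject FROM triples "
        "WHERE predicate = ? AND object = ? ORDER BY subject",
        (EX + "author", author),
    )
    return [row[0] for row in rows]


# Step 2: document metadata expressed as RDF-style triples (made-up data).
triples = [
    (EX + "doc1", EX + "author", EX + "alice"),
    (EX + "doc1", EX + "date", "2004-11-08"),
    (EX + "doc2", EX + "author", EX + "alice"),
    (EX + "doc2", EX + "interestedUser", EX + "bob"),
    (EX + "doc3", EX + "author", EX + "bob"),
]

# Step 3: store the graph relationally and query it.
conn = build_store(triples)
alice_docs = documents_by_author(conn, EX + "alice")
print(alice_docs)
```

With the made-up data above, the query returns the two documents whose author is `alice`. A real system would typically normalize URIs and literals into separate tables and add indexes; the flat table is used here only to keep the idea visible.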

