
Schema Conversion Methods between XML and Relational Models

Dongwon Lee, Penn State University
Murali Mani and Wesley W. Chu, University of California, Los Angeles

In this chapter, three semantics-based schema conversion methods are presented: 1) CPI converts an XML schema to a relational schema while preserving semantic constraints of the original XML schema, 2) NeT derives a nested structured XML schema from a flat relational schema by repeatedly applying the nest operator so that the resulting XML schema becomes hierarchical, and 3) CoT takes a relational schema as input, where multiple tables are interconnected through inclusion dependencies, and generates an equivalent XML schema as output.

1 Introduction

Recently, XML [1] has emerged as the de facto standard for data formats on the web. The use of XML as the common format for representing, exchanging, storing, and accessing data poses many new challenges to database systems. Since the majority of everyday data is still stored and maintained in relational database systems, we expect that the need to convert data formats between XML and relational models will grow substantially. To this end, several schema conversion algorithms have been proposed (e.g., [2, 3, 4, 5]). Although they work well for the given applications, the XML-to-Relational or Relational-to-XML conversion algorithms only capture the structure of the original schema and largely ignore the hidden semantic constraints. To clarify, consider the following DTD that models conference publications:

    <!ELEMENT conf (title,soc,year,mon?,paper+)>
    <!ELEMENT paper (pid,title,abstract?)>

Suppose the combination of title and year uniquely identifies the conf.
Using the hybrid inlining algorithm [4], the DTD would be transformed to the following relational schema:

    conf (title,soc,year,mon)
    paper (pid,title,conf_title,conf_year,abstract)

While the relational schema correctly captures the structural aspect of the DTD, it does not enforce correct semantics. For instance, it cannot prevent a tuple t1: paper(100,'DTD...','ER',3000,'...') from being inserted. However, tuple t1 is inconsistent with the semantics of the given DTD, since the DTD implies that a paper cannot exist without being associated with a conference, and there is apparently no conference "ER-3000" yet. In database terms, this kind of violation can be easily prevented by an inclusion dependency saying "paper[conf_title,conf_year] ⊆ conf[title,year]".

The reason for this inconsistency between the DTD and the transformed relational schema is that most of the proposed conversion algorithms, so far, have largely ignored the hidden semantic constraints of the original schema.

1.1 Related Work

Schema Conversion vs. Schema Matching: It is important to differentiate the problem that we deal with in this chapter, named the schema conversion problem, from another similar one known as the schema matching problem. Given a source schema s1 and a target schema t1, the schema matching problem finds a "mapping" that relates elements in s1 to ones in t1. On the other hand, in the schema conversion problem, only a source schema s2 is given and the goal is to find a target schema t2 that is equivalent to s2. Often, the source and target schemas in the schema matching problem belong to the same data model (footnote 1) (e.g., relational model), while they belong to different models in the schema conversion problem (e.g., relational and XML models). The schema matching problem itself is a difficult problem with many important applications and deserves special attention.
For further discussion on the schema matching problem, refer to [6] (survey), [7] (latest development), etc.

Between XML and Non-relational Models: Schema conversion between different models has been extensively investigated. Historically, the trend for schema conversion has always been between consecutive models or models with overlapping time frames, as they have evolved (e.g., between Network and Relational models [8, 9], between ER and OO models [10, 11], or between UML and XML models [12, 13, 14, 15]). For instance, [16] deals with conversion problems in OODB area; since OODB is a richer environment than RDB, their work is not readily applicable to our application. The logical database design methods and their associated conversion techniques to other data models have been extensively studied in ER research. For instance, [17] presents an overview of such techniques. However, due to the differences between ER and XML models, those conversion techniques need to be modified substantially. In general, since works developed in this category are often ad hoc and were aimed at particular applications, it is not trivial to apply them to schema conversion between XML and relational models.

From XML to Relational: From XML to relational schema, several conversion algorithms have been proposed recently. STORED [2] is one of the first significant attempts to store XML data in relational databases. STORED uses a data mining technique to find a representative DTD whose support exceeds the pre-defined threshold and using the DTD, converts XML documents to relational format. Because [18] discusses template language-based conversion from DTD to relational schema, it requires human experts to write an XML-based conversion rule. [4] presents three inlining algorithms that focus on the table level of the schema conversions. On the contrary, [3] studies different performance issues among eight algorithms that focus on the attribute and value level of the schema.
Unlike these, we propose a method where the hidden semantic constraints in DTDs are systematically found and translated into relational formats [19]. Since the method is orthogonal to the structure-oriented conversion method, it can be used along with algorithms in [2, 18, 4, 3].

From Relational to XML: There have been different approaches for the conversion from the relational model to XML model, such as XML Extender from IBM, XML-DBMS, SilkRoute [20], and XPERANTO [5]. All of the above tools require the user to specify the mapping from the given relational schema to XML

Footnote 1: There are cases where schema matching problem deals with a mapping between different data models (e.g., [6]), but we believe most of such cases can be replaced by: 1) a schema conversion between different models, followed by 2) a schema matching within the same model.
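
To make the introduction's example concrete, the following is a minimal sketch of how the inclusion dependency paper[conf_title,conf_year] ⊆ conf[title,year] might be enforced as a composite foreign key. It is an illustration only, not the CPI method of this chapter. The use of SQLite through Python's sqlite3 module, the column types, the NOT NULL constraints, the choice of pid as a key, and the helper names build_db and try_insert_t1 are all assumptions introduced here.

```python
import sqlite3


def build_db(enforce_inclusion: bool) -> sqlite3.Connection:
    """Create the conf/paper tables from the introduction in an in-memory DB.

    When enforce_inclusion is True, the inclusion dependency
    paper[conf_title, conf_year] <= conf[title, year] is declared as a
    composite foreign key. Column types are assumptions for illustration.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # (title, year) uniquely identifies a conf, as supposed in the text.
    conn.execute(
        "CREATE TABLE conf ("
        " title TEXT, soc TEXT, year INTEGER, mon TEXT,"
        " PRIMARY KEY (title, year))"
    )
    fk = (
        ", FOREIGN KEY (conf_title, conf_year) REFERENCES conf (title, year)"
        if enforce_inclusion
        else ""
    )
    conn.execute(
        "CREATE TABLE paper ("
        " pid INTEGER PRIMARY KEY, title TEXT,"
        " conf_title TEXT NOT NULL, conf_year INTEGER NOT NULL,"
        " abstract TEXT" + fk + ")"
    )
    return conn


def try_insert_t1(conn: sqlite3.Connection) -> bool:
    """Try to insert tuple t1 from the text; return True if it is accepted."""
    try:
        conn.execute(
            "INSERT INTO paper VALUES (100, 'DTD...', 'ER', 3000, '...')"
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when the referenced (title, year) pair is missing from conf.
        return False


if __name__ == "__main__":
    print("structure only:", try_insert_t1(build_db(False)))
    print("with inclusion dependency:", try_insert_t1(build_db(True)))
```

Under these assumptions, the structure-only schema accepts t1 even though no conference "ER-3000" exists, while the schema carrying the inclusion dependency rejects it until a matching conf row is inserted.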

