SJSU CMPE 226 - XML Databases Notes


Requirements for XML Document Database Systems

Airi Salminen
Dept. of Computer Science and Information Systems
University of Jyväskylä
Jyväskylä, [email protected]

Wm. Tompa
Department of Computer Science
University of Waterloo
Waterloo, ON, Canada
+1-519-888-4567 ext. [email protected]

The shift from SGML to XML has created new demands for managing structured documents. Many XML documents will be transient representations for the purpose of data exchange between different types of applications, but there will also be a need for effective means to manage persistent XML data as a database. In this paper we explore requirements for an XML database management system. The purpose of the paper is not to suggest a single type of system covering all necessary features. Instead the purpose is to initiate discussion of the requirements arising from document collections, to offer a context in which to evaluate current and future solutions, and to encourage the development of proper models and systems for XML database management. Our discussion addresses issues arising from data modelling, data definition, and data manipulation.

Categories and Subject Descriptors
H.2.4 [Database Management]: Systems – textual databases; I.7.1 [Document and Text Processing]: Document and Text Editing – document management.

General Terms
Management, Design.

Keywords
XML, structured documents, XML database systems, data modelling, data definition, data manipulation.

1. INTRODUCTION

SGML has been a widely used markup language for defining and representing structured documents since its publication in 1986 [30]. The ongoing shift from SGML to XML is creating new demands for the management of structured documents. Compared to SGML, the variety of applications expected to use XML is much wider.
On the one hand, XML will have an extended use in the application areas where SGML and HTML have already been commonly used, for example, in technical documentation of manufacturing companies, book publishing, and Web publishing. On the other hand, XML will also be used in ways SGML and HTML were not, most notably as the data exchange format between different applications. As was the situation with dynamically created HTML documents, in the new areas there is not necessarily a need for persistent storage of XML documents. Often, however, document storage and the capability to present documents to a human reader as they are or were transmitted is important to preserve the communications among different parties in the form understood and agreed to by them.

Effective means for the management of persistent XML data as a database are needed. We define an XML document database (or more generally an XML database, since every XML database must manage documents) to be a collection of XML documents and their parts, maintained by a system having capabilities to manage and control the collection itself and the information represented by that collection. It is more than merely a repository of structured documents or of semistructured data. As is true for managing other forms of data, management of persistent XML data requires capabilities to deal with data independence, integration, access rights, versions, views, integrity, redundancy, consistency, recovery, and enforcement of standards.

A problem in applying traditional database technologies to the management of persistent XML documents lies in the special characteristics of the data, not typically found in traditional databases. Structured documents are often complex units of information, consisting of formal and natural languages, and possibly including multimedia entities. The units as a whole may be important legal or historical records.
The production and processing of structured documents in an organization may create a complicated set of documents and their components, versions and variants, covering both basic data and metadata. Thus, to accommodate structured documents and support typical applications’ needs, Arnold-Moore, Fuller, and Sacks-Davis have described a structured document management system as an “authoritative document repository” that includes the following features [5]:

• on-the-fly creation of renditions
• automatic transformations
• access control at the element level
• access to elements (component versioning)
• intensional versioning
• human-readable description of changes
• extended search capabilities
• document-based workflow

However, XML imposes yet further demands:

• Closely related W3C specifications that extend the capabilities specified in XML 1.0 [12], such as XML Namespaces [11], XML Schema [8, 27, 48], and XLink [24], should be accommodated when developing XML database solutions. The accommodation should adapt to the continuing development and re-development of the specifications.
• XML is intended especially for use on the Internet. References in XML documents refer to Internet resources, and thus XML database systems should include Internet resource management. In the Internet environments integration of the management of structured documents with the management of other kinds of documents and data is also important.
• An SGML document was always associated with a DTD¹, and the DTD could be used in many different ways to support the data management. XML documents do not always have an associated DTD.

The database research community has been actively investigating XML (see, for example, [1] and [49]). Much of the effort has been directed at using XML as a database wrapper and mediation medium, using XML to describe Web resources, storing and indexing XML in traditional database systems, understanding the interaction of DTDs with constraint and typing mechanisms, and designing query languages for XML.
In an influential paper, Maier examined XML query language proposals from the database perspective [38], but broader management issues peculiar to XML databases have not yet received much attention.

2. THE DATA MODEL

A well-defined database system is based on a well-defined data model. The complexity of XML-related data repositories and the need to integrate the management of structured documents with the management of other types of data creates a special challenge for the underlying data model. In research papers the XML data model is often simplified to a labeled tree, or a directed graph, including elements with their character data, and attributes with their values. Sometimes the elements are ordered (e.g. [28]), and other times they are not (e.g. [3]). This kind of simplified model may be
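
As an illustration of the simplified labeled-tree model mentioned in Section 2, the following minimal Python sketch turns an XML document into a tree of labeled nodes carrying attributes and character data, and shows how the ordered and unordered readings of that tree differ. It is not taken from the paper: the sample document, the tuple layout, the "#text" label for character data, and the helper names to_tree and unordered are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
SAMPLE = (
    '<memo id="m1"><to>Ann</to><from>Bob</from>'
    '<body>Ship <b>today</b>.</body></memo>'
)


def to_tree(elem):
    """Return (label, attributes, children) for an element.

    Children are kept in document order. Character data becomes
    ("#text", string) leaves, an assumed convention for this sketch.
    """
    children = []
    if elem.text:
        children.append(("#text", elem.text))
    for child in elem:
        children.append(to_tree(child))
        if child.tail:
            # Text that follows a child element belongs to the parent.
            children.append(("#text", child.tail))
    return (elem.tag, dict(elem.attrib), children)


def unordered(tree):
    """Normalize a tree so that the order of siblings is ignored."""
    if tree[0] == "#text":
        return tree
    label, attrs, children = tree
    normalized = sorted((unordered(c) for c in children), key=repr)
    return (label, tuple(sorted(attrs.items())), tuple(normalized))


memo_tree = to_tree(ET.fromstring(SAMPLE))

# Two documents that differ only in the order of their child elements.
tree_ab = to_tree(ET.fromstring("<r><a/><b/></r>"))
tree_ba = to_tree(ET.fromstring("<r><b/><a/></r>"))

differ_when_ordered = tree_ab != tree_ba
equal_when_unordered = unordered(tree_ab) == unordered(tree_ba)
```

Under the ordered reading the two small documents are different trees, while under the unordered reading they are the same. The sketch also drops everything outside elements, attributes, and character data, which is exactly the kind of simplification the labeled-tree model makes.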

