
Merging Changes in XML Documents Using Reliable Context Fingerprints

Sebastian Rönnau, Christian Pauli, Uwe M. Borghoff
Institute for Software Technology, Universität der Bundeswehr München
Werner-Heisenberg-Weg 39, 85577 Neubiberg, Germany
Sebastian.Roennau@unibw.de

ABSTRACT

Different dialects of XML have emerged as ubiquitous document exchange formats. For effective collaboration based on such documents, the capability to propagate edit operations performed on a document is indispensable. In order to avoid the transmission of whole documents, deltas are used to describe these edit operations, allowing the construction of a new version of a document. However, patching a document with a delta it was not generated for is error-prone, and any insert or delete operations performed on the document are likely to affect all subsequent paths within that document. In this paper, we present a delta format for XML documents that uses context-aware fingerprints to identify edit operations. This allows our XML patch procedure to find the correct position of an edit operation, even if the document was updated in the meantime. Possible conflicts are detected. Experimental results show the reliability of the presented fingerprinting technique and prove the high quality of the resulting patched documents.

Categories and Subject Descriptors
I.7.1 [Document and Text Processing]: Document and Text Editing - Document management, Version control

General Terms
Algorithms, Management, Reliability

Keywords
CSCW, XML diff, XML patch, fingerprint, office applications, version control

1 INTRODUCTION

In office work, collaborative editing of documents is an everyday task. Several persons write separate parts; other persons review and comment on them. Many different tools and systems try to facilitate the exchange of documents, taking technical and organizational measures to guarantee a consistent state. This belongs to the research body known as computer-supported cooperative work (CSCW).

For office applications, XML has emerged as the lingua franca. OpenOffice, Microsoft Office, and many other applications use XML dialects for serializing and exchanging documents. Therefore, in this paper, we consider XML documents only.

Several metrics exist for categorizing XML-aware CSCW systems. Among others, they can be divided into operation-based and state-based systems [12, 13]. The main advantage of the operation-based approach is that it retains information about the evolution of a document state, thus allowing a fine-grained merge. However, in the office domain, this advantage turns into a major drawback: persons or organizations often want to hide their editing process, only accepting to exchange a document in an approved state. Therefore, we focus our work on the state-based approach.

Version control systems can serve as an example of a state-based CSCW system. The question of how version control systems and XML-based office documents interact has been discussed in [25], where an XML-aware diff tool is used to compute the changes between two versions of a document. Several implementations of such a tool have been proposed. However, only two approaches are sufficiently efficient [7, 20].

Two-way diffs can only be applied if documents evolve in a linear fashion. In most collaboration and versioning scenarios, however, it is crucial to be able to merge changes performed independently on a document [1]. As a solution, a three-way diff could be used, which compares the changed versions of a document with their nearest common ancestor [13]. An XML-aware implementation has been proposed, too [18]. However, these approaches require all three versions to be available. In ad hoc environments and loose collaboration systems, this precondition will not hold in general [24]. Low-bandwidth connections, for example over satellite systems, do not allow the transfer of the complete version either.

Another solution would be to apply a delta to a version of the document it was not computed for, an approach that is commonly used in the domain of line-based edit operations [11]. Because deltas mostly use absolute paths to identify an edit operation, this approach is both naive and unusable in the domain of XML documents. Figure 1 shows a simple example where an edit operation affects the addresses of the subsequent nodes.

[Figure 1: Applying a delta to a document version it was not computed for leads to unwanted results easily. Performing the delta marked with a dashed line would create a wrong document version.]

In order to avoid such effects, we present a technique to compute fingerprints of the context of an edit operation using hash values. This allows the patch procedure to identify the correct position of an edit operation using its context, even if it has moved in the meantime. Thus, patching an XML document with a delta it was not computed for becomes possible and reliable.

The remainder of this paper is organized as follows. We define our XML model and a delta model in Section 2. In Section 3, we propose a fingerprinting technique using the context of an edit operation, and a delta format using it. A patch procedure based upon this delta format is described in Section 4. Section 5 demonstrates the benefits of our approach using experimental results. After an examination of related work in Section 6, we conclude the paper and give an outlook on future work in Section 7.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
DocEng'08, September 16-19, 2008, São Paulo, Brazil.
Copyright 2008 ACM 978-1-60558-081-4/08/09 ...$5.00.

[...] number of nodes to walk in document order from i to j:

[Equation: piecewise definition of dist(i, j), with separate cases depending on the relative order of i and j, and 0 otherwise.]

A key question in the design of an XML delta format is how to define the address of an edit operation. With XPath, a powerful language for addressing nodes within XML documents exists [6]. However, an XPath expression typically returns a set of nodes, whereas an edit must act on a unique node. This problem can be avoided, since XPath allows addressing single nodes on a
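
The addressing problem described for Figure 1, and the idea of locating an edit by its context rather than by an absolute path, can be illustrated with a minimal sketch. It is not the delta format or patch procedure of the paper: it assumes a flat list of paragraph strings instead of an XML tree, a context of exactly one neighbour on each side, and SHA-1 as the hash function. All function names (`node_hash`, `context_of`, `insert_by_index`, `insert_by_context`) are hypothetical.

```python
import hashlib
from typing import List, Optional, Tuple

Fingerprint = Tuple[str, str]


def node_hash(text: Optional[str]) -> str:
    """Hash one node; an empty string stands for 'no neighbour' (document edge)."""
    if text is None:
        return ""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def context_of(doc: List[str], index: int) -> Fingerprint:
    """Assumed fingerprint of an insert position: hashes of the nodes before and after it."""
    before = doc[index - 1] if index > 0 else None
    after = doc[index] if index < len(doc) else None
    return node_hash(before), node_hash(after)


def insert_by_index(doc: List[str], index: int, value: str) -> List[str]:
    """Naive patch: trust the absolute position stored in the delta."""
    patched = list(doc)
    patched.insert(index, value)
    return patched


def insert_by_context(doc: List[str], index: int, value: str,
                      fingerprint: Fingerprint) -> List[str]:
    """Context-aware patch: search outward from the recorded index for a matching context."""
    before_hash, after_hash = fingerprint
    candidates = sorted(range(len(doc) + 1), key=lambda pos: abs(pos - index))
    # Pass 1: both neighbours match the recorded context.
    for pos in candidates:
        if context_of(doc, pos) == fingerprint:
            return insert_by_index(doc, pos, value)
    # Pass 2: at least one real (non-edge) neighbour matches.
    for pos in candidates:
        b, a = context_of(doc, pos)
        if (b and b == before_hash) or (a and a == after_hash):
            return insert_by_index(doc, pos, value)
    raise ValueError("conflict: no position matches the recorded context")


if __name__ == "__main__":
    base = ["par 1", "par 3"]
    # Delta computed against `base`: insert "par 2" at index 1.
    fp = context_of(base, 1)
    # Meanwhile another editor deleted the first paragraph.
    updated = ["par 3"]
    print(insert_by_index(updated, 1, "par 2"))        # ['par 3', 'par 2'] (wrong order)
    print(insert_by_context(updated, 1, "par 2", fp))  # ['par 2', 'par 3']
```

In this sketch the naive patch appends the new paragraph after "par 3" because the stored index no longer points at the intended position, whereas the context-based lookup still finds the neighbour "par 3" and inserts in front of it. If no neighbour can be found at all, the function reports a conflict instead of guessing.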

