1 Introduction2 Extensible Indexing2.1 Declarative Integration2.2 Relational Implementation3 Related Work3.1 Non-replicating Spatial Access Methods3.2 Replicating Spatial Access Methods4 Management of Interval Sequences4.1 The Relational Interval Tree 14.2 Interval Query Processing4.3 Gap Optimization for Interval Sequences4.4 Integrating Inner Queries4.5 Final Optimized Algorithm5 Spatially Extended Objects5.1 Mapping Extended Objects to Interval Sequences5.2 Controlling Accuracy and Redundancy6 Empirical Evaluation6.1 Experimental Setup6.2 Redundancy, Accuracy, and Storage6.3 Query Processing6.4 Impact of Space-Filling Curves7 ConclusionsInterval Sequences: An Object-Relational Approach to Manage Spatial DataHans-Peter Kriegel, Marco Pötke, and Thomas SeidlUniversity of Munich, Institute for Computer ScienceOettingenstr. 67, 80538 Munich, GermanyTel. +49-89-2178-2191, Fax: +49-89-2178-2192{kriegel,poetke,seidl}@dbs.informatik.uni-muenchen.deAbstract. The design of external index structures for one- and multidimensionalextended objects is a long and well studied subject in basic database research.Today, more and more commercial applications rely on spatial datatypes andrequire a robust and seamless integration of appropriate access methods intoreliable database servers. This paper proposes an efficient, dynamic and scalableapproach to manage one-dimensional interval sequences within off-the-shelfobject-relational database systems. The presented technique perfectly fits to theconcept of space-filling curves and, thus, generalizes to spatially extended objectsin multidimensional data spaces. Based on the Relational Interval Tree, the methodis easily embedded in modern extensible indexing frameworks and significantlyoutmatches Linear Quadtrees and Relational R-trees with respect to usability,concurrency, and performance. As demonstrated by our experimental evaluation onan Oracle server with real GIS and CAD data, the competing methods areoutperformed by factors of up to 4.6 (Linear Quadtree) and 58.3 (Relational R-tree)for query response time.1 IntroductionAfter two decades of temporal and spatial index research, the efficient management ofone- and multidimensional extended objects has become an enabling technology formany novel database applications. The interval, or, more generally, the sequence of in-tervals, is a basic datatype for temporal and spatial data. Interval sequences are used tohandle finite domain constraints [Ram97] or to represent periods on transaction or validtime dimensions [TCG+93]. Typical applications of one-dimensional interval sequenc-es include the temporal tracing of user activity for service providers: a query like “Findall customers who were online last month between 5 and 6 pm” maps to an intersectionquery of interval sequences on a database storing online periods of all registered users.When applied to space-filling curves, interval sequences naturally represent spatiallyextended objects with even intricate shapes. By expressing spatial region queries as in-terval sequence intersections, vital operations for two-dimensional GIS and environ-mental information systems [MP94] can be supported. Efficient and scalable databasesolutions are also required for two- and three-dimensional CAD applications to copewith rapidly growing amounts of dynamic data and highly concurrent workflows. Suchapplications include the digital mock-up of vehicles and airplanes [BKP98], haptic sim-ulations [MPT99], or general engineering data management [KMPS01].C.S. Jensen et al. (Eds.): SSTD 2001, LNCS 2121, pp. 481−501, 2001. Springer-Verlag Berlin Heidelberg 2001For commercial use, a seamless and capable integration of temporal and spatial in-dexing into industrial-strength databases is essential. Unfortunately, most commerciallyrelevant database systems provide no built-in access method for temporal and spatialdatatypes [CCF+ 99], nor do they offer a generic framework to facilitate the integrationof user-defined search trees based on disk-blocks, as proposed by Hellerstein, Naughtonand Pfeffer [HNP 95]. In this paper, we therefore follow the paradigm of using relation-al access methods [KPS 00] to integrate index support for temporal and spatialdatatypes. As access methods of this class are designed on top of the pure SQL layer,they can be easily implemented on virtually any available relational database server.Replicating techniques based on the Linear Quadtree [TH 81] [OM 88] [Sam 90b][IBM 98] [Ora 99b] [RS 99] [FFS 00] decompose spatial objects into tiles which corre-spond to constrained segments on a space-filling curve. In contrast, our new techniquesupports arbitrary intervals across tile boundaries, and therefore, yields a significantlylower redundancy. It is based on the Relational Interval Tree (RI-tree) [KPS 00], a rela-tional adaption of the main-memory Interval Tree [Ede 80].The remainder of the paper is organized as follows: Section 2 reviews the benefitsand limitations of available extensible indexing frameworks. Section 3 surveys the re-lated work on relational access methods for spatial data. Section 4 describes the applica-tion of the RI-tree to store and retrieve interval sequences. Section 5 generalizes ourtechnique to multidimensional applications by mapping spatially extended objects to in-terval sequences on a space-filling curve. Following an experimental evaluation inSection 6 on 2D-GIS and 3D-CAD databases, the paper is concluded in Section 7.2 Extensible IndexingExtensible indexing frameworks, as already proposed by Stonebraker [Sto 86], enabledevelopers to extend the set of built-in index structures by custom access methods in or-der to support user-defined datatypes and predicates. This section discusses the mainproperties of extensible indexing within object-relational database systems, includingOracle8i Server [Ora 99a] [SMS+ 00], IBM DB2 Universal Database [IBM 99][CCF+ 99] or Informix Universal Server [Inf 98] [BSSJ 99].2.1 Declarative IntegrationAn object-relational indextype encapsulates stored functions for opening and closing anindex scan, iterating over resulting records, and performing insert, delete and replaceoperations on the indexed table. It is complementary to the functional implementation ofuser-defined predicates. The indextype also implements functions for the estimation ofquery selectivity and processing cost. The custom computation of persistent statisticsand histograms is triggered by the usual administrative
View Full Document