R TREES A DYNAMIC INDEX STRUCTURE FOR SPATIAL SEARCHING Antomn Guttman University of Cahforma Berkeley Abstract In order to handle spatial data efficiently as required in computer aided design and geo data applications a database system needs an mdex mechanism that ti help it retrieve data items quickly accordmg to their spatial locations However traditional mdexmg methods are not well suited to data oblects of non zero size located m multidimensional spaces In this paper we describe a dynarmc mdex structure called an R tree winch meets this need and give algorithms for searching and updatmg it We present the results of a series of tests which indicate that the structure performs well and conclude that it is useful for current database systems m spatial applications 1 Intxoduction Spatial data oblects often cover areas m multi dimensional spaces and are not well represented by pomt locations For example map objects like counties census tracts etc occupy regions of non zero size m two dnnenslons A common operation on spatial data 1s a search for all oblects m an area for example to find all counties that have land mthm 20 nnles of a particular pomt Tl s kmd of spatial search occurs frequently m computer tided design CAD and geo data applications and therefore it 1s unportant to be able to retneve oblects efficiently according to their spatial location An mdex based on obJects spatial locations 1s desirable but classical onedunenaonal database mdexmg structures are not appropriate to multi dimensional spatial searchmg Structures based on exact matchmg of values such as hash tables are not useful because a range search 1s requed Structures usmg onednnenslonal ordermg of key values such as B trees and ISAM mdexes do not work because the search space is multldnnenslonal A number of structures have been proposed for handling muhi dimensional point data and a survey of methods can be found m 5 Cell methods 4 8 16 are not good for dynamic structures because the cell boundmes must be decided m advance Quad trees i and k d trees 3 do not take pagmg of secondary memory into account K D B trees 13 are designed for paged memory but are useful only for pomt data The use of mdex mtervals has been suggested m 15 but tlus method cannot be used m multiple dnnensions Corner stitchmg 121 is an example of a structure for two dimensional spatial searchmg smtable for data objects of nonzero size but it assumes homogeneous pr mary memory and 1s not e lent for random searches m very large collections of data Grid files lo handle non pomt data by mapping each object to a point in a llus research was sponsored by National Science Foundation grant ECS 8300463 and An Force Ofi ce of Scientific Research grant AFOSR 83 0254 Pcrnuwon prowled to copy that the mthout commcrctal advantage pubbcauon and pcrnusslon of the othc sc 0 1984 the its date ACM appear Assoctauon or to rcpubbsh ACM fee all copses are not of part made rqmrcs matcnal for copyright and for of tlus or dmtnbutai nottce nohcc Computmg a fee and or O 89791 128 8 84 006 0047 and u gwcn that Macluncry spcctfii Is granted drrcct the tltk of the copying To L by copy pernuwon 00 75 47 higher cllmenslonal space In this paper we descnbe an alternative structure called an R tree wmch represents data objects by mtervals in several dnnenslons Section 2 outhnes the structure of an R tree and Section 3 gives algornhms for searchmg msertmg deletmg and updatmg Results of R tree mdex performance tests are presented m Section 4 Section 5 contams a summary of our conclusions Every leaf node contalns between m and Y mdex records unless it 1s the root each mdex record 2 For I tuple zdent er m a leaf node I 1s the smallest rectangle that spatially contams the n dnnenslonal data obJect represented by the mdlcated tuple 3 Every non leaf node has between m and M chndren unless it 1s the root ur a 4 For each entry I child poznter non leaf node I 1s the smallest rectangle that spatially contams the rectangles m the child node 5 The root node has at least two cmdren unless it is a leaf 6 All leaves appear on the same level Figure 2 la and 2 lb show the structure of an R tree and illustrate the contamment and overlappmg relatlonshps that can exist between its rectangles The height of an R tree co tamm N index records is at most pg 1 4 because the branchmg factor of each node is at least rn The maximum number of Cl1 2 R Tree Index Structure An R tree 1s a height balanced tree slrmlar to a B tree Z 61 Pnth mdex records in its leaf nodes contammg pomters to Nodes correspond to disk data objects pages If the mdex 1s Sk resident and the structure 1s designed so that a spatial search requnes visltmg only a small number of nodes The mdex 1s completely dynannc inserts and deletes can be mterrmxed pnth searches and no penodlc reorgamzatlon 1s requn ed A spatial database consists of a collection of tuples representmg spatial objects and each tuple has a umque ldenttier wluch can be used to retneve it Leaf nodes m an R tree contam mdex record entnes of the form nodes I tupte w enCtfier 1s 1 Worst case space ut at on for all nodes except the where tu e cdentijier refers to a tuple m root is m M Nodes pvlll tend to have more the database and I 1s an n dunenaonal than m entnes and ths will decrease tree rectangle wlvch 1s the boundmg box of the height and nnprove space utfizatlon If spatial object mdexed nodes have more than 3 or 4 entnes the tree 1s very mde and almost all the space mdex Here n 1s the number of dnnenaons and JT 1s used for leaf nodes con rung records The parameter m can be vaned is a closed bounded mterval a b descnbas part of performance tumng and mg the extent of the object along dnnendflerent values are tested expenmentally 4 may have one or sion i Alternatively m Section 4 both endpoints equal to mfhuty mdlcatmg outward object extends the that 3 Searchmg and Updating contam Non leaf nodes mdefimtely entnes of the form 3 1 Searching The search algorithm descends the tree I child powder from root m a manner snnrlar to a Bwhere chdd poznter 1s the address of a tree the However more than one subtree lower node in the R tree and I covers all under a node vlslted may need to be rectangles m the lower node s entnes searched hence It 1s not possible to Let Y be the maxmum number of guarantee good worst case performance entn3 that snll At m one node and let Nevertheless w h most kmds of data the update algonthms ti mamtam the tree m ml 2 be a parameter speclfymg the a form
View Full Document
Unlocking...