Unformatted text preview:

R TREES A Dynamic Index Structure for Spatial Searching by A Guttman SIGMOD 1984 Shahram Ghandeharizadeh Computer Science Department University of Southern California Motivating Example Type in your street address in Google Example Cont Show me all the pizza places close by Terminology Example query is termed a spatial query R tree is a spatial index structure K D B trees are useful for point data only Exact point lookup R tree represents data objects in intervals in several dimensions Exact point and range lookups Show me the USC Salvatory Computer Science building Show me all Pizza places in a 2 mile radius of USC Salvatory Computer Science building R tree is A height balanced tree similar to B tree with index records in its leaf nodes containing pointers to data objects A node is a disk page Assumes each tuple has a unique identifier RID R Tree Leaf Nodes Leaf nodes contain index records I tuple identifier tuple identifier is RID I is an n dimensional rectangle that bounds the indexed spatial object I I0 I1 In 1 where n is the number of dimensions Ii is a closed bounded interval a b describing the extent of the object along dimension i Values for a and b might be infinity indicating an unbounded object along dimension i R Tree Non leaf nodes Non leaf nodes contain entries of the form I child pointer Child pointer is the address of a lower node in the R Tree I covers all rectangles in the lower node s entries R Tree A 2 D n 2 Example R Tree Non leaf nodes Non leaf nodes contain entries of the form I child pointer Child pointer is the address of a lower node in the R Tree I covers all rectangles in the lower node s entries Questions R Tree Non leaf nodes Non leaf nodes contain entries of the form I child pointer Child pointer is the address of a lower node in the R Tree I covers all rectangles in the lower node s entries Questions What is this R Tree Non leaf nodes Non leaf nodes contain entries of the form I child pointer Child pointer is the address of a lower node in the R Tree I covers all rectangles in the lower node s entries Questions Disk Page address R Tree Non leaf nodes Non leaf nodes contain entries of the form I child pointer Child pointer is the address of a lower node in the R Tree I covers all rectangles in the lower node s entries Questions How about this What is it R Tree Non leaf nodes Non leaf nodes contain entries of the form I child pointer Child pointer is the address of a lower node in the R Tree I covers all rectangles in the lower node s entries Questions An n dimensional rectangle I I0 I1 In 1 R tree Properties Assume 1 2 3 M Maximum number of entries in a node m M 2 N Number of records R tree has the following properties Every leaf node contains between m and M index records Root node is the exception For each index record I tuple identifier in a leaf node I is the smallest rectangle that spatially contains the n dimensional data object represented in the indicated tuple Every non leaf node has between m and M children Root node is the exception For each entry I child pointer in a non leaf node I is the smallest rectangle that spatially contains the rectangles in the child node The root node has at least two children unless it is a leaf All leaves appear on the same level Height of a tree Ceiling logmN 1 Worst case utilization for all nodes except the root is m M Searching Descend from root to leaf in a B tree manner If multiple sub trees contain the point of interest then follow all Assume EI denotes the rectangle part of an index entry E Ep denotes the tupleidentifier or child pointer Search T Root of the Rtree S Search Rectangle If T is not a leaf check each entry E to determine whether EI overlaps S For all overlapping entries invoke Search Ep S If T is a leaf check all entries E to determine whether EI overlaps S If so E is a qualifying record Insertion Similar to B trees new index records are added to the leaves nodes that overflow are split and splits propagate up the tree Insert T Root of the R tree E new index entry 1 2 3 4 Find position for new record Invoke ChooseLeaf to select a leaf node L in which to place E Add record to leaf node If L has room for E then insert E and return Otherwise invoke SplitNode to obtain L and LL containing E and all the old entries of L Propagate changes upwards Invoke AdjustTree on L also passing LL if a split was performed Grow tree taller If node split propagation caused the root to split create a new root whose children are the two resulting nodes Insertion ChooseLeaf ChooseLeaf E new index entry 1 2 3 4 Initialize Set N to be the root node Leaf check If N is a leaf return N Choose subtree Let F be the entry in N whose rectangle FI needs least enlargement to include E Resolve ties by choosing the entry with the rectangle of smallest area Descend until a leaf is reached Set N to be the child node pointed to by Fp and repeat from step 2 SplitNode Node Splitting A full node contains M entries Divide the collection of M 1 entries between 2 nodes Objective Make it as unlikely as possible for the resulting two new nodes to be examined on subsequent searches Heuristic The total area of two covering rectangles after a split should be minimized Total area is larger SplitNode Node Splitting A full node contains M entries Divide the collection of M 1 entries between 2 nodes Objective Make it as unlikely as possible for the resulting two new nodes to be examined on subsequent searches Heuristic The total area of two covering rectangles after a split should be minimized Total area is larger Node Splitting How How to find the minimum area node split 1 2 3 Exhaustive algorithm Quadratic cost algorithm Linear cost algorithm Exhaustive Algorithm Generate all possible groups and choose the best with minimum area Number of possibilities 2 to power of M 1 M 50 Number of possibilities 600 Trillion Exhaustive Algorithm Generate all possible groups and choose the best with minimum area Number of possibilities 2 to power of M 1 M 50 Number of possibilities 600 Trillion US deficit pales Quadratic Cost algorithm A heuristic to find a small area split Cost is quadratic in M and linear in the number of dimensions Pick two of the M 1 entries to be the first elements of the two new groups Choose these in a manner to waste the most area if both were put in the same group Assign remaining entries to groups one …


View Full Document

USC CSCI 585 - RTrees

Loading Unlocking...
Login

Join to view RTrees and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view RTrees and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?