DOC PREVIEW
USC CSCI 585 - Session14

This preview shows page 1-2-3-24-25-26 out of 26 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 26 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CSCI585CSCI585Spatial Index StructuresInstructor: Cyrus ShahabiCSCI585CSCI585Outline• Grid File• Z-ordering• Hilbert Curve• Quad Tree– PM– PR• R Tree (next session)– R* Tree– R+ TreeCSCI585CSCI585Grid File• Hashing methods for multidimensional points (extension of Extensible hashing)• Idea: Use a grid to partition the space each cell is associated with one page• Two disk access principle (exact match)CSCI585CSCI585Grid File• Start with one bucket for the whole space.• Select dividers along each dimension. Partition space into cells • Dividers cut all the way.CSCI585CSCI585Grid File• Each cell corresponds to 1 disk page.• Many cells can point to the same page.• Cell directory potentially exponential in the number of dimensionsCSCI585CSCI585Grid File Implementation• Dynamic structure using a grid directory– Grid array: a 2 dimensional array with pointers to buckets (this array can be large, disk resident) G(0,…, nx-1, 0, …, ny-1)– Linear scales: Two 1 dimensional arrays that used to access the grid array (main memory) X(0, …, nx-1), Y(0, …, ny-1)CSCI585CSCI585ExampleLinear scale X Linear scaleY Grid DirectoryBuckets/Disk BlocksCSCI585CSCI585The scales do not need to be uniformX1X2X3X4Y1Y2Y3ScalesGrid-filebucketsCSCI585CSCI585Grid File Search• Exact Match Search: at most 2 I/Os assuming linear scales fit inmemory.– First use liner scales to determine the index into the cell directory– access the cell directory to retrieve the bucket address (may cause 1 I/O if cell directory does not fit in memory)– access the appropriate bucket (1 I/O)• Range Queries:– use linear scales to determine the index into the cell directory.– Access the cell directory to retrieve the bucket addresses of buckets to visit.– Access the buckets.CSCI585CSCI585Grid File Insertions• Determine the bucket into which insertion must occur.• If space in bucket, insert.• Else, split bucket– how to choose a good dimension to split?– ans: create convex regions for buckets.• If bucket split causes a cell directory to split do so and adjust linear scales.• insertion of these new entries potentially requires a complete reorganization of the cell directory---expensive!!!CSCI585CSCI585Grid File Deletions• Deletions may decrease the space utilization. Merge buckets• We need to decide which cells to merge and a merging threshold• Buddy system and neighbor system– A bucket can merge with only one buddy in each dimension– Merge adjacent regions if the result is a rectangleCSCI585CSCI585One dimensional embedding• Example: Z-Ordering or bit-interleaving• Basic assumption: Finite precision in the representation of each coordinate, K bits (2Kvalues)• The address space is a square (image) and represented as a 2Kx 2Karray• Each element is called a pixelCSCI585CSCI585Z-Ordering– Find a linear order for the cells of the grid while maintaining “locality” (i.e., cells close to each other in space are also close to each other in the linear order)– Define this order recursively for a grid that is obtained by hierarchical subdivision of space0001 1110100100 01 10 11000110111110CSCI585CSCI585Z-ordering• Impose a linear ordering on the pixels of the image  1 dimensional problem00 01 10 1100011011ABZA= shuffle(xA, yA) = shuffle(“01”, “11”)= 0111 = (7)10ZB = shuffle(“01”, “01”) = 0011CSCI585CSCI585Z-ordering• Given a point (x, y) and the precision K find the pixel for the point and then compute the z-value• Given a set of points, use a B+-tree to index the z-values• A range (rectangular) query in 2-d is mapped to a set of ranges in 1-dCSCI585CSCI585Queries• Find the z-values that contained in the query and then the ranges 00 01 10 1100011011QArange [4, 7]QAQBQBranges [2,3] and [8,9]CSCI585CSCI585Hilbert Curve• We want points that are close in 2d to be close in the 1d• Note that in 2d there are 4 neighbors for each point where in 1d only 2.• Z-curve has some “jumps” that we would like to avoid• Hilbert curve avoids the jumpsCSCI585CSCI585Hilbert Curve- example• It has been shown that in general Hilbert is better than the other space filling curves for retrieval *• Hi (order-i) Hilbert curve for 2ix2iarray* H. V. Jagadish: Linear Clustering of Objects with Multiple Atr* H. V. Jagadish: Linear Clustering of Objects with Multiple Atributes. ACM SIGMOD Conference 1990: 332ibutes. ACM SIGMOD Conference 1990: 332--3423420114321347856910111215N=1 N=2 N=3 N=4H:[0,2N-1]d [0,2Nd-1]CSCI585CSCI585Quad Trees• Region Quadtree– The blocks are required to be disjoint– Have standard sizes (squares whose sides are power of two)– At standard locations –Based on successive subdivision of image array into four equal-size quadrants– If the region does not cover the entire array, subdivide into quadrants, sub-quadrants, etc.– A variable resolution data structureCSCI585CSCI585Example of Region Quadtree12 3456789101112131415 16171819AB C E213 4 5 6 11 12D13 14 19F15 16 17 187 8 9 10NWNESWSE331111CSCI585CSCI585PR Quadtree• PR (Point-Region)quadtree• Regular decomposition (similar to Region quadtree)• Independent of the order in which data points are inserted into it•: if two points are very close, decomposition can be very deepCSCI585CSCI585Example of PR Quadtree(0,0)(100,0)(100,100)(0,100)Seattle(1,55)Toronto(62,77)Buffalo(82,65)Denver(5,45)Chicago(35,42)Omaha(27,35)Mobile(52,10)Atlanta(85,15)Miami(90,5)ASeattle(1,55)Toronto(62,77)Buffalo(82,65)Denver(5,45)Chicago(35,42)Omaha(27,35)Mobile(52,10)Atlanta(85,15)Miami(90,5)BCEDFSubdivide into quadrants until the two points are located in different regionsCSCI585CSCI585PM Quadtree• PM (Polygonal-Map)quadtree family– PM1 quadtree, PM2 quadtree, PM3 quadtree, PMR quadtree, … etc.• PM1 quadtree– Based on regular decomposition of space– Vertex-based implementation– Criteria• At most one vertex can lie in a region represented by a quadtree leaf• If a region contains a vertex, it can contain no partial-edge that does not include that vertex• If a region contains no vertices, it can contain at most one partial-edgeCSCI585CSCI585PM QuadtreePM1 quadtreePM2 quadtreePM3 quadtreeCSCI585CSCI585Example of PM1 Quadtree(0,0)(100,0)(100,100)(0,100)• Each node in a PM quadtree is a collection of partial edges (and a vertex)• Each point record has two field (x,y)• Each partial edge has four field (starting_point,


View Full Document

USC CSCI 585 - Session14

Download Session14
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Session14 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Session14 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?