DOC PREVIEW
CMU CS 15826 - Indexing - Detailed outline

This preview shows page 1-2-3-4-5-6 out of 19 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 19 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1CMU SCS15-826: Multimedia Databasesand Data MiningSpatial Access Methods - IIIC. Faloutsos15-826 Copyright: C. Faloutsos (2005) #2CMU SCSOutlineGoal: ‘Find similar / interesting things’• Intro to DB• Indexing - similarity search• Data Mining15-826 Copyright: C. Faloutsos (2005) #3CMU SCSIndexing - Detailed outline• primary key indexing• secondary key / multi-key indexing• spatial access methods– problem dfn– z-ordering– R-trees– ...• text• ...15-826 Copyright: C. Faloutsos (2005) #4CMU SCSIndexing - more detailedoutline• R-trees– main idea; file structure– algorithms: insertion/split– deletion– search: range, nn, spatial joins– performance analysis– variations (packed; hilbert;...)15-826 Copyright: C. Faloutsos (2005) #5CMU SCSReminder: problem• Given a collection of geometric objects(points, lines, polygons, ...)• organize them on disk, to answer spatialqueries (range, nn, etc)15-826 Copyright: C. Faloutsos (2005) #6CMU SCSR-trees• z-ordering: cuts regions to pieces -> dup.elim.• how could we avoid that?• Idea: try to extend/merge B-trees and k-dtrees215-826 Copyright: C. Faloutsos (2005) #7CMU SCS(first attempt: k-d-B-trees)• [Robinson, 81]: if f is the fanout, split point-set in f parts; and so on, recursively15-826 Copyright: C. Faloutsos (2005) #8CMU SCS(first attempt: k-d-B-trees)• But: insertions/deletions are tricky (splitsmay propagate downwards and upwards)• no guarantee on space utilization15-826 Copyright: C. Faloutsos (2005) #9CMU SCSR-trees• [Guttman 84] Main idea: allow parents tooverlap!– => guaranteed 50% utilization– => easier insertion/split algorithms.– (only deal with Minimum Bounding Rectangles- MBRs)15-826 Copyright: C. Faloutsos (2005) #10CMU SCSR-trees• eg., w/ fanout 4: group nearby rectangles toparent MBRs; each group -> disk pageABCDEFGHIJ15-826 Copyright: C. Faloutsos (2005) #11CMU SCSR-trees• eg., w/ fanout 4:ABCDEFGHIJP1P2P3P4F GD EH I JA B C15-826 Copyright: C. Faloutsos (2005) #12CMU SCSR-trees• eg., w/ fanout 4:ABCDEFGHIJP1P2P3P4P1 P2 P3 P4F GD EH I JA B C315-826 Copyright: C. Faloutsos (2005) #13CMU SCSR-trees - format of nodes• {(MBR; obj-ptr)} for leaf nodesP1 P2 P3 P4A B Cx-low; x-highy-low; y-high...objptr...15-826 Copyright: C. Faloutsos (2005) #14CMU SCSR-trees - format of nodes• {(MBR; node-ptr)} for non-leaf nodesP1 P2 P3 P4A B Cx-low; x-highy-low; y-high...nodeptr...15-826 Copyright: C. Faloutsos (2005) #15CMU SCSR-trees - range search?ABCDEFGHIJP1P2P3P4P1 P2 P3 P4F GD EH I JA B C15-826 Copyright: C. Faloutsos (2005) #16CMU SCSR-trees - range search?ABCDEFGHIJP1P2P3P4P1 P2 P3 P4F GD EH I JA B C15-826 Copyright: C. Faloutsos (2005) #17CMU SCSR-trees - range searchObservations:• every parent node completely covers its‘children’• a child MBR may be covered by more thanone parent - it is stored under ONLY ONEof them. (ie., no need for dup. elim.)15-826 Copyright: C. Faloutsos (2005) #18CMU SCSR-trees - range searchObservations - cont’d• a point query may follow multiple branches.• everything works for any dimensionality415-826 Copyright: C. Faloutsos (2005) #19CMU SCSIndexing - more detailedoutline• R-trees– main idea; file structure– algorithms: insertion/split– deletion– search: range, nn, spatial joins– performance analysis– variations (packed; hilbert;...)15-826 Copyright: C. Faloutsos (2005) #20CMU SCSR-trees - insertion• eg., rectangle ‘X’ABCDEFGHIJP1P2P3P4P1 P2 P3 P4F GD EH I JA B CX15-826 Copyright: C. Faloutsos (2005) #21CMU SCSR-trees - insertion• eg., rectangle ‘X’ABCDEFGHIJP1P2P3P4P1 P2 P3 P4F GD EH I JA B CXX15-826 Copyright: C. Faloutsos (2005) #22CMU SCSR-trees - insertion• eg., rectangle ‘Y’ABCDEFGHIJP1P2P3P4P1 P2 P3 P4F GD EH I JA B CY15-826 Copyright: C. Faloutsos (2005) #23CMU SCSR-trees - insertion• eg., rectangle ‘Y’ : extend suitable parent.ABCDEFGHIJP1P2P3P4P1 P2 P3 P4F GD EH I JA B CYY15-826 Copyright: C. Faloutsos (2005) #24CMU SCSR-trees - insertion• eg., rectangle ‘Y’ : extend suitable parent.• Q: how to measure ‘suitability’ ?515-826 Copyright: C. Faloutsos (2005) #25CMU SCSR-trees - insertion• eg., rectangle ‘Y’ : extend suitable parent.• Q: how to measure ‘suitability’ ?• A: by increase in area (volume) (moredetails: later, under ‘performance analysis’ )• Q: what if there is no room? how to split?15-826 Copyright: C. Faloutsos (2005) #26CMU SCSR-trees - insertion• eg., rectangle ‘W’ABCDEFGHIJP1P2P3P4P1 P2 P3 P4F GD EH I JA B CWKK15-826 Copyright: C. Faloutsos (2005) #27CMU SCSR-trees - insertion• eg., rectangle ‘W’ - focus on ‘P1’ - how tosplit?ABCP1WK15-826 Copyright: C. Faloutsos (2005) #28CMU SCSR-trees - insertion• eg., rectangle ‘W’ - focus on ‘P1’ - how tosplit?ABCP1WK• (A1: plane sweep, until 50% of rectangles)• A2: ‘linear’ split• A3: quadratic split• A4: exponential split15-826 Copyright: C. Faloutsos (2005) #29CMU SCSR-trees - insertion & split• pick two rectangles as ‘seeds’ ;• assign each rectangle ‘R’ to the ‘closest’‘seed’seed1seed2R15-826 Copyright: C. Faloutsos (2005) #30CMU SCSR-trees - insertion & split• pick two rectangles as ‘seeds’ ;• assign each rectangle ‘R’ to the ‘closest’‘seed’• Q: how to measure ‘closeness’ ?615-826 Copyright: C. Faloutsos (2005) #31CMU SCSR-trees - insertion & split• pick two rectangles as ‘seeds’ ;• assign each rectangle ‘R’ to the ‘closest’‘seed’• Q: how to measure ‘closeness’ ?• A: by increase of area (volume)15-826 Copyright: C. Faloutsos (2005) #32CMU SCSR-trees - insertion & split• pick two rectangles as ‘seeds’ ;• assign each rectangle ‘R’ to the ‘closest’‘seed’seed1seed2R15-826 Copyright: C. Faloutsos (2005) #33CMU SCSR-trees - insertion & split• pick two rectangles as ‘seeds’ ;• assign each rectangle ‘R’ to the ‘closest’‘seed’seed1seed2R15-826 Copyright: C. Faloutsos (2005) #34CMU SCSR-trees - insertion & split• pick two rectangles as ‘seeds’ ;• assign each rectangle ‘R’ to the ‘closest’‘seed’• smart idea: pre-sort rectangles according todelta of closeness (ie., schedule easiestchoices first!)15-826 Copyright: C. Faloutsos (2005) #35CMU SCSR-trees - insertion - pseudocode• decide which parent to put new rectangleinto (‘closest’ parent)• if overflow, split to two, using (say,) thequadratic split


View Full Document

CMU CS 15826 - Indexing - Detailed outline

Download Indexing - Detailed outline
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Indexing - Detailed outline and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Indexing - Detailed outline 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?