DOC PREVIEW
CMU CS 15826 - Lecture

This preview shows page 1-2-3 out of 8 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1CMU SCS15-826: Multimedia Databasesand Data MiningMulti-key and Spatial Access Methods - IC. Faloutsos15-826 Copyright: C. Faloutsos (2005) 2CMU SCSOutlineGoal: ‘Find similar / interesting things’• Intro to DB• Indexing - similarity search• Data Mining15-826 Copyright: C. Faloutsos (2005) 3CMU SCSIndexing - Detailed outline• primary key indexing• secondary key / multi-key indexing• spatial access methods• text• ...15-826 Copyright: C. Faloutsos (2005) 4CMU SCSSec. key indexing• attributes w/ duplicates (eg., EMPLOYEES,with ‘job-code’)• Query types:– exact match– partial match• ‘job-code’ = ‘PGM’ and ‘dept’ =‘R&D’– range queries• ‘job-code’ =‘ADMIN’ and salary < 50K15-826 Copyright: C. Faloutsos (2005) 5CMU SCSSec. key indexing• Query types - cont’ d– boolean• ‘job-code’ =‘ADMIN’ or salary>20K– nn• salary ~ 30K15-826 Copyright: C. Faloutsos (2005) 6CMU SCSSolution?215-826 Copyright: C. Faloutsos (2005) 7CMU SCSSolution?• Inverted indices (usually, w/ B-trees)• Q: how to handle duplicates?Name Job-code Salary DeptSmith PGM 70 R&DJones ADMIN 50 R&D….Tomson ENG 50 SALESsalary-index507015-826 Copyright: C. Faloutsos (2005) 8CMU SCSSolution• A#1: eg., with postings listsName Job-code Salary DeptSmith PGM 70 R&DJones ADMIN 50 R&D….Tomson ENG 50 SALESsalary-index5070postings lists15-826 Copyright: C. Faloutsos (2005) 9CMU SCSSolution• A#2: modify B-tree code, to handle dup’ sName Job-code Salary DeptSmith PGM 70 R&DJones ADMIN 50 R&D….Tomson ENG 50 SALESsalary-index50705015-826 Copyright: C. Faloutsos (2005) 10CMU SCSHow to handle BooleanQueries?Name Job-code Salary DeptSmith PGM 70 R&DJones ADMIN 50 R&D….Tomson ENG 50 SALESsalary-index507050• eg., ‘sal=50 AND job-code=PGM’ ?15-826 Copyright: C. Faloutsos (2005) 11CMU SCSHow to handle BooleanQueries?Name Job-code Salary DeptSmith PGM 70 R&DJones ADMIN 50 R&D… .Tomson ENG 50 SALESsalary-index507050– from indices, find lists of qual. record-ids– merge lists (or check real records)15-826 Copyright: C. Faloutsos (2005) 12CMU SCSSec. key indexing• easily solved in commercial DBMS:create index s al-index on E MPL OYE E(salary);select * from E MPL OYE Ewhere salary > 50 andjob-code = ‘ADMIN’315-826 Copyright: C. Faloutsos (2005) 13CMU SCSSec. key indexing• can create combined indices:create index s j on E MP L OYE E ( s alary,job-code);15-826 Copyright: C. Faloutsos (2005) 14CMU SCSIndexing - Detailed outline• primary key indexing• secondary key / multi-key indexing– main memory: quad-trees– main memory: k-d-trees• spatial access methods• text• ...15-826 Copyright: C. Faloutsos (2005) 15CMU SCSQuad-trees• problem: find cities within 100mi fromPittsburgh• assumption: all fit in main memory• Q: how to answer such queries quickly?15-826 Copyright: C. Faloutsos (2005) 16CMU SCSQuad-trees• A: recursive decomposition of space, e.g.:PGHATLPHL15-826 Copyright: C. Faloutsos (2005) 17CMU SCSQuad-trees• A: recursive decomposition of space, e.g.:PGHATLPHL(30,10)3010SW15-826 Copyright: C. Faloutsos (2005) 18CMU SCSQuad-trees• A: recursive decomposition of space, e.g.:PGHATLPHL(30,10)3010SW204040,20NE415-826 Copyright: C. Faloutsos (2005) 19CMU SCSQuad-trees - search?• find cities with (35<x<45, 15<y<25):PGHATLPHL(30,10)3010SW204040,20NE15-826 Copyright: C. Faloutsos (2005) 20CMU SCSQuad-trees - search?• find cities with (35<x<45, 15<y<25):PGHATLPHL(30,10)3010SW204040,20NE15-826 Copyright: C. Faloutsos (2005) 21CMU SCSQuad-trees - search?• pseudocode:range-query( tree-ptr, range)if (tree-ptr == NULL) exit;if (tree-ptr->point within range){print tree-ptr->point}for each quadrant {if ( range intersects quadrant ) { range-query( tree-ptr->quadrant-ptr, range); }15-826 Copyright: C. Faloutsos (2005) 22CMU SCSQuad-trees - k-nn search?• k-nearest neighbor algo - more complicated:– find ‘good’ neighbors and put them in a stack– go to the most promising quadrant, and update thestack of neighbors– until we hit the leaves15-826 Copyright: C. Faloutsos (2005) 23CMU SCSQuad-trees - discussion• great for 2- and 3-d spaces• several variations, like fixed decomposition:PGHATLPHLPGHATLPHL‘adaptive’‘fixed’middle15-826 Copyright: C. Faloutsos (2005) 24CMU SCSQuad-trees - discussion• but: unsuitable for higher-d spaces (why?)515-826 Copyright: C. Faloutsos (2005) 25CMU SCSQuad-trees - discussion• but: unsuitable for higher-d spaces (why?)• A: 2^d pointers, per node!• Q: how to solve this problem?• A: k-d-trees!15-826 Copyright: C. Faloutsos (2005) 26CMU SCSIndexing - Detailed outline• primary key indexing• secondary key / multi-key indexing– main memory: quad-trees– main memory: k-d-trees• spatial access methods• text• ...15-826 Copyright: C. Faloutsos (2005) 27CMU SCSk-d-trees• Binary trees, with alternating ‘discriminators’PGHATLPHL(30,10)3010SWquad-tree15-826 Copyright: C. Faloutsos (2005) 28CMU SCSk-d-trees• Binary trees, with alternating ‘discriminators’PGHATLPHL(30,10)3010Wk-d-treeE15-826 Copyright: C. Faloutsos (2005) 29CMU SCSk-d-trees• Binary trees, with alternating ‘discriminators’PGHATLPHL(30,10)3010x<=30x>30ATL15-826 Copyright: C. Faloutsos (2005) 30CMU SCSk-d-trees• Binary trees, with alternating ‘discriminators’PGHATLPHL3010(30,10)x<=30x>30ATL2040(40,20)y<=20y>20PHLxy615-826 Copyright: C. Faloutsos (2005) 31CMU SCSIndexing - Detailed outline• primary key indexing• secondary key / multi-key indexing– main memory: quad-trees– main memory: k-d-trees• insertion; deletion• range query; k-nn query• spatial access methods• text• ...15-826 Copyright: C. Faloutsos (2005) 32CMU SCSk-d-trees - insertion• Binary trees, with alternating ‘discriminators’PGHATLPHL3010(30,10)x<=30x>30ATL2040(40,20)y<=20y>20PHLxy15-826 Copyright: C. Faloutsos (2005) 33CMU SCSk-d-trees - insertion• discriminators: may cycle, or ....• Q: which should we put first?PGHATLPHL3010(30,10)x<=30x>30ATL2040(40,20)y<=20y>20PHLxy15-826 Copyright: C. Faloutsos (2005) 34CMU SCSk-d-trees - deletion• Tricky! ‘delete-and-promote’ (or ‘mark asdeleted’ )PGHATLPHL3010(30,10)x<=30x>30ATL2040(40,20)y<=20y>20PHLxy15-826 Copyright: C. Faloutsos (2005) 35CMU SCSk-d-trees - range queryPGHATLPHL3010(30,10)x<=30x>30ATL2040(40,20)y<=20y>20PHL15-826 Copyright: C. Faloutsos (2005) 36CMU SCSk-d-trees - range query• similar to quad-trees: check the root;


View Full Document

CMU CS 15826 - Lecture

Download Lecture
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?