Unformatted text preview:

4/15/11%1%Query%Processing%CISC437/637,%Lecture%#14%Ben%Cartere>e%1%Copyright%©%Ben%Cartere>e%Query%EvaluaEon%• To%be%processed%by%the%DBMS,%SQL%queries%need%to%be%1. Parsed%and%translated%into%a%relaEonal,%computable%expression%2. OpEmized%into%an%access%path%and%evalua,on%plan%that%minimizes%computaEon/processing%Eme%3. Evaluated%against%the%actual%data%2%Copyright%©%Ben%Cartere>e%4/15/11%2%Query%EvaluaEon%• Parsing%and%translaEng:%– Check%syntax,%verify%relaEons%– Translate%into%internal%form%corresponding%to%relaEonal%algebra%expression%• OpEmizaEon:%– Formulate%access%path%tree%from%relaEonal%algebra%expression%and%metadata%– Formulate%evaluaEon%plan%• EvaluaEon:%– Execute%evaluaEon%plan%on%indexes%+%tables%to%produce%result%Copyright%©%Ben%Cartere>e% 3%Goals%for%thi s%S ecEon%• Produce%access%paths%and%evaluaEon%plan%– Including%full%algorithmic%specificaEon%• Evaluate%their%cost%(mostly%in%disk%I/O)%– Number%of%disk%seeks%*%average%seek%cost%– Number%of%pages%read%*%average%page%read%cost%– Number%of%pages%wri>en%*%average%page%write%cost%Copyright%©%Ben%Cartere>e% 4%4/15/11%3%Algorithms%for%EvaluaEng%%RelaEonal%Operators%• Many%different%algorithms,%but%three%common%techniques:%– Indexing:%%if%WHERE%clause%(selecEon%or%join),%use%an%index%to%consider%only%a%small%set%of%records%– Itera,on:%%scan%all%records%in%a%table%in%sequence,%or%all%index%data%entries%in%sequence%– Par,,oning:%%use%sorEng%or%hashing%to%split%input%records;%replace%one%expensive%operaEon%with%a%set%of%cheaper%operaEons%on%smaller%inputs%Copyright%©%Ben%Cartere>e% 5%SelecEon%Algorithms%• We%have%discussed%the%cost%of%selecEon%in%the%context%of%indexing%– Scan%cost,%equality%query%cost,%range%query%cost%– SelecEvity,%clustered%vs%unclustered%• That%discussion%applies%to%query%processing%as%well%– Although%now%we%must%calculate%cost%given%a%parEcular%index%type,%rather%than%choosing%an%index%type%to%minimize%cost%• Focus%on%conjuncEons%and%disjuncEons%Copyright%©%Ben%Cartere>e% 10%4/15/11%4%EvaluaEng%ConjuncEonfOnly%SelecEons%• Simple%approach:%– Use%one%index%that%matches%one%of%the%clauses%– Scan%matching%records%for%matches%to%other%clauses%• Possibly%be>er%approach:%– MulEple%indexes%with%record%pointers%– Intersect%sets%of%pointers%before%retrieving%records%Copyright%©%Ben%Cartere>e% 11%SelecEvity%of%Access%Paths%• Selec,vity%is%the%number%of%pages%retrieved%from%disk%by%an%access%path%– Including%index%pages%and%table%pages%• The%most%selec,ve%access%path%is%the%one%that%retrieves%the%fewest%pages%• Each%conjunct%in%the%selecEon%filters%a%table%down%to%just%matching%records;%the%number%of%these%is%the%reduc,on%factor%Copyright%©%Ben%Cartere>e% 12%4/15/11%5%System%Catalog%• The%catalog%stores%metadata%(data%about%data)%for%the%database%– RelaEonal%schema,%informaEon%about%indexes,%informaEon%about%physical%storage%• RelaEon%staEsEcs:%– Ntuples(R),%Npages(R)%• Index%staEsEcs:%– Nkeys(I),%INPages(I),%Iheight(I),%Ilow/Ihigh%• These%can%be%used%to%esEmate%selecEvity%Copyright%©%Ben%Cartere>e% 13%EvaluaEng%SelecEons%with%DisjuncEons%• Analogous%to%conjuncEons:%– Get%record%pointers%from%indexes%– Find%union%of%pointer%sets%before%retrieving%records%• The%catch:%– If%even%one%field%is%not%indexed,%it%will%require%a%full%scan%to%complete%the%union%– It%is%not%usually%wise%to%keep%indexes%on%everything%• DisjuncEons%are%not%easy%to%do%efficiently%Copyright%©%Ben%Cartere>e% 14%4/15/11%6%ProjecEon%Algorithms%• Efficient%computaEon%of%projecEon s%is %onl y%necessary%when%DISTINCT%is%specified%– EliminaEng%duplicates%is%the%hard%part%• Use%parEEoning:%– Sort%records%by%the%projecEon%fields%• SorEng%places%duplicates%next%to%each%other%• Linear%scan%can%be%used%to%eliminate%them%– Or%build%hash%table%on%projecEon%fields%• Duplicates%idenEfied%while%table%is%built%• Not%so%good%if%data%too%large%for%memory%Copyright%©%Ben%Cartere>e% 15%Join%Algorithms%• Joins%are%very%common%but%expensive—efficient%algorithms%are%valuable%• Two%common%algorithms:%– Nestedfloop%join%– Sortfmerge%join%• Simple%unindexed,%uncached%nestedfloop%join%of%R1%and%R2%involves%scanning%R2%for%each%record%in%R1%Copyright%©%Ben%Cartere>e% 16%4/15/11%7%Indexed%NestedfLoop%Join%• If%there’s%an%index%on%the%join%condiEon%in%one%of%the%relaEons,%use%it%to%improve%efficiency%– Specifically,%use%the%indexed%relaEon%in%the%inner%loop%– This%ensures%that%each%record%in%the%inner%relaEon%is%processed%at%most%once%Copyright%©%Ben%Cartere>e% 17%SortfMerge%Join%• Use%idea%of%merge%sort:%– Sort%both%relaEons%by%the%join%condiEon%– Scan%sorted%relaEons%concurrently%– Matching%records%added%to%result%• This%can%be%done%with%no%indexes%• It%can%actually%be%more%efficient%that%an%indexed%nested%loop%– Assuming%sorEng%can%be%done%efficiently%– And%depending%on%selecEvity%Copyright%©%Ben%Cartere>e% 18%4/15/11%8%Complex%Joins%• ConjuncEve%joins:%– A%join%with%many%condiEons%ANDed%together%– Use%nestedfloop,%or%compute%join%on%one%of%the%condiEons%then%scan%remaining%records%• DisjuncEve%joins:%– A%join%with%many%condiEons%ORed%together%– Use%nestedfloop,%or%compute%joins%for%each%condiEon%separately%then%take%set%union%Copyright%©%Ben%Cartere>e% 19%AggregaEon%• Similar%to%duplicate%eliminaEon:%%%– Use%sorEng%on%GROUP%BY%field(s)%• This%puts% all %records%with%that%field%value%adjacent%• Calculate%aggregaEon%funcEon%in%linear%Eme%– Or%hash%on%GROUP%BY%field(s)%• More%efficient%that%sort,%but%requires%more%memory%Copyright%©%Ben%Cartere>e% 20%4/15/11%9%Set%OperaEons%• Also%similar%to%duplicate%eliminaEon%– Sort%on%some%field%• Then%do%something%like%mergefjoin%on%sorted%relaEons%– Hash%on%some%field%• Combine%hash%tables%depending%on%operator%Copyright%©%Ben%Cartere>e% 21%EvaluaEon%Plans%• The%query%parser/translator%turns%a%SQL%query%into%a%relaEonal%algebra%expression%– RelaEonal%algebra%expressions%can%be%visualized%as%trees%with%input%relaEons%at%the%leaves%– EvaluaEon%happens%from%leaves%to%root%•


View Full Document

UD CISC 637 - Query Processing

Download Query Processing
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Query Processing and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Query Processing 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?