Introduction to Database Systems CSE 444, Winter 2011 Lecture 20: Operator Algorithms Version(March(15,(2011(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Where we are / and where we go 2(Why Learn About Operator Algorithms? Implemented(in(commercial(DBMSs( DBMSs(implement(different(subsets(of(known(algorithms( Good(algorithms(can(greatly(improve(performance( Need(to(know(about(physical(operators(to(understand(query(op<miza<on(3(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Cost Parameters In(database(systems(the(data(is(on(disk( Cost%=%total%number%of%I/Os( Parameters:( B(R)(=(#(of(blocks((i.e.,(pages)(for(rela<on(R( T(R)(=(#(of(tuples(in(rela<on(R( V(R,%a)(=(#(of(dis<nct(values(of(a2ribute(a( When(a(is(a(key,(V(R,a)(=(T(R)( When(a(is(not(a(key,(V(R,a)(can(be(anything(<(T(R)( M(=(#(of(max.(pages(in(main(memory(4(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Cost Cost(of(an(opera<on(=(number(of(disk(I/Os(to( Read(the(operands( Compute(the(result( Cost(of(wri<ng(the(result(to(disk(is(not$included( Need(to(count(it(separately(when(applicable(5(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Cost of Scanning a Table Result(may(be(unsorted:(( (B(R)( Result(needs(to(be(sorted:(3(B(R)( We(will(discuss(sor<ng(later(6(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Outline for Today Join(operator(algorithms( One]pass(algorithms((Sec.(15.2(and(15.3)( Index]based(algorithms((Sec(15.6)( Two]pass(algorithms((Sec(15.4(and(15.5)(Note(about(readings:(( In(class,(we(will(discuss(only(join(operator(algorithms(((because(other(operators(are(easier)( Read(the(book(to(get(more(details(about(these(algos(and(about(algots(for(other(operators(7(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Basic Join Algorithms Logical(operator:( Product(pname,(cname)(⋈(Company(cname,(city)( Propose(three(physical(operators(for(the(join,(assuming(the(tables(are(in(main(memory:( Hash(join( Nested(loop(join( Sort]merge(join(8(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((1. Hash Join Hash(join:( Scan(R,(build(buckets(in(main(memory( Then(scan(S(and(join( Cost:(B(R)(+(B(S)( One]pass(algorithm(when(B(R)(≤(M( By(“one(pass”,(we(mean(that(the(operator(reads(its(operands(only(once.(It(does(not(write(intermediate(results(back(to(disk.(9(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((R(⋈(S(1. Hash Join Example 10(Patient Insurance 1( ‘Bob’( ‘Sea2le’(2( ‘Ela’( ‘Evere2’(3( ‘Jill’( ‘Kent’(4( ‘Joe’( ‘Sea2le’(Pa<ent%2( ‘Blue’( 123(4( ‘Prem’( 432(Insurance%4( ‘Prem’( 343(3( ‘GrpH’( 554(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Patient(pid, name, address) Insurance(pid, provider, policy_nb) Two(tuples((per(page(Disk(...% ...%1. Hash Join Example 11(1(2(3( 4(Pa<ent(Insurance(8( 5(9(6(Disk(Memory(M(=(21(pages(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Patient Insurance 2( 4(4( 3(2( 8(8( 9(6( 6(1( 3(Showing(only(pid;(note(a(page(contains(2(tuples(1. Hash Join Example 12(Step(1:(Scan(Pa<ent(and(create(hash(table(in(memory(1(2(3( 4(Pa<ent(2( 4(Insurance(4( 3(8( 5(9(6(2( 8(8( 9(6( 6(1( 3(Disk(Memory(M(=(21(pages(Hash(h:(pid(%(5(Input(buffer(1( 2( 4(3( 9(6( 8(5(1(2(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Patient Insurance1. Hash Join Example 13(Step(2:(Scan(Insurance(and(probe(into(hash(table(1(2(3( 4(Pa<ent(Insurance(8( 5(9(6(Disk(Memory(M(=(21(pages(Hash(h:(pid(%(5(Input(buffer(1( 2( 4(3( 9(6( 8(5(1(2(2( 4(Output(buffer(2( 2(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Patient Insurance 2( 4(4( 3(2( 8(8( 9(6( 6(1( 3(Step(1:(Scan(Pa<ent(and(create(hash(table(in(memory(write(to(disk(1. Hash Join Example 14(1(2(3( 4(Pa<ent(Insurance(8( 5(9(6(Disk(Memory(M(=(21(pages(Hash(h:(pid(%(5(Input(buffer(1( 2( 4(3( 9(6( 8(5(1(2(2( 4(Output(buffer(4( 4(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Patient Insurance 2( 4(4( 3(2( 8(8( 9(6( 6(1( 3(Step(2:(Scan(Insurance(and(probe(into(hash(table(Step(1:(Scan(Pa<ent(and(create(hash(table(in(memory(1. Hash Join Example 15(1(2(3( 4(Pa<ent(Insurance(8( 5(9(6(Disk(Memory(M(=(21(pages(Hash(h:(pid(%(5(Input(buffer(1( 2( 4(3( 9(6( 8(5(1(2(4( 3(Output(buffer(4( 4(Keep(going(un<l(read(all(of(Insurance(Cost:(B(R)(+(B(S)(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Patient Insurance 2( 4(4( 3(2( 8(8( 9(6( 6(1( 3(Step(2:(Scan(Insurance(and(probe(into(hash(table(Step(1:(Scan(Pa<ent(and(create(hash(table(in(memory(1. Hash Join Details 16(Open(()({((H(=(newHashTable(();((R.Open(();((x(=(R.GetNext(();((while((x(!=(null)({((( (H.insert(x);((( (x(=(R.GetNext(();(((}((R.Close(();((S.Open(();((buffer(=([(];(}(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((x1. Hash Join Details 17(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((GetNext(()({((while((buffer(==([(])({(((((((((x(=(S.GetNext(();(((((((((if((x==Null)(return(NULL;(((((((((buffer(=(H.find(x);(}(z(=(buffer.first(();(buffer(=(buffer.rest(();(return(z;(}(x1. Hash Join Details 18(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((Close(()({((release(memory((H,(buffer,(etc.);((S.Close(()(}(x2. Nested Loop Joins Tuple]based(nested(loop( R(is(the(outer(rela<on,(S(is(the(inner(rela<on(((( Cost:(B(R)(+(T(R)(B(S)( One]pass(only(over(outer(rela<on( But(S(is(read(many(<mes(19(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((for each tuple r in R do for each tuple s in S do if r and s join then output (r,s) R(⋈(S(x2. Page-at-a-time Refinement Cost:(B(R)(+(B(R)(B(S)(20(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((for each page of tuples r in R do for each page of tuples s in S do for all pairs of tuples if r and s join then output (r,s) T(R)(1(2(2. Nested Loop Example 21(Input(buffer(for(Pa<ent(Output(buffer(2( 2(Input(buffer(for(Insurance(2( 4(h2p://www.cs.washington.edu/educa<on/courses/cse444/11wi/((((1(2(3( 4(Pa<ent(Insurance(8( 5(9(6(Disk(2( 4(4( 3(2( 8(8( 9(6( 6(1( 3(Patient Insurance2. Nested Loop Example 22(Input(buffer(for(Pa<ent(1(2(Output(buffer(Input(buffer(for(Insurance(4(
View Full Document