Introduction to Database Systems
CSE 444, Winter 2011
Lecture 24: Parallel Databases
http://www.cs.washington.edu/education/courses/cse444/11wi/
Version March 4, 2011

Overview
- Parallel architectures and operators: Ch. 20.1
- Map-reduce: Ch. 20.2
- Semijoin reductions, full reducers: Ch. 20.4 (we covered this a few lectures ago)

Parallel vs. Distributed Databases
- Parallel database system: improves performance through a parallel implementation
- Distributed database system: data is stored across several sites, each site managed by a DBMS capable of running independently

Parallel DBMSs
- Goal: improve performance by executing multiple operations in parallel
- Key benefit: cheaper to scale than relying on a single, increasingly more powerful processor
- Key challenge: ensure that overhead and contention do not kill performance

Performance Metrics for Parallel DBMSs
- Speedup
  - More processors → higher speed
  - Individual queries should run faster
  - Should do more transactions per second (TPS)
  - Fixed overall problem size, varying # of processors ("strong scaling")
- Scaleup
  - More processors → can process more data
  - Fixed problem size per processor, varying # of processors ("weak scaling")
  - Batch scaleup: the same query on larger input data should take the same time
  - Transaction scaleup: N times as many TPS on an N-times larger database, but each transaction typically remains small

Linear vs. Non-linear Speedup
[Figure: speedup as a function of the number of processors P; linear speedup is the ideal]
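The two metrics can be made concrete with a small numeric sketch. The timings below are made up for illustration and are not from the course:

```python
# Illustrating speedup and batch scaleup with hypothetical timings.

def speedup(t_one_proc, t_p_procs):
    """Speedup = time on 1 processor / time on P processors (same problem size)."""
    return t_one_proc / t_p_procs

def batch_scaleup(t_small, t_large):
    """Batch scaleup is ideal (1.0) when P-times larger data on P processors
    takes the same time as the original data on one processor."""
    return t_small / t_large

# Hypothetical: a query takes 100 s on 1 processor and 26 s on 4 processors.
print(speedup(100, 26))         # ~3.85, slightly below the linear ideal of 4
# Hypothetical: 4x the data on 4 processors takes 110 s vs. 100 s originally.
print(batch_scaleup(100, 110))  # ~0.91, slightly below the ideal of 1.0
```

Startup cost, interference, and skew (discussed next) are what push real systems below these linear ideals.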
Linear vs. Non-linear Scaleup
[Figure: batch scaleup as a function of # of processors P and data size (×1, ×5, ×10, ×15); linear scaleup is the ideal]

Challenges to Linear Speedup and Scaleup
- Startup cost: cost of starting an operation on many processors
- Interference: contention for resources between processors
- Skew: the slowest processor becomes the bottleneck

Architectures for Parallel Databases
- Shared memory
- Shared disk
- Shared nothing

Shared Memory
[Figure: processors P connected by an interconnection network to a global shared memory and disks D]

Shared Disk
[Figure: processors P, each with a private memory M, connected by an interconnection network to shared disks D]

Shared Nothing
[Figure: processors P, each with a private memory M and private disks D, connected by an interconnection network]

Shared Nothing
- Most scalable architecture: minimizes interference by minimizing resource sharing
- Can use commodity hardware
- Also the most difficult to program and manage
- Processor = server = node; "processor" != core; P = number of nodes
- We will focus on shared nothing

Taxonomy for Parallel Query Evaluation
Important question: what exactly can we actually parallelize in a parallel database?
- Inter-query parallelism: each query runs on one processor
- Inter-operator parallelism: a query runs on multiple processors; an operator runs on one processor
- Intra-operator parallelism: an operator runs on multiple processors

Horizontal Data Partitioning
- Relation R is split into P chunks R0, …, RP-1, stored at the P nodes
- Three ways to horizontally partition a relation:
  - Round robin: tuple ti goes to chunk (i mod P)
  - Hash-based partitioning on attribute A: tuple t goes to chunk h(t.A) mod P
  - Range-based partitioning on attribute A: tuple t goes to chunk i if vi-1 < t.A < vi

Horizontal Data Partitioning
- All three choices are just special cases: for each tuple, compute bin = f(t)
- Different properties of the function f determine hash vs. range vs. round robin vs. anything else

Parallel Selection
- Compute σA=v(R), or σv1<A<v2(R)
- On a conventional database: cost = B(R) (cost here ~ time, in the worst case)
- Q: What is the cost on a parallel database with P processors?
  - Round robin
  - Hash partitioned
  - Range partitioned
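The three horizontal partitioning schemes can be sketched as functions from a tuple to a chunk number in [0, P-1]. This is an illustrative sketch, not code from the course; the attribute name, range boundaries, and the use of Python's built-in `hash()` as a stand-in for h are assumptions:

```python
# Three ways to assign a tuple to one of P chunks.

P = 4  # number of nodes (hypothetical)

def round_robin(i, t):
    # Tuple t_i goes to chunk (i mod P); ignores the tuple's contents.
    return i % P

def hash_partition(t, attr="A"):
    # Tuple t goes to chunk h(t.A) mod P. Python's hash() stands in for h;
    # a real system would use a stable, explicit hash function.
    return hash(t[attr]) % P

def range_partition(t, boundaries=(10, 20, 30), attr="A"):
    # Tuple t goes to chunk i if v_{i-1} < t.A <= v_i, for hypothetical
    # boundaries v1=10, v2=20, v3=30.
    for i, v in enumerate(boundaries):
        if t[attr] <= v:
            return i
    return len(boundaries)

t = {"A": 17}
print(round_robin(5, t))   # chunk 1
print(range_partition(t))  # chunk 1, since 10 < 17 <= 20
```

All three are indeed special cases of "compute bin = f(t)": f ignores t for round robin, hashes one attribute for hash partitioning, and compares one attribute against boundaries for range partitioning.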
Parallel Selection: Cost
- Q: What is the cost on a parallel database with P processors? (cost here ~ time, in the worst case)
- σ: B(R) / P in all cases
- However, different processors do the work:
  - Round robin: all servers do the work
  - Hash: one server for σA=v(R), all servers for σv1<A<v2(R)
  - Range: one server only

Data Partitioning Revisited
What are the pros and cons?
- Round robin: good load balance, but always needs to read all the data
- Hash-based partitioning: good load balance, but works only for equality predicates and full scans
- Range-based partitioning: works well for range predicates, but can suffer from data skew

Parallel Group By: γA, sum(B)(R)
- Step 1: server i partitions chunk Ri using a hash function h(t.A) mod P: Ri0, Ri1, …, Ri,P-1
- Step 2: server i sends partition Rij to server j
- Step 3: server j computes γA, sum(B) on R0j, R1j, …, RP-1,j

Parallel Group By: γA, sum(B)(R): Cost
- Recall that the conventional cost = 3 B(R)
- Cost of Step 1: B(R)/P I/O operations
- Cost of Step 2: (P-1)/P B(R) blocks are sent, at no cost (network costs are assumed to be much lower than I/O costs)
- Cost of Step 3: 2 B(R)/P. Why? When can we reduce it to 0?
- Total = 3 B(R) / P (communication costs ignored)

Parallel Group By: γA, sum(B)(R)
Can we do better? Sum? Count? Avg? Max? Median?

Parallel Group By: γA, sum(B)(R)
- Sum(B) = Sum(B0) + Sum(B1) + … + Sum(Bn)  (distributive)
- Count(B) = Count(B0) + Count(B1) + … + Count(Bn)  (distributive)
- Max(B) = Max(Max(B0), Max(B1), …, Max(Bn))  (distributive)
- Avg(B) = Sum(B) / Count(B)  (algebraic)
- Median(B) = ?  (holistic)
- The names "distributive" (a bad name), "algebraic", and "holistic" are from [Gray et al.: DataCube, ICDE 1996]

Parallel Join: R ⋈A=B S
- Step 1:
  - For all servers i in [0,k], server i partitions chunk Ri using a hash function h(t.A) mod P: Ri0, Ri1, …, Ri,P-1
  - For all servers j in [k+1,P], server j partitions chunk Sj using a hash function h(t.B) mod P: Sj0, Sj1, …, Sj,P-1
- Step 2:
  - Server i sends partition Riu to server u
  - Server j sends partition Sju to server u
- Step 3: server u computes the join of the partitions Riu and Sju it received
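The three-step repartitioned join can be simulated in a single process, with in-memory lists standing in for the network. This is a sketch under assumed inputs (small relations of (key, payload) tuples), not the course's implementation:

```python
# Minimal simulation of a hash-partitioned parallel join.
# Tuples of R and S with equal join-key hashes meet at the same server,
# which then joins only the tuples it received.
from collections import defaultdict

P = 3  # number of servers (hypothetical)

def h(key):
    return key % P  # stand-in for the hash function h

def parallel_hash_join(r_chunks, s_chunks):
    # Steps 1 + 2: each server partitions its chunk by h(join key) and
    # "sends" partition u to server u (here: appends to server u's list).
    r_at, s_at = defaultdict(list), defaultdict(list)
    for chunk in r_chunks:            # chunks of R, tuples (a, payload)
        for t in chunk:
            r_at[h(t[0])].append(t)
    for chunk in s_chunks:            # chunks of S, tuples (b, payload)
        for t in chunk:
            s_at[h(t[0])].append(t)
    # Step 3: server u joins the R tuples and S tuples it received.
    out = []
    for u in range(P):
        for (a, x) in r_at[u]:
            for (b, y) in s_at[u]:
                if a == b:
                    out.append((a, x, y))
    return sorted(out)

r = [[(1, "r1"), (4, "r4")], [(2, "r2")]]
s = [[(1, "s1")], [(4, "s4"), (5, "s5")]]
print(parallel_hash_join(r, s))  # [(1, 'r1', 's1'), (4, 'r4', 's4')]
```

The key point the simulation makes visible: after repartitioning, no server ever needs to see another server's non-matching tuples, which is why each join in Step 3 is local.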
Parallel Join: R ⋈A=B S: Cost
- Step 1: (B(R) + B(S)) / P
- Step 2: 0
  - (P-1)/P (B(R) + B(S)) blocks are sent, but we assume network costs to be << disk I/O costs
- Step 3:
  - 0 if the smaller table fits in main memory: B(S)/P <= M
  - 4 (B(R) + B(S)) / P otherwise
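The per-server cost formula above can be plugged into code as a sanity check. The block counts, server count, and memory size below are hypothetical:

```python
# Per-server I/O cost of the hash-partitioned parallel join,
# following the three-step cost breakdown (network cost ignored).

def parallel_join_cost(B_R, B_S, P, M):
    """B_R, B_S: blocks of R and S; P: servers; M: memory blocks per server."""
    step1 = (B_R + B_S) / P            # read and hash-partition both relations
    step2 = 0                          # shipping blocks: assumed << disk I/O
    if B_S / P <= M:                   # smaller relation's partition fits in memory
        step3 = 0                      # join on the fly as tuples arrive
    else:
        step3 = 4 * (B_R + B_S) / P    # two-pass join of the received partitions
    return step1 + step2 + step3

# Hypothetical: B(R)=1000, B(S)=200, P=10 servers, M=50 blocks each.
print(parallel_join_cost(1000, 200, 10, 50))  # 120.0 (S's partitions fit in memory)
# Same sizes, but only M=10 blocks of memory per server.
print(parallel_join_cost(1000, 200, 10, 10))  # 600.0 (120 + 480 for the two-pass join)
```

Note how the memory threshold dominates the outcome: once B(S)/P fits in M, the join costs a single parallel scan of both relations.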