Lecture'22:'Parallel'Databases'Wednesday,'May'26,'2010'Dan Suciu -- 444 Spring 2010 1Overview'• Parallel'architectures'an d'op erators:'Ch.'20.1'• MapBreduce:'Ch.'20.2'• Semijoin'reducFons,'full'reducers:'Ch.'20.4'– We'covered'this'a'few'lectures'ago'Dan Suciu -- 444 Spring 2010 2Parallel'v.s.'Distributed'Databases'• Parallel'database'system:'– Improve'performance'through'parallel'implementaFon'• Distributed'database'system:'– Data'is'stored'across'several'sites,'each'site'managed'by'a'DBMS'capable'of'running'independently'Dan Suciu -- 444 Spring 2010 3Parallel'DBMS s'• Goal'– Improve'performance'by'execuFng'mulFple'operaFons'in'parallel'• Key'benefit'– Cheaper'to'scale'than'relying'on'a'single'increasingly'more'powerful'processor'• Key'challenge'– Ensure'overhead'and'contenFon'do'not'kill'performance'Dan Suciu -- 444 Spring 2010 4Performance'Metrics''for'Parallel'DBMSs'• Speedup'– More'processors''higher'speed'– Individual'queries'should'run'faster'– Should'do'more'transacFons'per'second'(TPS)'• Scaleup'– More'processors''can'process'more'data'– Batch'scaleup'• Same'query'on'larger'input'data'should'take'the'same'Fme'– TransacFon'scaleup'• NBFmes'as'many'TPS'on'NBFmes'larger'database'• But'each'transacFon'typically'remains'small'Dan Suciu -- 444 Spring 2010 5Linear'v.s.'NonBlinear'Speedup'Dan Suciu -- 444 Spring 2010 # processors (=P) Speedup 6Linear'v.s.'NonBlinear'Scaleup'# processors (=P) AND data size Batch Scaleup ×1 ×5 ×10 ×15 Dan Suciu -- 444 Spring 2010 7Challenges'to''Linear'Speedup'and'Scaleup'• Startup'cost''– Cost'of' starFng'an'operaFon'on'many'processors'• Interference'– ContenFon'for'resources'between'processors'• Skew'– Slowest'processor'becomes'the'boWleneck'Dan Suciu -- 444 Spring 2010 8Architectures'for'Parallel'Databases'• Shared'memor y'• Shared'disk'• Shared'nothing'Dan Suciu -- 444 Spring 2010 9Shared'Memory'InterconnecFon'Network'P' P' P'Global'Shared'Memory'D' D' D'Dan Suciu -- 444 Spring 2010 10Shared'Disk'InterconnecFon'Network'P' P' P'M' M' M'D' D' D'Dan Suciu -- 444 Spring 2010 11Shared'Nothing'InterconnecFon'Network'P' P' P'M' M' M'D' D' D'Dan Suciu -- 444 Spring 2010 12Shared'Nothing'• Most'scalable'architecture'– Minimizes'interference'by'minimizing'resource'sharing'– Can'use'commodity'hardware'• Also'most'difficult'to'program'and'manage'• Processor'='ser ver'='node'• P'='number'of'nodes'Dan Suciu -- 444 Spring 2010 We'will'focus'on'shared'nothing'13QuesFon'• What'exactly'can' we'parallelize'in'a'parallel'DB'?'Dan Suciu -- 444 Spring 2010 14Taxonomy'for'Parallel'Quer y'EvaluaFon'• InterBquery'parallelism'– Each'query'runs' on'o ne'processor'• InterBoperator'parallelism'– A'query'ru ns'on'mulFple'processors'– An'operator'run s'on'one'processor'• IntraBoperator'parallelism'– An'operator'run s'on'mulFple'processors'Dan Suciu -- 444 Spring 2010 We'study'only'intraBoperator'parallelism:'most'scalable'15Horizontal'Data'ParFFoning'• RelaFon'R'split'into'P'chunks'R0,'…,'RPB1,'stored'at'the'P'nodes'• Round'robin:'tuple'ti'to'chunk'(i 'mod'P)'• Hash'based'parFFoning'on'aWribute'A:'– Tuple't'to'chunk'h(t.A)'mod'P'• Range'based'parFFoning'on'aWribute'A:'– Tuple't'to'chunk'i'if'viB1'<'t.A'<'vi'Dan Suciu -- 444 Spring 2010 16Parallel'Sel ecFon'Compute'σA=v(R),'or'σv1<A<v2(R)'• On'a'convenFonal'database:'cost'='B(R)'• Q:'What'is'th e'cost'on'a'parallel'database'with'P'processors' ?'– Round'robin'– Hash'parFFoned'– Range'parFFoned'Dan Suciu -- 444 Spring 2010 17Parallel'Sel ecFon'• Q:'What'is'th e'cost'on'a'parallel'database'with'P'processors' ?'• A:'B(R)'/'P'in'all'cases'• However,'different'processors'do'the'work:'– Round'robin:'all'servers'do'th e'work'– Hash:'one'ser ver'for'σA=v(R),'all'for'σv1<A<v2(R)'– Range:'one'server'onl y'Dan Suciu -- 444 Spring 2010 18Data'ParFFoning'Revisited'What'are'the'pros'and'cons'?'• Round'robin'– Good'load'balance'but'always'needs'to'read'all'the'data'• Hash'based'parFFoning'– Good'load'balance'but'works'only'for'equality'predicates'and'full'scans'• Range'based'parFFoning'– Works'well'fo r'range'predicates'but'can'suffer'from'data'skew'Dan Suciu -- 444 Spring 2010 19Parallel'Group'By:''γA,'sum(B)(R)'• Step'1:'server'i'parFFons'chunk'Ri'using'a'hash'funcFon'h(t.A)'mod'P:'Ri0,'Ri1,'…,'Ri,PB1'''• Step'2:'server'i'sends'parFFon'Rij'to'serve'j'• Step'3:''server'j'computes'γA,'sum(B)'on''R0j,'R1j,'…,'RPB1,j''Dan Suciu -- 444 Spring 2010 20Cost'of'Parallel'Group'By'Recall' convenFonal'cost'=''3B(R)'• Cost'of'Step'1:''B(R)/P''I/O'operaFons'• Cost'of'Step'2:'(PB1)/P'B(R)'blocks'are'sent'– Network'costs'assumed'to'be'much'lower'than'I/O'• Cost'of'Step'3:'2'B(R)/P'– Why'?'– When'can'we'reduce'it'to'0'?'Total'='3B(R)'/'P''+'communi caFon'costs'Dan Suciu -- 444 Spring 2010 21Parallel'Joi n:''R'⋈A=B'S'• Step'1'– For'all'servers'in'[0,k],'ser ver'i'parFFons'chunk'Ri'using'a'hash'funcFon'h(t.A)'mod'P:'Ri0,'Ri1,'…,'Ri,PB1'''– For'all'servers'in'[k+1,P],'ser ver'j'parFFons'chunk'Sj'using'a'hash'funcFon'h(t.A)'mod'P:'Sj0,'Sj1,'…,'Rj,PB1'''• Step'2:''– Server'i'sends'parFFon'Riu'to'server'u'– Server'j'sends'parFFon'Sju'to'server'u'• Steps'3:'Server'u'computes'the'join'of'Riu'with'Sju'Dan Suciu -- 444 Spring 2010 22Cost'of'Parallel'Join'• Step'1:''(B(R)'+'B(S))/ P'• Step'2:''0'– (PB1)/P'(B(R)'+'B(S))'blocks'are'sent,'but'we'assume'network'costs'to'be'<<'disk'I/O'costs'• Step'3:'– 0'if'smaller'table'fits'i n'mai n'memory:'B(S)/p'<=M'– 2(B(R)+B(S))/P'otherwise'Dan Suciu -- 444 Spring 2010 23Parallel'Dataflow'ImplementaFon'• Use'relaFonal'operators'unchanged''• Add'special'split'and'merge'operators'– Handle'data'rouFng ,'bu ffering,'and'flow'control'• Example:'exchange'operator''– Inserted'between'consecuFve'operators'in'the'query'plan'– Can'act'as'either'a'producer'or'consumer'–
View Full Document