111/03/2005Effective Distributed Effective Distributed Scheduling for Parallel Scheduling for Parallel WorkloadsWorkloadsAndrea Andrea C.DusseauC.Dusseau, , RemziRemziH.ArpaciH.Arpaci, , and David and David E.CullerE.CullerPresented BySankalita Saha11/03/2005OutlineOutlineIntroductionIntroductionMethodologyMethodologyImmediate BlockingImmediate BlockingStatic BlockingStatic BlockingDynamic BlockingDynamic BlockingSensitivity to local schedulerSensitivity to local schedulerConclusionsConclusions211/03/2005IntroductionIntroductionCoschedulingCoschedulingProcesses of a parallel job run at the same time across Processes of a parallel job run at the same time across processorsprocessorsProcessors change job by an externally controlled context Processors change job by an externally controlled context switch occurring simultaneously across all machinesswitch occurring simultaneously across all machinesAdvantagesAdvantagesJob appears to run on one dedicated machineJob appears to run on one dedicated machineDisadvantagesDisadvantagesFaultFault--tolerant and scalable tolerant and scalable coschedulingcoschedulingis hard to design and is hard to design and implement implement Ignores needs of mixed workloads containing I/O intensive or Ignores needs of mixed workloads containing I/O intensive or interactive jobsinteractive jobsBusyBusy--waiting during I/O wastes cycles and reduces throughputwaiting during I/O wastes cycles and reduces throughput11/03/2005IntroductionIntroductionImplicit SchedulingImplicit SchedulingEach local scheduler in a distributed system makes Each local scheduler in a distributed system makes independent decisions that have the bulk effect of independent decisions that have the bulk effect of coordinating the scheduling of cooperating coordinating the scheduling of cooperating processes across processors processes across processors Local scheduling exists on every machine and so no Local scheduling exists on every machine and so no additional implementation is requiredadditional implementation is requiredEach scheduler runs independent of others, hence it is not Each scheduler runs independent of others, hence it is not affected by the failure of othersaffected by the failure of othersTimeTime--sharing, priority based schedulers are tuned for sharing, priority based schedulers are tuned for interactive and I/O intensive processes interactive and I/O intensive processes311/03/2005IntroductionIntroductionImplicit schedulingImplicit schedulingPoor performance of local scheduling occurs due to Poor performance of local scheduling occurs due to lack of simultaneous scheduling across cooperative lack of simultaneous scheduling across cooperative processesprocessesHence communicating processes must be Hence communicating processes must be dynamically identified and coordinated dynamically identified and coordinated ––twotwo--phase phase blockingblockingAuthors show implicit scheduling performance is Authors show implicit scheduling performance is near that of near that of coschedulingcoschedulingwithout requiring global without requiring global explicit coordination explicit coordination 11/03/2005Methodology: Programming ModelMethodology: Programming ModelBulkBulk--Synchronous SPMD (SingleSynchronous SPMD (Single--program program multiplemultiple--data)data)Computation granularity (g)Variation (v)Local computation (c)Local Computation = g +/- v/2http://now.cs.berkeley.edu/Implicit411/03/2005Methodology: Simulation ParametersMethodology: Simulation Parameters3 different communication patterns are 3 different communication patterns are investigatedinvestigatedBARRIERBARRIERNo communicationNo communicationNEWSNEWSGrid communication patternGrid communication patternEach process depend upon its 4 neighborsEach process depend upon its 4 neighborsTRANSPOSETRANSPOSEP read phasesP read phasesOn On iiththread process read process ppreads data from process reads data from process ((p+ip+i) mod P) mod P11/03/2005Methodology: Local SchedulingMethodology: Local SchedulingLocal scheduler closely matches the scheduler of Local scheduler closely matches the scheduler of Unix System V Release 4 (SVR4)Unix System V Release 4 (SVR4)When a job sleeps on an event, it is placed in When a job sleeps on an event, it is placed in blocked list and is not scheduledblocked list and is not scheduledWhen a message arrives for the blocked job, it is When a message arrives for the blocked job, it is given the highest priority and it handles the messagegiven the highest priority and it handles the messageIf the message unblocks the job, then it is given a If the message unblocks the job, then it is given a new priority and placed on the run queue, else it is new priority and placed on the run queue, else it is returned to the blocked listreturned to the blocked list511/03/2005Methodology: Local SchedulingMethodology: Local SchedulingOther characteristics of the schedulerOther characteristics of the schedulerJobJob’’s priority is lowered if it runs for its time quanta without s priority is lowered if it runs for its time quanta without releasing the processor; jobreleasing the processor; job’’s priority is raised if it sleeps s priority is raised if it sleeps frequentlyfrequentlyA starvation timer runs every second; a jobA starvation timer runs every second; a job’’s priority s priority increases if doesnincreases if doesn’’t completes itt completes it’’s time quanta before the s time quanta before the starvation timer expiresstarvation timer expiresThe individual quanta and starvation timers expire The individual quanta and starvation timers expire independently across processorsindependently across processorsIf multiple parallel jobs arrive at the same time, the processesIf multiple parallel jobs arrive at the same time, the processesare randomly ordered in the local scheduling queues are randomly ordered in the local scheduling queues 11/03/2005Immediate BlockingImmediate BlockingGives superior performance for coarse and Gives superior performance for coarse and mediummedium--grain computations coupled with high grain computations coupled with high loadload--imbalanceimbalanceWhen imbalance is high,
View Full Document