View Full Document

Dynamic Resource-Critical Workflow Scheduling in Heterogeneous Environments



View the full content.
View Full Document
View Full Document

12 views

Unformatted text preview:

Dynamic Resource Critical Workflow Scheduling in Heterogeneous Environments Yili Gong Marlon E Pierce and Geoffrey C Fox Computer School Wuhan University Wuhan HuBei P R China 430079 Email yiligong whu edu cn Community Grids Lab Indiana University Bloomington IN 47404 Email mpierce cs indiana edu Community Grids Lab Department of Computer Science School of Informatics Indiana University Bloomington IN 47404 Email gcf indiana edu Abstract Effective workflow scheduling is a key but challenging issue in heterogeneous environments due to the heterogeneity and dynamism Based on the observations that not all tasks can run on all resources and acquired data transferring and queuing for a resource can be concurrent we propose a dynamic resource critical workflow scheduling algorithm which take into consideration the environmental heterogeneity and dynamism We evaluate its performance by simulations and show that it outperforms other algorithms Index Terms Dynamic Scheduling Resource Critical Scheduling Workflow Heterogeneous Environments I I NTRODUCTION Heterogeneous distributed systems are widely deployed for executing computation and or data intensive parallel applications especially scientific workflows 1 A workflow is a set of ordered tasks that are linked by logic or data dependencies and a workflow management system is employed to define manage and execute these workflow applications 2 The efficient execution of workflows in this kind of environments requires an effective scheduling strategy which decides when and which resources the tasks in a workflow should be submitted to and run on The environment includes both heterogeneous resource and policy The software installation and configuration on resources are different as well as their physical computing capabilities On the other side the administration policies such as access control policies are autonomous and diverse The dynamism means that the resource status e g load waiting time in the queue availability etc changes over time and often uncontrollable Thus the environment requires that the workflow scheduling take into consideration both heterogeneity and dynamism which make the problem very unique and challenging Concerning heterogeneity we find that in practice due to access control policy software version incompatibility or special hardware requirement it is common that some tasks can not run on certain resources With this observation the tasks which can run on every resource are more flexible for scheduling than the resource critical ones which can only run on just a few resources For a resource critical task considering the more resource flexible tasks before and after it as a group when scheduling should be better than scheduling them individually This is the key idea of our resource critical algorithms In terms of the timing of scheduling there are two categories of workflow scheduling approaches static scheduling and dynamic scheduling A static scheduling system makes a schedule before the workflow starts to run based on available resource and environment information while a dynamic scheduling approach schedule a workflow realtime The static approach is comparatively simpler and easier to implement However its performance heavily relies on the accuracy of the resource and environment information Unfortunately it is difficult to precisely predict this information due to resource autonomy and free will user behavior To make full advantage of the known and predicted information as well as to adapt to dynamics of environment dynamic scheduling is introduced After initially scheduling the schedule can be re assigned according to the hitherto workflow execution progress and resource status at runtime Thus we use the resource critical mapping algorithm as a base when resource status changes we using the base algorithm to reschedule the unfinished part of a workflow With respect to the architecture of a scheduling system it could be either centralized or distributed In a centralized workflow scheduling system all the scheduling is fulfilled by a central scheduler While in a decentralized scheduling system there are many distributed brokers The cooperation among the brokers is a tough problem and makes the system complicated Since generally speaking the calculation overhead of a dynamic scheduling algorithm is far less than the execution cost of a workflow we still prefer a centralized approach Analyzing the makespan of a workflow it can be seen that it is composed of tasks execution time data transferring time and waiting time in resource queues To reduce any of these three items is beyond the reach of a workflow management system but it is possible that the data transferring time and the waiting time can be concurrent i e a task can be inserted into a resource waiting queue though its required data are not transferred to the resource yet As long as the data are available when the task can actually get to use the resource it works This is also a principal distinction between our work and other existing work In this paper our main contributions include that we propose a dynamic resource critical workflow scheduling approach and prove that it outperforms other approaches by simulations The rest of the paper is organized as follows The related work is discussed in Section II In Section III the proposed dynamic resource critical workflow scheduling algorithm is described We elaborate the design of experiments and evaluate the performance of our algorithm in Section IV The conclusion is shown in Section V II R ELATED W ORK Extensive work has been done in the field of workflow scheduling in distributed environments The key differentiators of our work in this paper from the related work lies in that 1 we do not assume that a task can run on all resources which greatly extends the meaning of heterogeneity 2 we assume that the data transferring and the waiting time for resources can be concurrent HEFT Heterogeneous Earlier Finish Time 3 is one of the most popular static heuristic and proven to be superior to other alternatives Thus we select it as a base algorithm for comparison In 4 Yu et al proposes a HEFT based adaptive rescheduling algorithm AHEFT It assumes the accuracy of estimation i e communication and computation cost is estimated accurate and task starts and finishes punctually as predicted In contrast our proposed algorithm DRCS does not assume this On the other side in the AHEFT algorithm a task can not start without all


Access the best Study Guides, Lecture Notes and Practice Exams

Loading Unlocking...
Login

Join to view Dynamic Resource-Critical Workflow Scheduling in Heterogeneous Environments and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Dynamic Resource-Critical Workflow Scheduling in Heterogeneous Environments and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?