DOC PREVIEW
Dynamic Resource-Critical Workflow Scheduling in Heterogeneous Environments

This preview shows page 1-2-3 out of 8 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 8 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Dynamic Resource-Critical Workflow Scheduling inHeterogeneous EnvironmentsYili Gong∗, Marlon E. Pierce†, and Geoffrey C. Fox‡∗Computer School, Wuhan University, Wuhan, HuBei, P.R.China 430079Email: [email protected]†Community Grids Lab, Indiana University, Bloomington, IN 47404Email: [email protected]‡Community Grids Lab, Department of Computer Science, School of Informatics,Indiana University, Bloomington, IN 47404Email:[email protected]—Effective workflow scheduling is a key but challeng-ing issue in heterogeneous environments due to the heterogeneityand dynamism. Based on the observations that not all taskscan run on all resources and acquired data transferring andqueuing for a resource can be concurrent, we propose a dynamicresource-critical workflow scheduling algorithm which take intoconsideration the environmental heterogeneity and dynamism.We evaluate its performance by simulations and show that itoutperforms other algorithms.Index Terms—Dynamic Scheduling, Resource-CriticalScheduling, Workflow, Heterogeneous Environments.I. INTRODUCTIONHeterogeneous distributed systems are widely deployed forexecuting computation and/or data intensive parallel applica-tions, especially scientific workflows [1]. A workflow is a setof ordered tasks that are linked by logic or data dependenciesand a workflow management system is employed to define,manage and execute these workflow applications [2]. Theefficient execution of workflows in this kind of environmentsrequires an effective scheduling strategy which decides whenand which resources the tasks in a workflow should besubmitted to and run on.The environment includes both heterogeneous resource andpolicy. The software installation and configuration on re-sources are different as well as their physical computingcapabilities. On the other side, the administration policies,such as access control policies, are autonomous and diverse.The dynamism means that the resource status, e.g. load,waiting time in the queue, availability, etc., changes overtime and often uncontrollable. Thus the environment requiresthat the workflow scheduling take into consideration bothheterogeneity and dynamism, which make the problem veryunique and challenging.Concerning heterogeneity, we find that in practice due to ac-cess control policy, software version incompatibility or specialhardware requirement, it is common that some tasks can notrun on certain resources. With this observation, the tasks whichcan run on every resource are more flexible for scheduling thanthe resource-critical ones which can only run on just a fewresources. For a resource-critical task, considering the moreresource-flexible tasks before and after it as a group whenscheduling should be better than scheduling them individually.This is the key idea of our resource-critical algorithms.In terms of the timing of scheduling, there are two categoriesof workflow scheduling approaches: static scheduling anddynamic scheduling. A static scheduling system makes aschedule before the workflow starts to run based on avail-able resource and environment information; while a dynamicscheduling approach schedule a workflow realtime. The staticapproach is comparatively simpler and easier to implement.However, its performance heavily relies on the accuracy ofthe resource and environment information. Unfortunately it isdifficult to precisely predict this information due to resourceautonomy and free will user behavior. To make full advantageof the known and predicted information as well as to adapt todynamics of environment, dynamic scheduling is introduced.After initially scheduling, the schedule can be re-assignedaccording to the hitherto workflow execution progress andresource status at runtime. Thus we use the resource-criticalmapping algorithm as a base, when resource status changes,we using the base algorithm to reschedule the unfinished partof a workflow.With respect to the architecture of a scheduling system,it could be either centralized or distributed. In a centralizedworkflow scheduling system, all the scheduling is fulfilled by acentral scheduler. While in a decentralized scheduling system,there are many distributed brokers. The cooperation amongthe brokers is a tough problem and makes the system compli-cated. Since generally speaking, the calculation overhead ofa dynamic scheduling algorithm is far less than the executioncost of a workflow, we still prefer a centralized approach.Analyzing the makespan of a workflow, it can be seen thatit is composed of tasks’ execution time, data transferring timeand waiting time in resource queues. To reduce any of thesethree items is beyond the reach of a workflow managementsystem, but it is possible that the data transferring time andthe waiting time can be concurrent, i.e. a task can be insertedinto a resource waiting queue though its required data are nottransferred to the resource yet. As long as the data are availablewhen the task can actually get to use the resource, it works.This is also a principal distinction between our work and otherexisting work.In this paper, our main contributions include that we proposea dynamic resource-critical workflow scheduling approach andprove that it outperforms other approaches by simulations.The rest of the paper is organized as follows. The relatedwork is discussed in Section II. In Section III, the proposeddynamic resource-critical workflow scheduling algorithm isdescribed. We elaborate the design of experiments and eval-uate the performance of our algorithm in Section IV. Theconclusion is shown in Section V.II. RELATED WORKExtensive work has been done in the field of workflowscheduling in distributed environments. The key differentiatorsof our work in this paper from the related work lies in that (1)we do not assume that a task can run on all resources, whichgreatly extends the meaning of heterogeneity; (2) we assumethat the data transferring and the waiting time for resourcescan be concurrent.HEFT(Heterogeneous Earlier Finish Time) [3] is one ofthe most popular static heuristic and proven to be superiorto other alternatives. Thus we select it as a base algorithm forcomparison. In [4], Yu et al. proposes a HEFT-based adaptiverescheduling algorithm, AHEFT. It assumes the accuracy of es-timation, i.e. communication and computation cost is estimatedaccurate and task starts and finishes punctually as predicted. Incontrast, our proposed algorithm, DRCS, does not assume this.On the other side, in


Dynamic Resource-Critical Workflow Scheduling in Heterogeneous Environments

Download Dynamic Resource-Critical Workflow Scheduling in Heterogeneous Environments
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Dynamic Resource-Critical Workflow Scheduling in Heterogeneous Environments and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Dynamic Resource-Critical Workflow Scheduling in Heterogeneous Environments 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?