WMU CS 6260 - Integrated Fault Tolerant Techniques Using Parallel Processing - D1820846

Home> Schools> Western Michigan University> Computer Science (CS) > CS 6260> Integrated Fault Tolerant Techniques Using Parallel Processing

DOC PREVIEW

WMU CS 6260 - Integrated Fault Tolerant Techniques Using Parallel Processing

School name Western Michigan University

Course Cs 6260- Advanced Parallel Computations

Pages 49

This preview shows page 1-2-3-23-24-25-26-47-48-49 out of 49 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Integrated Fault Tolerant Techniques Using Parallel ProcessingSlide 2IntroductionIntegrated Fault -Tolerant ( IFT)TechniquesIntegrated Fault -Tolerant TechniquesSlide 6Slide 7Slide 8Slide 9Slide 10Dynamic Group Maximum matching (DGMM) AlgorithmSystem ModelSlide 13Dynamic Group Maximum matching (DGMM) algorithmDynamic Group Maximum matching (DGMM) algorithm specificationSlide 16Dynamic Group Maximum matching (DGMM) AlgorithmDGMM ExampleSlide 19Slide 20The Integrated Fault-Tolerant First-Come, First-Served (FCFS) scheduling algorithmSlide 22Slide 23Slide 24Slide 25InFCFS ExampleSlide 27Slide 28Slide 29Slide 30Slide 31Slide 32The Integrated Fault-Tolerant FCFS + Smallest Fit First scheduling algorithmSlide 34Slide 35Slide 36Slide 37Simulation ModelSlide 39Simulation ResultFCFS performanceSlide 42Slide 43Slide 44Slide 45FCFSSFF performanceSlide 47Slide 48Presentation Question1 Nasser AlsaediThe ultimate goal for any computer system design are reliable execution of task and on time delivery of service. To increase system reliability we need fault tolerant and to increase system performance we need parallel processing. My presentation talk about Integrated fault tolerant techniques that tolerate hardware and software fault in parallel computers.Introduction The proposed IFT Techniques is devised for reliable execution of tasks and concurrent on-line system-level fault diagnosis, where both hardware (processors and communication channels) and software are subjected to failure. 4For reliable execution of tasks, different program versions of each task are assigned to a group of processors. Processors are grouped using the DGMM algorithm. A task is released if at least (th + 1) processors agree with each other on the outputs for at least (ts + 1) different program versions and the outputs of all the program versions are the same, 5The proposed work High Reliability Approach: IFT considers the system as a whole, an integration of hardware and software. Here, both hardware failures and software failures are considered in contrast to the most of the existing works that have assumed that only one of them, not both, could be faulty. High Performance Approach: In contrast to most of the existing works that have focused mainly on improving the system reliability and have used system resources lavishly, IFT attempt to maximize the performance concurrently. 6The list concerns for High Reliability and Performance Approach1) Since every system is fault-free most of the time, allocating a task Ti to (2thi + 1) processors to tolerate thi hardware faults, as is done in some of the existing works, is a waste of the system resources. Instead, we allocate initially (thi + 1) processors to the task Ti, which is minimal for tolerating thi hardware faults, and in case of failures we add more processors as needed. 2) A similar procedure is used for tolerating software failures. It is important to realize that software is fault-free most of the time as well. 73) Dynamic Group Maximum Matching (DGMM) algorithm for grouping the system graph. The DGMM algorithm always attempts to maximize the system performance by increasing the number of concurrent tasks in the system( parallel processing). 84) On- Line Fault Diagnoses: In IFT, faults will be diagnosed by running user programs, in contrast to some of the existing works that require running diagnostic programs. By implementing an on-line fault diagnosis, the system will be continuously executing useful application programs instead of executing diagnostic programs for failure detection which add extra overhead and may not providing 100% fault coverage. 9Each task has hardware reliability degree th where th denotes the upper bound for the number of faulty processors and communication channels the system can tolerate with respect to the task Ti  Each task has software reliability degree ts where ts denotes the upper bound for the number of faulty program versions (software reliability degree) that the system can tolerate with respect to a task Ti .10The function of DGMM algorithm is finding group of connected processors and assign these processors to the task. And maximize System performance.For Example, if the task hardware reliability degree th =2DGMM attempts to find group g of connected processors. where g = th + 1 = 2+1=3.11A system is modeled by a graph G ( N, E), where N and E are the nodes set and the edge set of the graph G respectively. A node represents a processor with its local memory while edge represents a communication channel between two neighboring processors.A Task Ti finish execution if there are thi +1 processors agree with each other on tsi +1 program versionsThe proposed ( DGMM) algorithm is a generalization of the group maximum matching concept. In this generalization, the system is partitioned into disjoint groups with different sizes dynamically. At the same time the DGMM algorithm attempts to minimize the time needed to release the correct outputs and maximize the on-line faults diagnoses capabilities. This is achieved by trying to increase the group connectivity. 14Algorithm 1. If | Gi | = 0 then (a) Find a free processor Pj with the lowest degree in the system graph G. In case of a tie, choose a processor randomly. (b) If such a processor Pj exists then i. Gi = Pj. /* add the processor Pj to the group Gi of the task Ti */ ii. Delete the processor Pj with all edges incident to it from the system graph G. 2. While (system graph G is non-empty) and ( | Gi | < gi) and (Gi has free neighboring processors) do 15(a) Find a neighboring processor Pj with the lowest degree among the neighboring group Gi of the task Ti. In case of a tie, choose a neighboring processor with the highest number of links connected to the processors already in the group Gi (b) Gi = Gi + Pj. /* add the processor Pj to the group Gi of the task Ti */ (c) Delete the processor Pj with all edges incident to it from the system graph 16Example Consider a binary 3-cube system shown . Assume that a task T1 with a group size of g1 = 3 is scheduled for execution. Then a task T2 with a group size of g2 = 2. Then a task T3 with a group size of g3 = 5. 17In this section I am going to introduce two integrated fault-tolerant

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-23-24-25-26-47-48-49 out of 49 pages.

WMU CS 6260 - Integrated Fault Tolerant Techniques Using Parallel Processing

Sign up for free to view:

Please select your school