Unformatted text preview:

Parallel and Distributed Processing CSE 8380ContentsWhat is a Model?Why Models?Models in Problem SolvingAn InterfaceModels in this classPRAM ModelThe PRAM model and its variations (cont.)Analysis of AlgorithmsAnalysis of Sequential AlgorithmsAnalysis of parallel algorithmAnalysis of parallel algorithm The NC-class and P-completenessSimulating multiple accesses on an EREW PRAMSimulating multiple accesses on an EREW PRAM (cont.)Parallel AlgorithmsSlide 17Enumeration SortSort-CRCW AssumptionsSort-CRCWAlgorithm DetailsComplexityExample: sort (9, 4, 6)Computer Science and EngineeringParallel and Distributed ProcessingCSE 8380February 3, 2005February 3, 2005Session 7Session 7Computer Science and EngineeringContentsAbstract ModelsPRAM ModelComplexity AnalysisIntroduction to Parallel AlgorithmsSortingComputer Science and EngineeringWhat is a Model?According to Webster’s Dictionary, a model is “a description or analogy used to help visualize something that cannot be directly observed.”According to The Oxford English Dictionary, a model is “a simplified or idealized description or conception of a particular system, situation or process.”Computer Science and EngineeringWhy Models?In general, the purpose of Modeling is to capture the salient characteristics of phenomena with clarity and the right degree of accuracy to facilitate analysis and prediction.Megg, Matheson and Tarjan (1995)Computer Science and EngineeringModels in Problem SolvingComputer Scientists use models to help design problem solving tools such as:Fast AlgorithmsEffective Programming EnvironmentsPowerful Execution EnginesComputer Science and EngineeringA model is an interface separating high level properties from low level onesAn InterfaceApplicationsArchitecturesProvidesoperationsRequires implementationMODELComputer Science and EngineeringModels in this classShared Memory ModelDistributed Memory ModelComputer Science and EngineeringPRAM ModelSynchronized Read Compute Write CycleEREWERCWCREWCRCWComplexity: T(n), P(n), C(n)ControlPrivateMemoryP1PrivateMemoryP2PrivateMemoryPpGlobalMemoryComputer Science and EngineeringThe PRAM model and its variations (cont.)There are different modes for read and write operations in a PRAM.Exclusive read(ER)Exclusive write(EW)Concurrent read(CR)Concurrent write(CW)CommonArbitraryMinimumPriorityBased on the different modes described above, the PRAM can be further divided into the following four subclasses.EREW-PRAM modelCREW-PRAM modelERCW-PRAM modelCRCW-PRAM modelComputer Science and EngineeringAnalysis of AlgorithmsSequential AlgorithmsTime ComplexitySpace ComplexityAn algorithm whose time complexity is bounded by a polynomial is called a polynomial-time algorithm. An algorithm is considered to be efficient if it runs in polynomial time.Computer Science and EngineeringAnalysis of Sequential AlgorithmsNPPNP-completeNP-hardThe relationships among P, NP, NP-complete, NP-hardComputer Science and EngineeringAnalysis of parallel algorithmPerformance of a parallel algorithm is expressed in terms of how fast it is and how much resources it uses when it runs. Run time, which is defined as the time during the execution of the algorithmNumber of processors the algorithm uses to solve a problemThe cost of the parallel algorithm, which is the product of the run time and the number of processorsComputer Science and EngineeringAnalysis of parallel algorithmThe NC-class and P-completenessNPPNP-completeNCP-completeNP-hardThe relationships among P, NP, NP-complete, NP-hard, NC, and P-completeComputer Science and EngineeringSimulating multiple accesses on an EREW PRAMBroadcasting mechanism:P1 reads x and makes it known to P2.P1 and P2 make x known to P3 and P4, respectively, in parallel.P1, P2, P3 and P4 make x known to P5, P6, P7 and P8, respectively, in parallel.These eight processors will make x know to another eight processors, and so on.Computer Science and EngineeringSimulating multiple accesses on an EREW PRAM (cont.)Simulating Concurrent read on EREW PRAM with eight processors using Algorithm Broadcast_EREWxxx P1(a)xxxx P2(b)xxxxx P3(c)xxxxxxxxx P5(d)x P4x P6x P7x P8LLLLComputer Science and EngineeringParallel AlgorithmsConstructsProcessor PiForallWhere Do in ParallelOthersComputer Science and EngineeringSimulating multiple accesses on an EREW PRAM (cont.)Algorithm Broadcast_EREWProcessor P1y (in P1’s private memory)  xL[1]  yfor i=0 to log p-1 doforall Pj, where 2i +1 < j < 2i+1 do in parallely (in Pj’s private memory)  L[j-2i]L[j]  yendforendforComputer Science and EngineeringEnumeration SortGiven a list on n numbers a1, a2, …, anWe try to find the position of each element ai in the sorted list by computing the number of elements smaller than itIt ci elements are smaller than ai, then it is the (ci+1)th element in the sorted listIf 2 or more elements have the same value, the element with the largest index in the unsorted list will be considered the largest in the sorted list.Computer Science and EngineeringSort-CRCW AssumptionsTo sort n elements, we use n2 processors (n rows and n columns)Pi,j  processor in row i, column jConcurrent write  sum of all valuesA[1..n] array of elements in global memoryC[1..n]  array to store number of elements smaller than every element in AComputer Science and EngineeringSort-CRCWTwo steps1. Each row of processors i computes C[i], the number of elements smaller than A[i]. Each processor Pi,j compares A[i] and A[j], then updates C[i] appropriately2. The first row in each Pi,1 row places places A[i] in its proper position in the sorted list (C[i] + 1)Computer Science and EngineeringAlgorithm DetailsDetail of two step Algorithm/* step 1 */forall Pi,j, where 1 < i, j<n do in parallelif A[i] > A[j] or (A[i] = A[j] and i > j) thenC[i]  1elseC[i]  0endifendfor/* step 2 */forall Pi,l, where 1 < i<n do in parallelA[C[i] +1]  A [i] endforComputer Science and EngineeringComplexity Run time: T(n) = O(1)Number of processors: P(n) = n2Cost: c(n) = O(n2)Is it cost optimal?No! (sequential sort can be done in O(n log n)Computer Science and EngineeringExample: sort (9, 4, 6)P1,1P1,2P1,3649A =9 & 9 9 & 4 9 & 6P2,1P2,2P2,34 & 9 4 & 4 4 & 6P3,1P3,2P3,36 & 9 6 & 4 6 & 6102C =964A =Concurrent write SUMT(n) = O(1)P(n) = n2C(n) = T(n) * P(n) =


View Full Document

SMU CSE 8383 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?