Concurrent Algorithms

Summing the elements of an array

A reduction tree sums the eight leaf values in pairs, level by level:

        7    3   15   10   13   18    6    4
          10       25       31       10
               35                41
                        76

Parallel sum and parallel prefix sum

- It's relatively easy to see how to sum the elements of an array in a parallel fashion.
- This is a special case of a reduce operation: combining a number of values into a single value.
- It's harder to see how to do a prefix (cumulative) sum, for example turning the list [3, 1, 4, 1, 6] into [3, 4, 8, 9, 15].
- This is a special case of what is sometimes called a scan operation.
- An example is shown on the next slide. The algorithm makes two passes: the first pass goes "up" the tree, retaining the summands; the second pass goes "down" the tree.
- Note: these two examples are from Principles of Parallel Programming by Calvin Lin and Lawrence Snyder.

Summing the elements of an array (prefix sum)

Up pass (pairwise sums, retaining the summands):

    10 = 7 + 3     25 = 15 + 10    31 = 13 + 18    10 = 6 + 4
    35 = 10 + 25   41 = 31 + 10
    76 = 35 + 41

Down pass (the root receives 0; each node passes its incoming value to its left child, and its incoming value plus its left child's summand to its right child):

    0
    0              35 (0 + 35)
    0   10 (0+10)  35   66 (35 + 31)
    0   7   10  25  35  48  66  72

Adding each element's own value to its incoming value gives the prefix sums:

    7   10  25  35  48  66  72  76

Batcher's Bitonic sort

- Batcher's bitonic sort is a sorting algorithm with the following characteristics:
  - It's a variation of mergesort.
  - It's designed for 2^n processors.
  - It fully occupies all 2^n processors, unlike array sum, which uses fewer processors on each pass.
- I'm not going to go through this algorithm; I just want you to be able to say you've heard of it.

MapReduce

- MapReduce is a technique, patented by Google, for dealing with huge data sets on clusters of computers.
- From Wikipedia:
  "Map" step: The master node takes the input, chops it up into smaller sub-problems, and distributes those to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure.
The worker node processes that smaller problem, and passes the answer back to its master node.
  "Reduce" step: The master node then takes the answers to all the sub-problems and combines them in a way to get the output, the answer to the problem it was originally trying to solve.
- Hadoop is a free Apache implementation of MapReduce.

Basic idea of MapReduce

- In MapReduce, the programmer has to write only two functions; the framework takes care of everything else.
- The Map function is applied (in parallel) to each item of data, producing a list of key-value pairs.
- The framework collects all the lists and groups the key-value pairs by key.
- The Reduce function is applied (in parallel) to each group, returning either a single value or nothing.
- The framework collects all the returned values.

Example: Counting words (Python)

The following Python program counts how many times each word occurs in a set of data, and returns the list of words and their counts.

    def mapper(key, value):
        words = key.split()
        for word in words:
            Wmr.emit(word, '1')

    def reducer(key, iter):
        sum = 0
        for s in iter:
            sum = sum + int(s)
        Wmr.emit(key, str(sum))

Example: Counting words (Java)

    /* Mapper for word count */
    class Mapper {
        public void mapper(String key, String value) {
            String[] words = key.split(" ");
            for (int i = 0; i < words.length; i++)
                Wmr.emit(words[i], "1");
        }
    }

    /* Reducer for word count */
    class Reducer {
        public void reducer(String key, WmrIterator iter) {
            int sum = 0;
            while (iter.hasNext()) {
                sum += Integer.parseInt(iter.next());
            }
            Wmr.emit(key, Integer.toString(sum));
        }
    }

Example: Average movie ratings

    #!/usr/bin/env python
    def mapper(key, value):
        avgRating = float(value)
        binRating = 0.0
        if 0 < avgRating < 1.25:
            binRating = 1.0
        elif 1.25 <= avgRating < 1.75:
            binRating = 1.5
        elif 1.75 <= avgRating < 2.25:
            binRating = 2.0
        elif 2.25 <= avgRating < 2.75:
            binRating = 2.5
        elif 2.75 <= avgRating < 3.25:
            binRating = 3.0
        elif 3.25 <= avgRating < 3.75:
            binRating = 3.5
        elif 3.75 <= avgRating < 4.25:
            binRating = 4.0
        elif 4.25 <= avgRating < 4.75:
            binRating = 4.5
        elif 4.75 <= avgRating <= 5.0:  # <= 5.0, so a perfect rating is not binned as 99.0
            binRating = 5.0
        else:
            binRating = 99.0
        Wmr.emit(str(binRating), key)

    #!/usr/bin/env python
    def reducer(key, iter):
        count = 0
        for s in iter:
            count = count + 1
        Wmr.emit(key, str(count))

The End
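The two-pass prefix-sum algorithm from the earlier slides can be written out in Python. This is a minimal sequential simulation of the parallel algorithm (the function names are mine, not from the lecture); in a real parallel setting, all the pair operations at one tree level would run concurrently.

```python
def exclusive_scan(a):
    """Two-pass (up/down the tree) prefix sum; len(a) must be a power of 2.

    Returns the exclusive prefix sums: result[i] = a[0] + ... + a[i-1].
    """
    n = len(a)
    t = list(a)  # working array; internal tree nodes live at pair boundaries

    # Up pass: pairwise sums, retaining the left summands in place.
    d = 1
    while d < n:
        for i in range(0, n, 2 * d):          # these pairs sum in parallel
            t[i + 2 * d - 1] += t[i + d - 1]
        d *= 2

    # Down pass: the root receives 0; each node passes its incoming value
    # left, and incoming value + left summand right.
    t[n - 1] = 0
    d = n // 2
    while d >= 1:
        for i in range(0, n, 2 * d):          # these swaps also parallelize
            left = t[i + d - 1]
            t[i + d - 1] = t[i + 2 * d - 1]
            t[i + 2 * d - 1] += left
        d //= 2
    return t

def inclusive_scan(a):
    """Inclusive prefix sums, as shown on the slide (7, 10, 25, ...)."""
    return [e + x for e, x in zip(exclusive_scan(a), a)]
```

Running `inclusive_scan([7, 3, 15, 10, 13, 18, 6, 4])` reproduces the slide's final row, `[7, 10, 25, 35, 48, 66, 72, 76]`.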
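The word-count mapper and reducer rely on the framework to collect the emitted pairs and group them by key before the reducers run. A single-process sketch of that plumbing (the `run_mapreduce` driver and plain-function signatures are illustrative, not the WMR API used in the slides):

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    """Toy MapReduce driver: map each (key, value) record, group the
    emitted pairs by key, then reduce each group. A real framework would
    run the map and reduce calls in parallel across worker nodes."""
    # Map phase: each mapper call returns a list of (key, value) pairs.
    emitted = []
    for key, value in records:
        emitted.extend(mapper(key, value))

    # Shuffle phase: the framework groups all emitted values by key.
    groups = defaultdict(list)
    for k, v in emitted:
        groups[k].append(v)

    # Reduce phase: one reducer call per key group.
    return {k: reducer(k, vs) for k, vs in groups.items()}

# Word count, restated with functions that return pairs instead of
# calling Wmr.emit:
def wc_mapper(key, value):
    return [(word, 1) for word in key.split()]

def wc_reducer(key, values):
    return sum(values)
```

For example, `run_mapreduce([("the cat sat on the mat", None)], wc_mapper, wc_reducer)` yields `{"the": 2, "cat": 1, "sat": 1, "on": 1, "mat": 1}`.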