MapReduce Dean and Ghemawat MapReduce Simplified Data Processing on Large Clusters Communications of the ACM Vol 51 No 1 January 2008 Shahram Ghandeharizadeh Computer Science Department University of Southern California A Shared Nothing Framework Shared nothing architecture consisting of thousands of nodes A node is an off the shelf commodity PC Yahoo s Pig Latin Google s Map Reduce Framework Google s Bigtable Data Model Google File System Overview Map Reduce Hadoop A programming model to make parallelism transparent to a programmer Programmer specifies a map function that processes a key value pair to generate a set of intermediate key value pairs a reduce function to merge all intermediate values associated with the same intermediate key Divides the problem into smaller intermediate key value sub problems Solve each sub problem Final results might be stored across R files Run time system takes care of Partitioning the input data across nodes Scheduling the program s execution Node failures Coordination among multiple nodes Example Counting word occurrences Input document is NameList and its content is Jim Shahram Betty Jim Shahram Jim Shahram Desired output Jim 3 Shahram 3 Betty 1 How Map String doc name String doc content doc name is document name NameList doc content is document content Jim Shahram For each word w in value EmitIntermediate w 1 Map NameList Jim Shahram Betty emits Jim 1 Shahram 1 Betty 1 A hash function may split different tokens across M different Worker processes Reduce String key Iterator values key is a word values is a list of counts Int result 0 For each v in values result ParseInt v Emit AsString result Other Examples Distributed Grep Count of URL accesses Map function processes logs of web page requests and outputs URL 1 Reduce function adds together all values for the same URL emitting URL total count pairs Reverse Web Link graph e g all URLs with reference to http dblab usc edu Map function emits a line if it matches a supplied pattern Reduce function is an identity function that copies the supplied intermediate data to the output Map function outputs tgt src for each link to a tgt in a page named src Reduce concatenates the list of all src URLS associated with a given tgt URL and emits the pair tgt list src Inverted Index e g all URLs with 585 as a word Map function parses each document emitting a sequence of word doc ID Reduce accepts all pairs for a given word sorts the corresponding doc IDs and emits a word list doc ID pair Set of all output pairs forms a simple inverted index MapReduce Input R r1 r2 rn user provided functions M and R M ri K1 V1 K2 V2 Jim 1 Shahram 1 Betty 1 Jim 1 1 1 Shahram 1 1 1 Betty 1 R Ki ValueSet Ki R ValueSet Jim 3 Shahram 3 Betty 1 Implementation Target environment Commodity PCs connected using a switched Ethernet GFS manages data stored across PCs A scheduling system accepts jobs submitted by users each job consists of a set of tasks and the scheduler maps tasks to a set of available machines within a cluster Execution Map invocations are distributed across multiple machines by automatically partitioning the input data into a set of M splits Reduce invocations are distributed by paritioning the intermediate key space into R pieces using a hash function hash key mod R R and the partitioning function are specified by the programmer Output of Execution R output files one per reduce task with file name specified by the programmer Typically programmers do not combine R output files into one file they pass these as input to another MapReduce call or use them with another distributed application that is able to deal with input that is partitioned into multiple files Execution Important details Output of Map task is stored on the local disk of the machine the task is executing on A Map task produces R such files on its local disk similar to your Homework 2 where R 101 Output of Reduce task is stored in the GFS High availability via replication The filename of the output produced by a reduce task is deterministic When a reduce task completes the reduce worker atomically renames its temporary output file to the final output file If the same reduce task executes on multiple machines multiple renames calls will be executed for the same output file Master Propagates location of intermediate file regions from map tasks to reduce tasks For each completed map task master stores the location and sizes of the R produced intermediate files Pushes location of the R produced intermediate files to the workers with in progress reduce tasks For each map task and reduce task master stores the possible states idle in progress or completed Master takes the location of input files GFS and their replicas into account It strives to schedule a map task on a machine that contains a replica of the corresponding input file or near it Minimize contention for the network bandwidth Termination condition All map and reduce tasks are in the completed state Worker Failures Failure detection mechanism Master pings workers periodically An in progress Map or Reduce task on a failed worked is reset to idle and eligible for rescheduling Completed Map task on a failed worker must also be re executed because its output are stored on the local disk Master Failure Abort the MapReduce computation Client may check for this condition and retry the MapReduce operation Alternative lets the master checkpoint its data structures enabling a new instance to resume from the last checkpoint state Execution Important details Output of Map task is stored on the local disk of the machine the task is executing on A Map task produces R such files on its local disk similar to your Homework 2 where R 101 Output of Reduce task is stored in the GFS High availability via replication The filename of the output produced by a reduce task is deterministic When a reduce task completes the reduce worker atomically renames its temporary output file to the final output file If the same reduce task executes on multiple machines multiple renames calls will be executed for the same output file Sequential versus Parallel Execution Is the results of a sequential execution the same as the parallel execution with failures Sequential versus Parallel Execution Is the results of a sequential execution the same as the parallel execution with failures Depends on the application If Map and Reduce operators are deterministic functions of their input values When a map task completes the worker sends a message
View Full Document
Unlocking...