CS 61C Great Ideas in Computer Architecture Machine Structures Instructors Randy H Katz David A Patterson http inst eecs Berkeley edu cs61c sp11 01 14 2019 Apeinf 2011 Lecture 2 1 Review CS61c Learn 5 great ideas in computer architecture to enable high performance programming via parallelism not just learn C 1 2 3 4 5 6 Layers of Representation Interpretation Moore s Law Principle of Locality Memory Hierarchy Parallelism Performance Measurement and Improvement Dependability via Redundancy Post PC Era Parallel processing smart phones to WSC WSC SW must cope with failures varying load varying HW latency bandwidth WSC HW sensitive to cost energy efficiency 01 14 2019 Spring 2011 Lecture 1 2 New School Machine Structures It s a bit more complicated Today s Lecture Software Parallel Requests Assigned to computer e g Search Katz Parallel Threads Assigned to core e g Lookup Ads Hardware Harness Parallelism Achieve High Performance Warehouse Scale Computer Smart Phone Computer Parallel Instructions 1 instruction one time e g 5 pipelined instructions Parallel Data 1 data item one time e g Add of 4 pairs of words Hardware descriptions Memory Core Cache Input Output Instruction Unit s Core Functional Unit s A0 B0 A1 B1 A2 B2 A3 B3 Main Memory All gates one time 01 14 2019 Core Logic Gates Spring 2011 Lecture 1 3 Agenda Request and Data Level Parallelism Administrivia 61C in the News Internship Workshop The secret to getting good grades at Berkeley MapReduce MapReduce Examples Technology Break Costs in Warehouse Scale Computer if time permits 01 14 2019 Apeinf 2011 Lecture 2 4 Request Level Parallelism RLP Hundreds or thousands of requests per second Not your laptop or cell phone but popular Internet services like Google search Such requests are largely independent Mostly involve read only databases Little read write aka producer consumer sharing Rarely involve read write data sharing or synchronization across requests Computation easily partitioned within a request and across different requests 01 14 2019 Apeinf 2011 Lecture 2 5 Google Query Serving Architecture 01 14 2019 Apeinf 2011 Lecture 2 6 Anatomy of a Web Search Google Randy H Katz Direct request to closest Google Warehouse Scale Computer Front end load balancer directs request to one of many arrays cluster of servers within WSC Within array select one of many Google Web Servers GWS to handle the request and compose the response pages GWS communicates with Index Servers to find documents that contain the search words Randy Katz uses location of search as well Return document list with associated relevance score 01 14 2019 Apeinf 2011 Lecture 2 7 Anatomy of a Web Search In parallel Ad system books by Katz at Amazon com Images of Randy Katz Use docids document IDs to access indexed documents Compose the page Result document extracts with keyword in context ordered by relevance score Sponsored links along the top and advertisements along the sides 01 14 2019 Apeinf 2011 Lecture 2 8 01 14 2019 Apeinf 2011 Lecture 2 9 Anatomy of a Web Search Implementation strategy Randomly distribute the entries Make many copies of data aka replicas Load balance requests across replicas Redundant copies of indices and documents Breaks up hot spots e g Justin Bieber Increases opportunities for request level parallelism Makes the system more tolerant of failures 01 14 2019 Apeinf 2011 Lecture 2 10 Data Level Parallelism DLP 2 kinds Lots of data in memory that can be operated on in parallel e g adding together 2 arrays Lots of data on many disks that can be operated on in parallel e g searching for documents March 1 lecture and 3rd project does Data Level Parallelism DLP in memory Today s lecture and 1st project does DLP across 1000s of servers and disks using MapReduce 01 14 2019 Apeinf 2011 Lecture 2 11 Problem Trying To Solve How process large amounts of raw data crawled documents request logs every day to compute derived data inverted indicies page popularity when computation conceptually simple but input data large and distributed across 100s to 1000s of servers so that finish in reasonable time Challenge Parallelize computation distribute data tolerate faults without obscuring simple computation with complex code to deal with issues Jeffrey Dean and Sanjay Ghemawat MapReduce Simplified Data Processing on Large Clusters Communications of the ACM Jan 2008 01 14 2019 Apeinf 2011 Lecture 2 12 MapReduce Solution Apply Map function to user supplier record of key value pairs Compute set of intermediate key value pairs Apply Reduce operation to all values that share same key in order to combine derived data properly User supplies Map and Reduce operations in functional model so can parallelize using reexecution for fault tolerance 01 14 2019 Apeinf 2011 Lecture 2 13 Data Parallel Divide and Conquer MapReduce Processing Map Slice data into shards or splits distribute these to workers compute sub problem solutions map in key in value list out key intermediate value Processes input key value pair Produces set of intermediate pairs Reduce Collect and combine sub problem solutions reduce out key list intermediate value list out value Combines all intermediate values for a particular key Produces a set of merged output values usually just one Fun to use focus on problem let MapReduce library deal with messy details 01 14 2019 Apeinf 2011 Lecture 2 14 MapReduce Execution Fine granularity tasks many more map tasks than machines 2000 servers 200 000 Map Tasks 5 000 Reduce tasks 01 14 2019 Apeinf 2011 Lecture 2 15 MapReduce Popularity at Google Number of MapReduce jobs Average completion time secs Server years used Input data read TB Intermediate data TB Output data written TB Average number servers job 01 14 2019 Aug 04 29 000 Mar 06 Sep 07 Sep 09 171 000 2 217 000 3 467 000 634 217 3 288 758 193 874 2 002 52 254 6 743 2 970 395 11 081 403 152 34 774 14 018 475 25 562 544 130 90 120 57 520 157 268 394 488 Apeinf 2011 Lecture 2 16 Google Uses MapReduce For Extracting the set of outgoing links from a collection of HTML documents and aggregating by target document Stitching together overlapping satellite images to remove seams and to select high quality imagery for Google Earth Generating a collection of inverted index files using a compression scheme tuned for efficient support of Google search queries Processing all road segments in the world and rendering map tile images that display these segments for Google Maps Fault tolerant parallel execution of programs
View Full Document
Unlocking...