The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Synthesis Lectures on Computer Architecture

Abstract
Keywords
Acknowledgments
Note to the Reader
Contents

Chapter 1 Introduction
  1.1 WAREHOUSE-SCALE COMPUTERS
  1.2 EMPHASIS ON COST EFFICIENCY
  1.3 NOT JUST A COLLECTION OF SERVERS
  1.4 ONE DATACENTER VS. SEVERAL DATACENTERS
  1.5 WHY WSCs MIGHT MATTER TO YOU
  1.6 ARCHITECTURAL OVERVIEW OF WSCs
    1.6.1 Storage
    1.6.2 Networking Fabric
    1.6.3 Storage Hierarchy
    1.6.4 Quantifying Latency, Bandwidth, and Capacity
    1.6.5 Power Usage
    1.6.6 Handling Failures

Chapter 2 Workloads and Software Infrastructure
  2.1 DATACENTER VS. DESKTOP
  2.2 PERFORMANCE AND AVAILABILITY TOOLBOX
  2.3 CLUSTER-LEVEL INFRASTRUCTURE SOFTWARE
    2.3.1 Resource Management
    2.3.2 Hardware Abstraction and Other Basic Services
    2.3.3 Deployment and Maintenance
    2.3.4 Programming Frameworks
  2.4 Application-Level Software
    2.4.1 Workload Examples
    2.4.2 Online: Web Search
    2.4.3 Offline: Scholar Article Similarity
  2.5 A MONITORING INFRASTRUCTURE
    2.5.1 Service-Level Dashboards
    2.5.2 Performance Debugging Tools
    2.5.3 Platform-Level Monitoring
  2.6 Buy vs. Build
  2.7 FURTHER READING

Chapter 3 Hardware Building Blocks
  3.1 COST-EFFICIENT HARDWARE
    3.1.1 How About Parallel Application Performance?
    3.1.2 How Low-End Can You Go?
    3.1.3 Balanced Designs

Chapter 4 Datacenter Basics
  4.1 DATACENTER TIER CLASSIFICATIONS
  4.2 DATACENTER POWER SYSTEMS
    4.2.1 UPS Systems
    4.2.2 Power Distribution Units
  4.3 DATACENTER COOLING SYSTEMS
    4.3.1 CRAC Units
    4.3.2 Free Cooling
    4.3.3 Air Flow Considerations
    4.3.4 In-Rack Cooling
    4.3.5 Container-Based Datacenters

Chapter 5 Energy and Power Efficiency
  5.1 DATACENTER ENERGY EFFICIENCY
    5.1.1 Sources of Efficiency Losses in Datacenters
    5.1.2 Improving the Energy Efficiency of Datacenters
  5.2 MEASURING THE EFFICIENCY OF COMPUTING
    5.2.1 Some Useful Benchmarks
    5.2.2 Load vs. Efficiency
  5.3 ENERGY-PROPORTIONAL COMPUTING
    5.3.1 Dynamic Power Range of Energy-Proportional Machines
    5.3.2 Causes of Poor Energy Proportionality
    5.3.3 How to Improve Energy Proportionality
  5.4 RELATIVE EFFECTIVENESS OF LOW-POWER MODES
  5.5 THE ROLE OF SOFTWARE IN ENERGY PROPORTIONALITY
  5.6 DATACENTER POWER PROVISIONING
    5.6.1 Deployment and Power Management Strategies
    5.6.2 Advantages of Oversubscribing Facility Power
  5.7 TRENDS IN SERVER ENERGY USAGE
  5.8 CONCLUSIONS
    5.8.1 Further Reading

Chapter 6 Modeling Costs
  6.1 CAPITAL COSTS
  6.2 OPERATIONAL COSTS
  6.3 CASE STUDIES
    6.3.1 Real-World Datacenter Costs
    6.3.2 Modeling a Partially Filled Datacenter

Chapter 7 Dealing with Failures and Repairs
  7.1 IMPLICATIONS OF SOFTWARE-BASED FAULT TOLERANCE
  7.2 CATEGORIZING FAULTS
    7.2.1 Fault Severity
    7.2.2 Causes of Service-Level Faults
  7.3 MACHINE-LEVEL FAILURES
    7.3.1 What Causes Machine Crashes?
      DRAM soft-errors
      Disk errors
    7.3.2 Predicting Faults
  7.4 REPAIRS
  7.5 TOLERATING FAULTS, NOT HIDING THEM

Chapter 8 Closing Remarks
  8.1 HARDWARE
  8.2 SOFTWARE
  8.3 ECONOMICS
  8.4 KEY CHALLENGES
    8.4.1 Rapidly Changing Workloads
    8.4.2 Building Balanced Systems from Imbalanced Components
    8.4.3 Curbing Energy Usage
    8.4.4 Amdahl’s Cruel Law
  8.5 CONCLUSIONS

References
Author Biographies

Editor
Mark D. Hill, University of Wisconsin, Madison

Synthesis Lectures on Computer Architecture publishes 50 to 150 page publications on topics pertaining to the science and art of designing, analyzing, selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Luiz André Barroso and Urs Hölzle, 2009

Computer Architecture Techniques for Power-Efficiency
Stefanos Kaxiras and Margaret Martonosi, 2008

Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency
Kunle Olukotun, Lance Hammond, James Laudon, 2007

Transactional Memory
James R. Larus, Ravi Rajwar, 2007

Quantum Computing for Computer Architects
Tzvetan S. Metodi, Frederic T. Chong, 2006

Copyright © 2009 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other) except for brief quotations in printed reviews, without the prior permission of the publisher.

Luiz André Barroso and Urs Hölzle, Google Inc.
www.morganclaypool.com

ISBN: 9781598295566 paperback
ISBN: 9781598295573 ebook
DOI: 10.2200/S00193ED1V01Y200905CAC006

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE
Lecture #6
Series Editor: Mark D. Hill, University of Wisconsin, Madison
Series ISSN
ISSN 1935-3235 print
ISSN 1935-3243 electronic

ABSTRACT
As computation continues to move into the cloud, the computing platform of interest no longer resembles a pizza box or a refrigerator, but a warehouse full of computers.
These new large datacenters are quite different from traditional hosting facilities of earlier times and cannot be viewed simply as a collection of co-located servers. Large portions of the hardware and software resources in these facilities must work in concert to efficiently deliver good levels of Internet service performance, something that can only be achieved by a holistic approach to their design and deployment. In other words, we must treat the datacenter itself as one massive warehouse-scale computer (WSC). We describe the architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base. We hope it will be useful to architects and programmers of today’s WSCs, as well as those of future many-core platforms which may one day implement the equivalent of today’s WSCs on a single board.

KEYWORDS
computer organization and design, Internet services, energy efficiency, fault-tolerant computing, cluster computing, data centers, distributed systems, cloud computing.

While we draw from our direct involvement in Google’s infrastructure design and operation over the past several years, most of what we have learned and now report here is the result of the hard