On the Energy (In)efficiency of Hadoop Clusters
Jacob Leverich, Christos Kozyrakis
Presented by: Rini Kaushik

Why is energy management important?
- US datacenter energy costs (EPA): $2 billion in 2003, $10 billion in 2011
- ~1.6% of all US energy consumed
- (Un)green: ~12M tons of CO2 annually*
- Servers worldwide in 2005: 27.3 million (InformationWeek)
* Jeff Chase et al., "Managing Energy and Server Resources in Hosting Centers"
* J. Jackson, "Energy needs in an internet economy: a closer look at the datacenters"

Where is the energy consumed?
- PUE (total facility power divided by IT equipment power) was around 3.0; by 2011 it had improved to about 1.4
(Courtesy: Luiz André Barroso, Urs Hölzle)

Towards energy efficiency
- Opportunity: the reality of CPU utilization is that servers spend most of their time at low utilization
- Ref: "The Case for Energy-Proportional Computing," Luiz André Barroso and Urs Hölzle

Power variation in a typical server
- [Figure: server power draw across utilization levels] (Courtesy: Luiz André Barroso, Urs Hölzle)

Power vs. efficiency
- Efficiency = utilization / power
- Because a server draws substantial power even when idle, efficiency falls off sharply at low utilization
- Ref: "The Case for Energy-Proportional Computing," Luiz André Barroso and Urs Hölzle

An easy energy-efficiency option: scale-down
- Match the number of active nodes to workload needs; turn off the remaining nodes to save power
- Multiple papers use this approach, e.g., "Managing Energy and Server Resources in Hosting Centers," SOSP 2001
- Easy when:
  - Only computation needs to be consolidated
  - Servers are stateless (i.e., they serve data that resides on a shared NAS or SAN)
  - The replication model is simple
  - Workloads can be migrated to fewer machines during periods of low activity
- Hard when:
  - Servers hold significant state
  - Data locality is important

Hadoop primer
- A distributed data processing framework
- The MapReduce programming model has emerged as a scalable way to perform data-intensive computations on commodity cluster computers
- Runs on commodity datacenters, with data stored in HDFS

Unique scale-down challenges of Hadoop clusters
- Computation and data are co-located on the servers
- Servers are stateful and rarely completely idle
- Design principles that work against scale-down:
  - Load balancing for better performance: even during low activity, low load on many servers is preferred to high load on a few
  - Data is striped across nodes for high aggregate I/O
  - Commodity server usage raises reliability and availability concerns, so N-way replication is the norm
- Result: it is hard to turn off servers

Scale-down opportunity: block replication invariants
- No two replicas of a block on the same node
- Replicas on at least two racks
- If an inactive node is turned off, its data is still available on a replica
- [Figure: blocks A-H replicated across nodes 1-9] (Courtesy: Leverich, HotPower'09)
- Naïve approach: only n-1 servers can be turned off, and at best only one rack; otherwise availability is affected

Raises questions
- Which node should be disabled? (a data availability consideration)
- How can a sleeping node be distinguished from a down node? (to prevent needless re-replication)

The covering subset invariant
- Invariant: every block must have at least one replica in the covering subset
- Any node outside the covering subset can then sleep without making a block unavailable (see the sketch below)
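To make the invariant concrete, here is a minimal Python sketch of the placement check it implies. This is not code from the paper; the topology, covering subset, and block placements below are hypothetical.

```python
# Sketch of the replication invariants plus the covering subset invariant.
# Node/rack names and the block placements are made up for illustration.

def rack_of(node, topology):
    """Return the rack a node belongs to."""
    return topology[node]

def check_placement(blocks, topology, covering_subset):
    """Check invariants for a block -> [replica nodes] map."""
    for block, replicas in blocks.items():
        # Invariant 1: no two replicas of a block on the same node.
        assert len(set(replicas)) == len(replicas), f"{block}: two replicas on one node"
        # Invariant 2: replicas span at least two racks.
        racks = {rack_of(n, topology) for n in replicas}
        assert len(racks) >= 2, f"{block}: replicas on only one rack"
        # Covering subset invariant: at least one replica in the covering subset,
        # so every node outside the subset can sleep without losing availability.
        assert any(n in covering_subset for n in replicas), f"{block}: not covered"

topology = {f"node{i}": f"rack{(i - 1) // 3}" for i in range(1, 10)}  # 9 nodes, 3 racks
covering = {"node1", "node4", "node7"}  # hypothetical covering subset, one node per rack
blocks = {
    "A": ["node1", "node2", "node5"],
    "B": ["node4", "node8", "node3"],
    "C": ["node7", "node2", "node6"],
}
check_placement(blocks, topology, covering)
print("Invariants hold; every node outside", sorted(covering), "may sleep.")
```

Under this invariant, every node outside the covering subset is eligible to sleep at once, rather than at most n-1 nodes as in the naïve scheme.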
Covering subset considerations
- Too large: less energy savings, and the rest of the system suffers bottlenecks; on the plus side, performance of the covering set is good
- Too small: limited storage capacity and a performance bottleneck; on the plus side, higher energy savings
- The paper assumes a covering subset of 10-30% of the cluster

Missing considerations and issues
- Assumes a system administrator will establish the covering subset, with no knowledge of workload patterns and no adaptability
- An ad hoc 10-30% allocation can have serious consequences on performance and is not cognizant of workload patterns
- The number of files is not accounted for

Changes to Hadoop
- ReplicationTargetChooser: place one replica on the local node, one replica in the covering subset, and one replica on a different rack
- No re-replication of the blocks on sleeping nodes
- Nodes are disabled and enabled manually

Evaluation
- Disable n nodes; compare Hadoop job energy and performance
- Individual runs of webdata_sort/webdata_scan from GridMix
- 30-minute job batches (with some idle time!)
- Cluster: 36 HP ProLiant DL140 G3 nodes, each with 2 quad-core Xeon 5335s, 32GB RAM, and a 500GB disk
- 9-node covering subset (1/4 of the cluster)
- Energy model: a validated estimate based on CPU utilization; a disabled node counts as 0 Watts; this makes it possible to evaluate hypothetical hardware

Results: performance
- It slows down (obviously) on this peak-performance benchmark
- Sort is worse off than Scan

Results: energy
- Less energy consumed for the same amount of work: 9% to 51% saved

Evaluation observations
- Interesting observation: power goes down as the number of sleeping nodes is increased; however, energy consumption may not
- Energy = power × time; cost = energy × cost per kWh
- More sleeping nodes: lower power, but also lower performance (longer runtime), as the sketch below illustrates
- Sort: 9% energy saving with a 71% performance impact; Scan: 51% energy saving
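To see why power can fall while energy does not, here is a minimal Python sketch of a CPU-utilization-based power model in the spirit of the paper's evaluation. The linear model and all wattages, utilizations, and runtimes below are hypothetical, not the paper's measurements.

```python
# Hypothetical linear power model: a disabled node draws 0 W, an active node
# draws idle power plus a utilization-proportional term. Numbers are made up.
IDLE_W, PEAK_W = 160.0, 250.0

def node_power(cpu_util, asleep=False):
    """Per-node power in Watts under the linear CPU-utilization model."""
    if asleep:
        return 0.0
    return IDLE_W + (PEAK_W - IDLE_W) * cpu_util

def job_energy_joules(active_nodes, cpu_util, runtime_s):
    """Cluster energy for one job: energy = power x time."""
    return active_nodes * node_power(cpu_util) * runtime_s

e_all = job_energy_joules(36, 0.6, 600)      # all 36 nodes on
e_scaled = job_energy_joules(27, 0.8, 850)   # 9 nodes asleep at 0 W; job runs longer
for label, e in [("36 nodes on", e_all), ("9 nodes asleep", e_scaled)]:
    kwh = e / 3.6e6
    print(f"{label}: {kwh:.2f} kWh, ${kwh * 0.10:.2f} at $0.10/kWh")
```

In this hypothetical run, cluster power drops (about 6.3 kW vs. 7.7 kW) but the longer runtime makes total energy higher, which is exactly the power-versus-energy distinction the observation draws.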
Discussion
- The experiments used a very small dataset
- The paper states there is no impact on data availability, which is incorrect; fault-injection experiments are needed
- It assumes a power model in which power depends only on CPU utilization; this may not be accurate, and I/O-bound benchmarks will have a different characteristic
- Replication is also meant for performance: it spreads out hot spots
- There is a tradeoff between availability, performance, and energy efficiency

Future work
- Impact of sleeping nodes on durability
- Revisiting the reliability-via-replication assumption; replication does have performance implications
- Dynamic scheduling that responds to changes in the utilization of the cluster: collaboration between Hadoop's job scheduler and a power controller (see the sketch below)
- Different workloads and their characteristics: some may value QoS and throughput more than energy savings
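As a thought experiment for the dynamic-scheduling direction, here is a minimal Python sketch of a power controller loop that wakes or sleeps non-covering nodes as utilization changes. Nothing like this exists in the paper (nodes are toggled manually there); the thresholds, node names, and utilization trace are all hypothetical.

```python
# Hypothetical power controller cooperating with the job scheduler:
# wake or sleep nodes outside the covering subset based on cluster utilization.
HIGH_UTIL, LOW_UTIL = 0.75, 0.30  # hysteresis band to avoid flapping

def control_step(util, awake, asleep):
    """Move at most one non-covering node between the awake and asleep pools."""
    if util > HIGH_UTIL and asleep:
        node = asleep.pop()
        awake.add(node)   # wake a node; the job scheduler may now place tasks on it
        print(f"util={util:.2f}: waking {node}")
    elif util < LOW_UTIL and awake:
        node = awake.pop()
        asleep.add(node)  # drain and sleep a node; its blocks remain covered
        print(f"util={util:.2f}: sleeping {node}")

awake = {"node2", "node3", "node5", "node6"}   # non-covering nodes currently on
asleep = {"node8", "node9"}                    # non-covering nodes currently off
for util in [0.82, 0.85, 0.40, 0.20, 0.15]:    # hypothetical utilization trace
    control_step(util, awake, asleep)
```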