Tashi: Location-aware Cluster Management

Michael A. Kozuch, Michael P. Ryan, Richard Gass, Steven W. Schlosser, David O'Hallaron
{michael.a.kozuch, michael.p.ryan, richard.gass, steven.w.schlosser, david.ohallaron}@intel.com
Intel Research Pittsburgh, 4720 Forbes Avenue, Pittsburgh, PA 15213

James Cipar, Elie Krevat, Julio López, Michael Stroucken, Gregory R. Ganger
{jcipar, ekrevat}@cs.cmu.edu, {jclopez, stroucki, ganger}@ece.cmu.edu
Parallel Data Laboratory, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213

ABSTRACT

Big Data applications, those that require large data corpora either for correctness or for fidelity, are becoming increasingly prevalent. Tashi is a cluster management system designed particularly for enabling cloud computing applications to operate on repositories of Big Data. These applications are extremely scalable but also have very high resource demands. A key technique for making such applications perform well is Location Awareness. This paper demonstrates that location-aware applications can outperform those that are not location-aware by factors of 3-11, and describes two general services developed for Tashi to provide location awareness independently of the storage system.

Categories and Subject Descriptors
D.4.7 [Operating Systems]: Organization and Design - Distributed Systems

General Terms
Design

Keywords
cluster management, cloud computing, virtualization

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ACDC '09, June 19, 2009, Barcelona, Spain.
Copyright 2009 ACM 978-1-60558-585-7/09/06 ...$5.00.

1. INTRODUCTION

Big Data computing is perhaps the biggest innovation in computing in the last decade [2]. Increasingly, the most interesting computer applications are those that compute on large data corpora. These Big Data applications draw from information sources such as web crawls, digital media collections, virtual worlds, simulation traces, and data obtained from scientific or medical instruments. Historically of interest only to a narrow segment of the computing community, Big Data applications now play a significant role in all aspects of society, from scientific study to enterprise data mining to consumer web applications.

These applications, beyond simply operating on data that is big, are also typically data-hungry, in that the quality of their results improves with the quantity of data available. Consequently, a strong need exists for computing technologies that can scale to accommodate the largest datasets possible. Fortunately, these applications are typically disk-bandwidth-limited rather than seek-limited, and they exhibit extremely good parallelism. Therefore, commodity cluster hardware, when employed at scale, may be harnessed to support such large-dataset applications. For example, the cluster at Intel Research Pittsburgh that is part of the Open Cirrus consortium (http://opencirrus.org) consists of a modest 150 server nodes, yet it provides more than 1000 computing cores and over 400 TB of disk storage, enough to accommodate many current Big Data problems.

One unfortunate ramification of data set size is potentially inescapable, namely that Big Data sets are relatively immobile. With a 1 Gbps connection to the Internet, moving a 100 TB data set into or out of a cluster like the one mentioned above would require approximately 10 days. The actual state of affairs is worse: access to the Pittsburgh cluster is through a T3 (45 Mbps) connection. Consequently, unless the ratio of transfer bandwidth to data set size increases dramatically, computation on Big Data sets will be in situ. In other words, Big Data facilities will host not only data sets but also all the computation that operates on those data sets.

A site will enable such computation through one or both of two models. In the first model, sites provide a traditional query service through a narrowly defined interface, e.g., contemporary image search. In the second model, which combines Cloud Computing with Big Data, sites provide a computation hosting framework, such as a virtual machine hosting service, where users bring their own custom applications to the facility to operate on the data. Because of the flexibility provided by the second approach, we believe that in the future hosted computation will play an increasingly significant role in the consumption of Big Data.

[Figure 1: Example cluster organization with minimal networking. The uplinks from the Top-of-Rack (TOR) switches to the Cluster Switch often introduce communication bottlenecks. Diagram labels: R racks of N server nodes each, p cores and d disks per node, BWdisk, BWnode, BWswitch, connection to an external network.]

[Figure (chart): Data throughput (Gb/s) under Random Placement vs. Location-Aware Placement, with speedups of 3.5X, 11X, 3.6X, and 9.2X.]

The authors are currently developing an open-source, virtual-machine-based cluster management software package called Tashi that is designed to support Cloud Computing applications that operate on Big Data. Currently, Tashi is in production use at the Open Cirrus site mentioned above, and the project is hosted by the Apache Software Foundation incubator. A key feature of Tashi is its support for location-aware computing, which can impact performance by a factor of 3-11 even in modestly sized clusters, as shown in Section 2. This paper presents two basic services, described in Section 3, that support location-aware computing. These services, a Data Location Service and a Resource Telemetry Service, provide a standard interface between application runtimes and storage systems and are essential to optimizing the performance of Big Data applications.

2. SYSTEM CONSIDERATIONS

Figure 1 depicts the hardware organization we assume for modest-sized Big Data clusters. The server nodes are organized into R racks; each rack contains N nodes. Each node contains p processors and d disk units, which are either traditional magnetic disks or solid-state drives (SSDs), and all nodes in a rack are connected to a commodity switch, the Top-of-Rack (TOR) switch. The uplinks of the R TOR switches are connected to a single cluster switch or router. The Big Data repository is assumed to be distributed across the disk devices in the server nodes, as opposed to being maintained in dedicated storage. This simple ...
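The transfer times quoted in Section 1 follow directly from link bandwidth. A minimal sketch of the arithmetic, assuming decimal units (1 TB = 10^12 bytes) and ignoring protocol overhead:

```python
# Data-set transfer time from link bandwidth (decimal units: 1 TB = 10^12 bytes).
# Ignores protocol overhead and congestion, so real transfers would be slower.

def transfer_days(data_bytes: float, link_bps: float) -> float:
    """Days needed to move data_bytes over a link of link_bps bits per second."""
    return data_bytes * 8 / link_bps / 86_400  # 86,400 seconds per day

DATASET = 100e12  # the 100 TB data set from Section 1

print(round(transfer_days(DATASET, 1e9), 1))   # 1 Gbps Internet link -> 9.3 days
print(round(transfer_days(DATASET, 45e6)))     # T3 (45 Mbps) link    -> 206 days
```

The 9.3-day figure matches the paper's "approximately 10 days"; over the actual T3 link the same transfer would take roughly seven months, which is why computation must move to the data rather than the reverse.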
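The TOR-uplink bottleneck described in Section 2 can be made concrete with a simple capacity model. In the sketch below, every parameter value is an illustrative assumption rather than a measurement from the paper: each node's read rate is capped by its disks and its NIC, and under random placement roughly (R-1)/R of all reads must cross a TOR uplink.

```python
# Hedged sketch: why random data placement bottlenecks on TOR uplinks.
# All parameter values are illustrative assumptions, not figures from the paper.

def local_throughput_gbps(R, N, d, bw_disk, bw_node):
    """Aggregate read bandwidth when every node reads rack-local data."""
    per_node = min(d * bw_disk, bw_node)  # disks or NIC, whichever saturates first
    return R * N * per_node

def random_throughput_gbps(R, N, d, bw_disk, bw_node, bw_uplink):
    """With random placement, about (R-1)/R of reads cross a TOR uplink,
    so the R uplinks cap the achievable aggregate rate."""
    local = local_throughput_gbps(R, N, d, bw_disk, bw_node)
    remote_fraction = (R - 1) / R
    uplink_cap = R * bw_uplink / remote_fraction  # total rate the uplinks sustain
    return min(local, uplink_cap)

# Assumed numbers: 10 racks of 15 nodes, 2 disks/node at 0.6 Gb/s each,
# 1 Gb/s NICs, and minimal networking with 1 Gb/s TOR uplinks.
print(local_throughput_gbps(10, 15, 2, 0.6, 1.0))                    # 150.0 Gb/s
print(round(random_throughput_gbps(10, 15, 2, 0.6, 1.0, 1.0), 1))    # 11.1 Gb/s
```

With these assumed numbers, location-aware placement wins by an order of magnitude, consistent in spirit with the 3-11X factors the paper reports; richer uplinks shrink the gap, which is why the effect is most pronounced in clusters with minimal networking.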
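The preview does not show the actual interfaces of the Data Location Service or Resource Telemetry Service mentioned in Section 1, but their stated role, a standard interface between application runtimes and storage systems, can be illustrated with a hypothetical sketch. All names and structures below are assumptions, not Tashi's real API.

```python
# Hypothetical sketch of the role a Data Location Service (DLS) could play:
# mapping data objects to the nodes that store them, so a scheduler can
# co-locate computation with data. Names here are illustrative assumptions;
# this is not the actual Tashi interface.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class BlockLocation:
    object_id: str
    block_index: int
    hosts: List[str]  # nodes holding a replica of this block

@dataclass
class DataLocationService:
    _index: Dict[str, List[BlockLocation]] = field(default_factory=dict)

    def register(self, loc: BlockLocation) -> None:
        """Record where a block of an object is stored."""
        self._index.setdefault(loc.object_id, []).append(loc)

    def locate(self, object_id: str) -> List[BlockLocation]:
        """Return the storage locations of every known block of an object."""
        return self._index.get(object_id, [])

# A location-aware scheduler would prefer placing a VM on one of the hosts
# returned by locate(), avoiding transfers across the TOR uplinks.
dls = DataLocationService()
dls.register(BlockLocation("crawl-2009", 0, ["rack1-node07", "rack3-node02"]))
print([h for loc in dls.locate("crawl-2009") for h in loc.hosts])
```

The key design point such a service captures is independence from any particular storage system: any backend that can enumerate replica locations can feed the same interface.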