Cluster-Based Scalable Network Services

Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer
University of California at Berkeley
Paul Gauthier
Inktomi Corporation
{fox, gribble, yatin, brewer}@cs.berkeley.edu, gauthier@inktomi.com

We identify three fundamental requirements for scalable network services: incremental scalability and overflow growth provisioning, 24x7 availability through fault masking, and cost-effectiveness. We argue that clusters of commodity workstations interconnected by a high-speed SAN are exceptionally well suited to meeting these challenges for Internet server workloads, provided the software infrastructure for managing partial failures and administering a large cluster does not have to be reinvented for each new service. To this end, we propose a general, layered architecture for building cluster-based scalable network services that encapsulates the above requirements for reuse, and a service-programming model based on composable workers that perform transformation, aggregation, caching, and customization (TACC) of Internet content. For both performance and implementation simplicity, the architecture and TACC programming model exploit BASE, a weaker-than-ACID data semantics that results from trading consistency for availability and relying on soft state for robustness in failure management. Our architecture can be used as an "off the shelf" infrastructural platform for creating new network services, allowing authors to focus on the content of the service (by composing TACC building blocks) rather than its implementation. We discuss two real implementations of services based on this architecture: TranSend, a Web distillation proxy deployed to the UC Berkeley dialup IP population, and HotBot, the commercial implementation of the Inktomi search engine. We present detailed measurements of TranSend's performance based on substantial client traces, as well as anecdotal evidence from the TranSend and HotBot experience, to support the claims made for the architecture.

1 Introduction

"One of the overall design goals is to create a computing system which is capable of meeting almost all of the requirements of a large computer utility. Such systems must run continuously and reliably 7 days a week, 24 hours a day, and must be capable of meeting wide service demands. Because the system must ultimately be comprehensive and able to adapt to unknown future requirements, its framework must be general, and capable of evolving over time."
Corbató and Vyssotsky on Multics, 1965 [17]

Although it is normally viewed as an operating system, Multics (Multiplexed Information and Computer Service) was originally conceived as an infrastructural computing service, so it is not surprising that its goals as stated above are similar to our own. The primary obstacle to deploying Multics was the absence of the network infrastructure, which is now in place.

Network applications have exploded in popularity in part because they are easier to manage and evolve than their desktop application counterparts: they eliminate the need for software distribution, and simplify customer service and bug tracking by avoiding the difficulty of dealing with multiple platforms and versions. Also, basic queueing theory shows that a large central (virtual) server is more efficient in both cost and utilization than a collection of smaller servers; standalone desktop systems represent the degenerate case of one "server" per user. All of these support the argument for Network Computers [28].

However, network services remain difficult to deploy because of three fundamental challenges: scalability, availability, and cost-effectiveness.

By scalability, we mean that when the load offered to the service increases, an incremental and linear increase in hardware can maintain the same per-user level of service.

By availability, we mean that the service as a whole must be available 24x7, despite transient partial hardware or software failures.

By cost-effectiveness, we mean that the service must be economical to administer and expand, even though it potentially comprises many workstation nodes.

We observe that clusters of workstations have some fundamental properties that can be exploited to meet these requirements: using commodity PCs as the unit of scaling allows the service to ride the leading edge of the cost/performance curve, the inherent redundancy of clusters can be used to mask transient failures, and "embarrassingly parallel" network service workloads map well onto networks of workstations. However, developing cluster software and administering a running cluster remain complex. The primary contributions of this work are the design, analysis, and implementation of a layered framework for building network services that addresses this complexity. New services can use this framework as an off-the-shelf solution to scalability, availability, and several other problems, and focus instead on the content of the service being developed. The lower layer handles scalability, availability, load balancing, support for bursty offered load, and system monitoring and visualization, while the middle layer provides extensible support for caching, transformation among MIME types, aggregation of information from multiple sources, and personalization of the service for each of a large number of users (mass customization). The top layer allows composition of transformation and aggregation into a specific service, such as accelerated Web browsing or a search engine.

Pervasive throughout our design and implementation strategies is the observation that much of the data manipulated by a network service can tolerate semantics weaker than ACID [26]. We combine ideas from prior work on tradeoffs between availability and consistency, and the use of soft state for robust fault tolerance, to characterize the data semantics of many network services, which we refer to as BASE semantics (basically available, soft state, eventual consistency). In addition to demonstrating how BASE simplifies the implementation of our architecture, we present a programming model for service authoring that is a good fit for BASE semantics and that maps well onto our cluster-based service framework.

1.1 Validation: Two Real Services

Permission to make digital/hard copy of part or all this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to