HARVARD CS 263 - Extensible Cluster-Based Scalable Network Services

Extensible Cluster-Based Scalable Network Services

Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Paul Gauthier
University of California at Berkeley; Inktomi Corporation
{fox, gribble, yatin, brewer}@cs.berkeley.edu; {brewer, gauthier}@inktomi.com

We identify three fundamental requirements for scalable network services: incremental scalability and overflow growth provisioning, 24x7 availability through fault masking, and cost-effectiveness. We argue that clusters of commodity workstations interconnected by a high-speed SAN are exceptionally well-suited to meeting these challenges for Internet-server workloads, provided the software infrastructure for managing partial failures and administering a large cluster does not have to be reinvented for each new service. To this end, we propose a general, layered architecture for building cluster-based scalable network services that encapsulates the above requirements for reuse, and a service-programming model based on composable workers that perform transformation, aggregation, caching, and customization (TACC) of Internet content. For both performance and implementation simplicity, the architecture and TACC programming model exploit BASE, a weaker-than-ACID data semantics that results from trading consistency for availability and relying on soft state for robustness in failure management. Our architecture can be used as an “off the shelf” infrastructural platform for creating new network services, allowing authors to focus on the “content” of the service (by composing TACC building blocks) rather than its implementation. We discuss two real implementations of services based on this architecture: TranSend, a Web distillation proxy deployed to the UC Berkeley dialup population, and HotBot, the commercial implementation of the Inktomi search engine.
We present detailed measurements of TranSend’s performance based on substantial client traces, as well as anecdotal evidence from the TranSend and HotBot experience, to support the claims made for the architecture.

1 Introduction

“One of the overall design goals is to create a computing system which is capable of meeting almost all of the requirements of a large computer utility. Such systems must run continuously and reliably 7 days a week, 24 hours a day... and must be capable of meeting wide service demands.”

“Because the system must ultimately be comprehensive and able to adapt to unknown future requirements, its framework must be general, and capable of evolving over time.”

-- Corbató and Vyssotsky on Multics, 1965 [15]

Although it is normally viewed as an operating system, Multics (Multiplexed Information and Computing Service) was originally conceived as an infrastructural computing service, so it is not surprising that its goals as stated above are similar to our own. The primary obstacle to deploying Multics was the absence of the network infrastructure, which is now in place. Network applications have exploded in popularity in part because they are easier to manage and evolve than their desktop application counterparts: they eliminate the need for software distribution, and simplify customer service and bug tracking by avoiding the difficulty of dealing with multiple platforms and versions. Also, basic queueing theory shows that a large central (virtual) server is more efficient in both cost and utilization than a collection of smaller servers; desktop systems represent the degenerate case of one “server” per user.
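
The queueing claim can be made concrete with a small calculation. The text does not say which queueing model it has in mind, so the sketch below assumes the simplest one, M/M/1 (Poisson arrivals, exponentially distributed service times), and idealizes the central server as a single server k times faster than each small one. The function names and the rates are hypothetical, chosen only for illustration.

```python
def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time a request spends in an M/M/1 system: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def compare_pooling(total_arrival_rate: float, per_server_rate: float, k: int):
    """Return (separate, pooled) mean response times.

    separate: k independent servers, each receiving 1/k of the load.
    pooled:   one server k times as fast, receiving all of the load
              (an idealized stand-in for a large central server).
    """
    separate = mm1_response_time(total_arrival_rate / k, per_server_rate)
    pooled = mm1_response_time(total_arrival_rate, k * per_server_rate)
    return separate, pooled


# Assumed numbers: 80 requests/s in total, 10 servers of 10 requests/s each.
separate, pooled = compare_pooling(80.0, 10.0, 10)
print(separate, pooled)
```

Both configurations run at the same 80% utilization, yet under these assumptions the pooled server's mean response time is smaller by a factor of k, because 1/(kμ − λ) = (1/k) · 1/(μ − λ/k).
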
All of these are key parts of the argument for Network Computers [27].

However, network services remain difficult to deploy because of three fundamental challenges: scalability, availability and cost effectiveness.

• By scalability, we mean that when the offered load to the service increases, an incremental and linear increase in hardware can maintain the same per-user level of service.
• By availability, we mean that the service as a whole must be available 24x7, despite transient partial hardware or software failures.
• By cost effectiveness, we mean that the service must be economical to administer and expand, even though it potentially comprises many workstation nodes.

We observe that clusters of workstations have some fundamental properties that can be exploited to meet these requirements: using commodity PCs as the unit of scaling allows the service to ride the leading edge of the cost/performance curve, the inherent redundancy of clusters can be used to mask transient failures, and “embarrassingly parallel” network service workloads map well onto networks of workstations. However, developing cluster software and administering a running cluster remain complex. The primary contributions of this work are the design and analysis of an implemented layered framework for building network services that addresses this complexity. New services can use this framework as an off-the-shelf solution to scalability, availability, and several other problems, and focus instead on the content of the service being developed. The lower layer handles scalability, availability, load balancing, support for bursty offered load, and system monitoring and visualization, while the middle layer provides extensible support for caching, transformation among MIME types, aggregation of information from multiple sources, and personalization of the service for each of a large number of users (mass customization).
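
The middle-layer building blocks named above (caching, transformation, aggregation, customization) can be pictured as small composable workers. The sketch below is only an illustration under stated assumptions: the `Worker` signature, the helper names, and the `max_bytes` preference are hypothetical and are not the framework's actual API, which this excerpt does not show. The cache is treated as soft state, so it can be discarded at any time and rebuilt on demand.

```python
from typing import Callable, Dict, List

# Hypothetical worker signature: (content, user preferences) -> content.
Worker = Callable[[bytes, Dict[str, str]], bytes]


def compose(*workers: Worker) -> Worker:
    """Chain workers so that each one's output feeds the next."""
    def pipeline(content: bytes, prefs: Dict[str, str]) -> bytes:
        for worker in workers:
            content = worker(content, prefs)
        return content
    return pipeline


def collapse_whitespace(content: bytes, prefs: Dict[str, str]) -> bytes:
    """Transformation: squeeze every run of whitespace into one space."""
    return b" ".join(content.split())


def truncate(content: bytes, prefs: Dict[str, str]) -> bytes:
    """Customization: honor a per-user size limit (assumed preference key)."""
    return content[: int(prefs.get("max_bytes", "64"))]


def aggregate(sources: List[bytes]) -> bytes:
    """Aggregation: merge content fetched from several sources."""
    return b"\n".join(sources)


class SoftStateCache:
    """Caching: entries may be dropped at any time and are recomputed on a miss."""

    def __init__(self, worker: Worker) -> None:
        self._worker = worker
        self._entries: Dict[tuple, bytes] = {}
        self.misses = 0

    def __call__(self, content: bytes, prefs: Dict[str, str]) -> bytes:
        key = (content, tuple(sorted(prefs.items())))
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._worker(content, prefs)
        return self._entries[key]

    def drop(self) -> None:
        """Simulate losing the soft state, for example after a node restart."""
        self._entries.clear()


service = SoftStateCache(compose(collapse_whitespace, truncate))
page = aggregate([b"hello   world", b"from  two   sources"])
prefs = {"max_bytes": "16"}
first = service(page, prefs)
service.drop()                   # losing the cache costs time, not correctness
second = service(page, prefs)
```

Because the cached entries can always be regenerated from the original content, dropping them only causes an extra miss; the result returned to the user is unchanged.
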
The top layer allows composition of transformation and aggregation into a specific service, such as accelerated Web browsing or a search engine.

Pervasive throughout our design and implementation strategies is the observation that much of the data manipulated by a network service can tolerate semantics weaker than ACID [25]. We combine ideas from prior work on availability vs. consistency and the use of soft state for robust fault-tolerance to characterize the data semantics of many network services, which we refer to as BASE semantics (basically available, soft state, eventual consistency). In addition to demonstrating how BASE simplifies the implementation of our architecture, we present a programming model for service authoring that is a good fit for BASE semantics and that maps well onto our cluster-based service framework.

1.1 Validation: Two Real Services

Our framework reflects the implementation of two real network services in use today: TranSend, a scalable transformation and caching proxy for the 25,000 Berkeley dialup IP users (connecting through a bank of 600 modems), and the Inktomi search

