Selective Early

Home> Academic Documents> Selective Early

DOC PREVIEW

Selective Early

This preview shows page 1-2-3 out of 10 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 10 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 10 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 10 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 10 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Selective Early Request Terminationfor Busy Internet ServicesJingyu ZhouAsk.com and [email protected] YangAsk.com and [email protected] traffic is bursty and network servers are often overloadedwith surprising events or abnormal client request patterns. Thispaper studies a load shedding mechanism called selective early re-quest termination (SERT) for network services that use threads tohandle multiple incoming requests continuously and concurrently.Our investigation with applications from Ask.com shows that dur-ing overloaded situations, a relatively small percentage of long re-quests that require excessive computing resource can dramaticallyaffect other short requests and reduce the overall system through-put. By actively detecting and aborting overdue long requests, ser-vices can perform significantly better to achieve QoS objectivescompared to a purely admission based approach. We have pro-posed a termination scheme that monitors running time of requests,accounts for their resource usage, adaptively adjusts the selectionthreshold, and performs a safe termination for a class of requests.This paper presents the design and implementation of this schemeand describes experimental results to validate the proposed approach.Categories and Subject DescriptorsH.3.5 [Information Systems]: Information Storage and Retrival—Online Information ServicesGeneral TermsPerformance, Design, ExperimentationKeywordsInternet services, Load Shedding, Request Termination1. INTRODUCTIONBusy Internet service providers often use a service-level agree-ment in terms of response time and throughput to guide perfor-mance optimization and guarantees. It is challenging to satisfyperformance requirements of requests at all times because Inter-net traffic is bursty, and resource requirement for dynamic contentis often unknown in advance. Even with over-provisioning of sys-tem resources, a web site can still be overloaded in a short period oftime due to flash crowds under an unexpected high request rate [7,18]. Sometimes an abnormal change in the characteristics of re-quest traffic may also cause service overloading. For example, anCopyright is held by the International World Wide Web Conference Com-mittee (IW3C2). Distribution of these papers is limited to classroom use,and personal use by others.WWW 2006, May 23–26, 2006, Edinburgh, Scotland.ACM 1-59593-323-9/06/0005.unexpected increase in the percentage of long requests or heavy re-quests that require a large amount of processing time can cause theslowness of the overall system performance. This is because longrequests can take over most of the system resource even when therequest rate is relatively low.Previous work has been using admission control [16, 9, 31, 34,36, 6] and adaptive service degradation [3, 10, 36] to curb responsetime spikes during overload. Admission control improves the re-sponse time of admitted client requests by rejecting a subset ofclients. Admission control techniques include using performancefeedbacks [6, 36], bounding the incoming request queue length [16,31], and policing TCP SYN packets [34].In this paper, we explore a new load shedding mechanism thatmonitors request resource usage and terminates overdue long re-quests after they have been admitted to the system. Since a web sitetypically has an expectation to deliver results within a soft dead-line, long requests passing the deadline often have minimal valuefor end users. In addition, an excessive accumulation of long re-quests in a server can significantly reduce the success chance ofother short requests completed within a deadline. By terminatingselected overdue long requests, the system can accommodate moreshort requests during load peaks and thus increase the throughput.Certainly not every request can be terminated safely. In this pa-per we are targeting a class of requests in which the terminationof an individual request does not affect other requests. Examplesof such applications include read-only index matching service insearch engines. Our scheme called SERT monitors running timesand resource usage of requests, adaptively adjusts the terminationthreshold, and aborts selected long requests safely.We have implemented our scheme in the Neptune clustering mid-dleware for network services [31, 29]. An application can link ourlibrary with little or no code modifications in deploying selectiveearly termination of requests. The rest of the paper is organizedas follows. Section 2 discusses the background on multi-threadedInternet services and motivates this work with a micro-benchmark.Section 3 gives the design considerations, architecture, and API ofSERT. Section 4 describes our implementation. Section 5 evalu-ates SERT with application benchmarks from Ask.com. Section 6summarizes related work. Finally, Section 7 concludes the paper.2. BACKGROUND2.1 Cluster-based Internet Services and Con-currency ControlAn Internet service cluster hosts applications handling concur-rent client requests on a set of machines connected by a high speednetwork. A number of earlier studies have addressed providingmiddleware-level support for service clustering, load balancing, andQueryc a c h eD o cs erv erQueryf ro n t en dQueryf ro n t en dQueryf ro n t en dQueryc a c h eQueryc a c h eT i er- 1i n d exT i er- 1i n d exT i er- 1i n d exD o cs erv erD o cs erv erE x t ern a ls ea rc hq ueri esT i er- 2i n d exP a rt i t i o n 1T i er- 2i n d exP a rt i t i o n 1T i er- 2i n d exP a rt i t i o n 1T i er- 2i n d exP a rt i t i o n 2T i er- 2i n d exP a rt i t i o n 2T i er- 2i n d exP a rt i t i o n 2Figure 1: A three-tier keyword-based document search service.availability management [11, 31, 35, 24]. In these systems, a ser-vice component can invoke RPC-like service calls or obtain com-munication channels to other components in the cluster. A complexInternet service cluster often has multiple tiers and service compo-nents depend on each other through service calls.For example, Figure 1 shows the architecture of a three-tier searchengine. This service contains five components: query handlingfront-ends, result cache servers, tier-1 index servers, tier-2 indexservers, and document servers. When a request arrives, one of thequery front-ends parses this request and then contacts the querycaches to see if the query result is already cached. If not, indexservers are accessed to search for matching documents. Note thatindex servers are divided into two tiers. Search


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3 out of 10 pages.

Please select your school