Atul Adya – Google
John Dunagan – Microsoft
Alec Wolman – Microsoft Research

[Diagram: front-end Web servers in front of in-memory application servers]
Incoming request (from device D1): store my current IP = A
  Locate the app server that stores the contact info for D1
  Store D1's IP addr
Incoming request (from device D2): tell me D1's current IP addr
  Locate the app server that stores the contact info for D1
  Read D1's IP addr

Problems:
- How to assign responsibility for items to app servers? (partitioning)
- How to deal with addition, removal, & crashes of app servers?
- How to avoid requests for the same item winding up at different servers? (use leases)
- How to adapt to load changes?

Targets class of services with these characteristics:
- Interactive (needs low latency)
  ▪ App servers operate on in-memory state
- Application tier operates on cached data: the truth is hosted on clients or back-end storage
- Services use many small objects
- Even the most popular object can be handled by one server
  ▪ Replication not needed to handle load

- Prior systems implement leasing and partitioning separately
- We show that integrating leasing and partitioning allows scaling to massive numbers of objects
- This integration requires us to rethink the mechanisms and API for leasing
  ▪ Manager-directed leasing
  ▪ Non-traditional API where clients cannot request leases

- Centrifuge design
- Centrifuge internals
- Results from live deployment

[Diagram: Centrifuge Manager Service between the Lookups and the Owners]
Lookups: front-end Web servers, each with a Lookup library
Owners: middle-tier application servers (in-memory), each with an Owner library

- Need to issue leases for very large # of objects
- Lease per object will lead to prohibitive overhead
- Centrifuge manager hands out leases on ranges
- Use consistent hashing to partition a flat namespace
- Assign leases on contiguous ranges of the hashed namespace
- One lease (one range) per virtual node (64 per server)
- Single mechanism: manager-directed leasing handles both leasing and partitioning

[Diagram: Centrifuge Manager Service and an in-memory server's Owner library. Lease: 0-50, 100-200]

Lookup API
  URL Lookup(Key key)
  void LossNotificationUpcall(KeyRange[] lost)
Owner API
  bool CheckLeaseNow(Key key, out LeaseNum leaseNum)
  bool CheckLeaseContinuous(Key key, LeaseNum leaseNum)

[Diagram: front-end Lookup libraries; servers "m1", "m2", "m6", each with an Owner library]
Incoming request: find device "D"
Lookup("D") -> "http://m6/"
1. CheckLeaseNow("D") -> handle
2. Perform application operation: find D's current IP addr
3. CheckLeaseContinuous("D", handle)

- Servers in datacenter environment are stable
Benefits
- Much cheaper to avoid holding multiple copies in RAM
- Avoids complexity/performance issues of quorum protocols
- Doesn't add extra complexity:
  ▪ Need a mechanism to tolerate correlated failures anyway (e.g. security vulnerabilities, patch installation)
Cost
- When an application server crashes, items are not available until clients republish

- When application server crashes, Lookups receive Loss Notifications
- Indicates which ranges are lost
- Allows the application to determine which clients should republish their state
- Live Mesh services use this model
- Rely on clients to recover state

Partitioning
- Manager spreads namespace across Owners by assigning leases
Consistency
- Leases ensure single-copy guarantee: at any time t, for any key at most one Owner node
Recovery
- Loss notifications enable app developer to detect and recover from Owner crashes
Membership
- Owners indicate liveness by requesting leases
Load Balancing
- Manager rebalances namespace based on reported load

- Centrifuge design
- Centrifuge internals
- Results from live deployment

- Incremental protocol to synchronize Lookup and Manager lease tables
- Lookups are fast: no need to contact Manager and incur delay
- Manager load not dependent on incoming request load to Lookups

[Diagram: lease-table synchronization between a Lookup and the Manager]
Manager: Lease Table, Current LSN: 4, [0-1: Owner=A] [1-2: Owner=B] [2-9: Owner=C], Change Log …
Lookup: Cached Lease Table, Current LSN: 2 …
Lookup to Manager: "I am at LSN 2."
Manager to Lookup: "Here are changes LSN 2->4"

Robustness: Owners have multiple opportunities to retain their leases:
- Leases requested every 15 seconds
- Leases last 60 seconds
- Takes 3 consecutive lost/delayed requests to lose the lease
Safety: owner never thinks it has the lease when the manager disagrees
- Similar to previous lease servers, rely on clock rate synchronization
[Diagram: Owner "Request Leases"; Manager "Leases granted/recalled"]

[Diagram: Manager Service. Labels: Lookups and Owners; Leader; Leader and Standbys; Paxos Group; Standby; Standby. Messages: "Renew leader lease and commit state update." "Yes." "Can I have the leader lease?" "No."]

- Centrifuge designed to run in a single datacenter
- Scalability target: ~1000 machines in 1 cluster
- Beyond there, scale by deploying multiple clusters

- Centrifuge design
- Centrifuge internals
- Results from live deployment

- First deployed in April 2008
- Results cover 2.5 months: Dec '08 – Mar '09
- 1000 Lookups, 130 Owners
- Manager = 8 servers

- Is the Centrifuge manager a scalability bottleneck in steady-state?
- How well does Centrifuge handle high-churn events?
- How stable are production servers?

- From 1/15/09 through 3/2/09, no patch installations
- How stable were the owners during this period?
- Servers are very stable: only 10 lease-loss events
- 7 cases, servers recovered < 10 minutes
- 3 cases, servers recovered < 1 hour

- Centrifuge simplifies building scalable application tiers with in-memory state
- Combining leasing and partitioning leads to a simple and powerful protocol
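
The incremental protocol that keeps a Lookup's cached lease table in step with the Manager ("I am at LSN 2." / "Here are changes LSN 2->4") can be illustrated with a small Python sketch. This is not the Centrifuge implementation. The class and method names (Manager, Lookup, Change, assign, changes_since, sync) are hypothetical, the interim owner "D" is invented for the demo, and the sketch assumes half-open key ranges that are only ever reassigned whole, never split or merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One change-log entry (hypothetical structure)."""
    lsn: int      # log sequence number at which the change took effect
    start: int    # inclusive start of the key range
    end: int      # exclusive end of the key range (assumption: half-open)
    owner: str    # owner that now holds the lease on the range


class Manager:
    """Holds the authoritative change log and the current LSN."""

    def __init__(self):
        self.lsn = 0
        self.log = []

    def assign(self, start, end, owner):
        # Every lease-table change bumps the LSN and is appended to the log.
        self.lsn += 1
        self.log.append(Change(self.lsn, start, end, owner))

    def changes_since(self, lsn):
        # log[i] carries LSN i + 1, so entries newer than `lsn` start at index `lsn`.
        return self.lsn, self.log[lsn:]


class Lookup:
    """Front-end cache of the lease table; lookups are served locally."""

    def __init__(self):
        self.lsn = 0
        self.table = {}  # (start, end) -> owner

    def sync(self, manager):
        # "I am at LSN n." -> "Here are changes LSN n -> m."
        new_lsn, changes = manager.changes_since(self.lsn)
        for change in changes:
            self.table[(change.start, change.end)] = change.owner
        self.lsn = new_lsn
        return len(changes)

    def lookup(self, key):
        # Linear scan for clarity; no call to the Manager is needed.
        for (start, end), owner in self.table.items():
            if start <= key < end:
                return owner
        return None


manager = Manager()
manager.assign(0, 1, "A")
manager.assign(1, 2, "B")

lookup = Lookup()
lookup.sync(manager)            # cache is now at LSN 2

manager.assign(2, 9, "D")       # LSN 3 (hypothetical interim owner)
manager.assign(2, 9, "C")       # LSN 4
applied = lookup.sync(manager)  # fetches only the changes for LSN 2 -> 4
```

Because the Lookup sends only its LSN and receives only the missing log entries, the Manager's work depends on how often the lease table changes, not on how many requests the Lookups serve.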