DOC PREVIEW
CORNELL CS 614 - Epidemic Techniques

This preview shows page 1-2-16-17-18-34-35 out of 35 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 35 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Epidemic TechniquesMilo PolteSummary of First PaperEpidemic Algorithms For Replicated Database Maintenance (Demers et al. Proc. of the Sixth ACM Symp. on Principles of Distributed Computing, August 1987)Presents randomized, epidemic algorithms for distributing updates in a replicated database to approach consistencyAnalyses performance of two random epidemic algorithms (anti-entropy and rumor mongering)Implements algorithms in simulation and on Xerox Corporate Internet to measure rate of database consistency and network trafficEmphasizes importance of spatial distributions for efficiencySummary of Second Paper• Astrolabe: A Robust and Scalable Technology For Distributed System Monitoring, Management, and Data Mining (Van Renesse et al.! ACM TOCS, May 2003)– Describes the distributed hierarchical database system, Astrolabe– Uses epidemic techniques to efficiently propagate through the hierarchy and achieve consistency– Presents an SQL-like language for complicated aggregation of data– Incorporates a certificate authority based security modelProblem:• How do we replicate a database across many sites while maintaining consistency?– Many different hosts may have write access to the database– Underlying network is unreliable– We want to avoid unnecessary network trafficTwo unsuccessful approaches:• Each host responsible for propagating their updates directly to all other hosts+ Updates propagated immediately+ No redundant messages sent- Each host must know full membership -- Difficult with churn- Messages may be lost- May saturate critical links- Forces updating node to make O(n) connections• Use primary site update + Simplifies update distribution - Single point of failure/BottleneckAn alternative approach:Use peer-to-peer randomized algorithms to disseminate updates in the network like an epidemic+ Does not require full knowledge of network at any single host+ Works well with unreliable message delivery + Updates spread rapidly as more sites become “infected”- Harder to achieve consistency with randomized algorithm- Reoccurring question: How do we avoid generating tremendous network traffic?1. Direct mail - Each host sends all updates to every other host. Has same pros/cons of the first unsuccessful approach. Not epidemic.2. Anti-entropy - Sites periodically contact other sites and reconcile database with them.3. Rumor mongering - When a site encounters a new update, it begins to gossip it to random sites until the rumor becomes “cold” by some measurement (e.g. many sites contacted already knew rumor).Epidemic MethodsThe first paper describes three techniques for update propagation:Anti-EntropySites pick random partner and exchange database content and resolve differencesOperations referred to as “push”, “pull”, or “push-pull” depending on which direction updates flowExpected time for update to propagate to n hosts using push is logarithmic: log2(n) + ln(n) + c (Pittel, 87)push seems to be used more in practice (e.g. USENET) but pull will propagate updates more rapidly in settings where only a few sites initially do not have the updateTo keep deleted entries from re-propagating through the network, Death Certificates must be distributed and storedCompare TrafficA naive anti-entropy algorithm exchanges entire databases to find differences, generating a prohibitive amount of “compare traffic”Solutions:1. Checksums - still exchanges entire databases when checksums differ)2. Maintaining window of recent updates which are always exchanged. - use checksums to compare databases after applying recent updates- Sensitive to choice of window size3. Exchange updates in reverse chronological order until checksums agree4. Other possibilities include recursive, hierarchical checksums of database or version vectors (e.g. Bayou)Rumor Mongering (Complex Epidemics)A node that hears about an update considers it a “hot rumor” Nodes spread hot rumors to other random nodes At some point nodes consider rumor cold and stop spreading itProblem: Not all nodes may have heard rumor by the time it is considered cold.Can be backed up with anti-entropy to achieve eventual consistencyDeciding When to StopWe want to design an epidemic which minimizes:1. Residue, the ratio of nodes susceptible at the end of the epidemic2. Traffic 3. Delay until most sites know the rumorThe first and third of these desires are in conflict with the secondTwo different stopping policies compared:Simulations on 1000 nodesk = 1234500.30.50.81.0Losing interest after contacting k recipients who already knew the rumorLosing interest with probability 1/k after ever cycle Residuek = 1234501.83.55.37.0TrafficPulling RumorsIn a system with enough update traffic, it might be worthwhile to pull rumors instead for lower residue:k = 12300.050.100.150.20Pushing rumorsPulling rumorsk = 12301.633.254.886.50TrafficResidueMotivation for Spatial AwarenessClearinghouse name serviceA translation database replicated on hundreds of servers on the Xerox Corporate Internet, a world wide network of thousands of hostsRelied on anti-entropy with uniform host selection and direct mail to propagate updatesFound direct mailing was flooding the network but...Even without direct mailing, anti-entropy would saturate key linksSpatial DistributionsToo much randomness seems unwise. We want nodes to infect nodes nearby them.Uniform selection of gossiping partners undesirable. Critical links in the network will face large traffic.In CIN, key transatlantic links would have 80 conversations/round compared to the link average of 6 conversations/roundIncorporating DistanceSites select gossiping partners with probability determined by the distance rank of the nodes and a parameter a.Uniforma = 1.21.41.61.82.0020406080Compare Traffic - AverageUpdate Traffic - AverageCompare Traffic - TransatlanticUpdate Traffic - TranslatlanticConversationsIncorporating Distance (cont.)Sites select gossiping partners with probability determined by the distance rank of the nodes and a parameter a with connection limit of 1:Uniforma = 1.21.41.61.82.0020406080Compare Traffic - AverageUpdate Traffic - AverageCompare Traffic - TransatlanticUpdate Traffic - TranslatlanticConversationsIncorporating Distance (cont.)While it seems that spatial information is critical for network load balancing, it does mean consistency takes longer to reach outer nodes:Uniforma = 1.21.41.61.82.006.312.518.825.0t_last, no connection limitt_last, connection limit


View Full Document

CORNELL CS 614 - Epidemic Techniques

Documents in this Course
Load more
Download Epidemic Techniques
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Epidemic Techniques and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Epidemic Techniques 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?