CIS 700/003: Distributed Systems tS ilN t kmeet Social NetworksDecentralized systemsyFebruary 2, 2010© 2010 Andreas Haeberlen1University of PennsylvaniaLitiLogisticsTimeline for the project: Feb 9: Project proposal due Mar 4: WiP Mar 6: Status report dueApr 22: Inclass presentationApr 22: In-class presentation Apr 27: Final report due This week would be a good time tochoose a project topic, or see me for a list of suggestions;choose a project topic, or see me for a list of suggestions; meet with me and/or your advisor and get feedback on your project topic (or survey topic);df fl© 2010 Andreas Haeberlen write a draft of your one-page proposal2University of PennsylvaniaWh t i d F b 9?What is due on Feb 9?A short paper that describes:A short paper that describes:Th bl i lTh bl h dResearch project option: Survey paper option:•The problem you are trying to solve• The approach you are planning to take• How you intend to evaluate your solution•At least two papers related to the problem•The problem or research area coveredby your survey• A set of at least three papers you planto surveyFormat: Two columns 10point font single•At least two papers related to the problem• Your proposed milestones• Reasons why your choice of papers isappropriate for this topicFormat: Two columns, 10-point font, single-spaced, up to one pageSb fl b lSubmission: As a PDF file, by email Deadline: 12 noon on Feb 9© 2010 Andreas Haeberlen3University of Pennsylvaniahttp://www.cis.upenn.edu/~ahae/teaching/cis700-s10/project.htmlWh ?Where are we?MtdlifMADtMeasurement and analysis of MAD systems Building MAD systems Decentralization NegotiationMitiTodayMonitoring Faults and misbehavior Internet crime Privacy and confidentialityyy Novel opportunitiesExperience© 2010 Andreas HaeberlenExperience4University of PennsylvaniaMdi i itMy discussion pointsWhy not just use IP multicast? Akamai?Why not just use IP multicast? Akamai? Examples of 'cooperative environments'? What about freeloaders? Is this a well-written paper? Why, or why not?pp y, y Is the evaluation convincing? Why, or why not?Would you watch the Superbowl over this?Would you watch the Superbowl over this? Is all that complexity necessary?Achieved load balanceAchieved load balance© 2010 Andreas Haeberlen5University of PennsylvaniaSfdiiitSome of your discussion pointsSpare capacity group is uglySpare capacity group is ugly Evaluation: Good? Bad?Feasibility of a tree decidable up front?Feasibility of a tree decidable up front? Correctness and complexity sectionCitthtl?Comparison to other protocols? Comparison to BitTorrentItfbldtImpact of unbalanced trees Impact of queueing delay, losses, cross traffic© 2010 Andreas Haeberlen6University of PennsylvaniaWh t IP lti t CDN?Why not IP multicast or a CDN? IP multicast Would be very efficient; low link stress Widely supported by routers today But: Typically not enabled! Why?Often blamed on difficulty of charging for multicast trafficOften blamed on difficulty of charging for multicast trafficCommercial CDNCommercial CDN Widely available todayCan provide excellent quality of serviceCan provide excellent quality of service But: CDN charges for delivery Difficult to afford for 'grassroots' services, non-commercial © 2010 Andreas Haeberlencontent, ...7University of PennsylvaniaWh t i i l b t thi ?What is special about this?Splitstream is a decentralized systemdecentralizedSplitstream is a decentralized system, specifically a structured P2P overlay networkdecentralizedstructuredP2Poverlay© 2010 Andreas Haeberlen8University of PennsylvaniaAt1DtlitiAspect 1: DecentralizationCentralized versus decentralizedCentralized versus decentralized Pro: Centralized is technically simpler Con: Central domain must provide resources for everyone; © 2010 Andreas Haeberlenpy;harder to scale; single point of failure9University of PennsylvaniaAt2PtAspect 2: Peer-to-peerNetwork with InfrastructurePeer-to-Peer network Peer-to-peer systems are highly symmetricIn a true peertopeer system:In a true peer-to-peer system: Nodes have similar capabilities Any node can assume any functionGood forself-organization!Good for fairness!© 2010 Andreas Haeberlenyy No node is more important than any other10University of PennsylvaniaGood for resilience!self-organization!At3Ol tkAspect 3: Overlay networkNew functionality isNew functionality is hard to deploy: MulticastMulticast Pub/sub networkEnhanced routingEnhanced routing Solution: Build an overlay networkusingoverlay networkusing connectivity from an existing network Works without support from the core network© 2010 Andreas Haeberlen11University of Pennsylvania Robustness, scalability?At4SttProblem: Finding things isAspect 4: StructureProblem: Finding things is expensive & unreliable (need to flood)Node(need to flood) Idea: Add structureGive unique identifiers to Unstructured (e.g. Gnutella)ObjectGeuquedetestoboth nodes and objects Store objects on the node(s) with the closest identifier(s)8934with the closest identifier(s) Result: Perfect recall for simple key-34425139Perfect recall for simple keybased queries (common) More complex queries can bi l d51Identifierspace© 2010 Andreas Haeberlen 12be implemented on topStructured (e.g. Pastry)KbdRtiVery common primitive inKey-based Routing89Very common primitive in structured overlays: key-based routing8934g route(message, key): Delivers message to the live node whose key is425139live node whose key is numerically closest to the message's key51 Implementation in Pastry: O(log N) links per nodeO(l N)hApplicationKBR InterfaceO(log N)hops per msg Efficient; scalable to very large networksRouting substrate(Pastry)© 2010 Andreas Haeberlen 13large networksNetwork (e.g. IP)St i thP2P diSystems using the P2P paradigm Key-based routing systems: Pastry, Chord, ...Storage systems: PAST, Farsite, OceanStore, ...Storage systems: PAST, Farsite, OceanStore, ... Content distribution systems: Coral, ESM, ...Name resolution: CoDNS SFRName resolution: CoDNS, SFR, ... Messaging: ePOST, UsenixDHT, ... Anonymity: Tarzan, AP3, ... Search ...© 2010 Andreas Haeberlen14University of PennsylvaniaCh ll ChChallenge: ChurnSource: Saroiu et al., MMCN 2002Source: Bolosky et al.,
View Full Document