CMU CS 15744 - Lecture - D289561

Home> Schools> Carnegie Mellon University> Computer Science (CS) > CS 15744> Lecture

DOC PREVIEW

CMU CS 15744 - Lecture

School name Carnegie Mellon University

Course Cs 15744- Computer Networks

Pages 13

This preview shows page 1-2-3-4 out of 13 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 13 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

1 15-744: Computer Networking L-16 Naming 2 Overview • P2P Lookup Overview • Centralized/Flooded Lookups • Routed Lookups – Chord • Comparison of DHTs 3 Peer-to-Peer Networks • Typically each member stores/provides access to content • Has quickly grown in popularity • Bulk of traffic from/to CMU is P2P! • Basically a replication system for files • Always a tradeoff between possible location of files and searching difficulty • Peer-to-peer allow files to be anywhere  searching is the challenge • Dynamic member list makes it more difficult • What other systems have similar goals? • Routing, DNS 4 The Lookup Problem Internet N1 N2 N3 N6 N5 N4 Publisher Key=“title” Value=MP3 data… Client Lookup(“title”) ?2 5 Centralized Lookup (Napster) Publisher@ Client Lookup(“title”) N6 N9 N7 DB N8 N3 N2 N1 SetLoc(“title”, N4) Simple, but O(N) state and a single point of failure Key=“title” Value=MP3 data… N4 6 Flooded Queries (Gnutella) N4 Publisher@ Client N6 N9 N7 N8 N3 N2 N1 Robust, but worst case O(N) messages per lookup Key=“title” Value=MP3 data… Lookup(“title”) 7 Routed Queries (Chord, etc.) N4 Publisher Client N6 N9 N7 N8 N3 N2 N1 Lookup(“title”) Key=“title” Value=MP3 data… 8 Overview • P2P Lookup Overview • Centralized/Flooded Lookups • Routed Lookups – Chord • Comparison of DHTs3 9 Centralized: Napster • Simple centralized scheme  motivated by ability to sell/control • How to find a file: •On startup, client contacts central server and reports list of files •Query the index system  return a machine that stores the required file •Ideally this is the closest/least-loaded machine •Fetch the file directly from peer 10 Centralized: Napster • Advantages: •Simple •Easy to implement sophisticated search engines on top of the index system • Disadvantages: •Robustness, scalability •Easy to sue! 11 Flooding: Old Gnutella • On startup, client contacts any servent (server + client) in network • Servent interconnection used to forward control (queries, hits, etc) • Idea: broadcast the request • How to find a file: • Send request to all neighbors • Neighbors recursively forward the request • Eventually a machine that has the file receives the request, and it sends back the answer • Transfers are done with HTTP between peers 12 Flooding: Old Gnutella • Advantages: • Totally decentralized, highly robust • Disadvantages: • Not scalable; the entire network can be swamped with request (to alleviate this problem, each request has a TTL) • Especially hard on slow clients • At some point broadcast traffic on Gnutella exceeded 56kbps – what happened? • Modem users were effectively cut off!4 13 Flooding: Old Gnutella Details • Basic message header • Unique ID, TTL, Hops • Message types • Ping – probes network for other servents • Pong – response to ping, contains IP addr, # of files, # of Kbytes shared • Query – search criteria + speed requirement of servent • QueryHit – successful response to Query, contains addr + port to transfer from, speed of servent, number of hits, hit results, servent ID • Push – request to servent ID to initiate connection, used to traverse firewalls • Ping, Queries are flooded • QueryHit, Pong, Push reverse path of previous message 14 Flooding: Old Gnutella Example Assume: m1’s neighbors are m2 and m3; m3’s neighbors are m4 and m5;… A B C D E F m1 m2 m3 m4 m5 m6 E? E? E? E? E E E 15 Flooding: Gnutella, Kazaa • Modifies the Gnutella protocol into two-level hierarchy • Hybrid of Gnutella and Napster • Supernodes • Nodes that have better connection to Internet • Act as temporary indexing servers for other nodes • Help improve the stability of the network • Standard nodes • Connect to supernodes and report list of files • Allows slower nodes to participate • Search • Broadcast (Gnutella-style) search across supernodes • Disadvantages • Kept a centralized registration allowed for law suits  16 Overview • P2P Lookup Overview • Centralized/Flooded Lookups • Routed Lookups – Chord • Comparison of DHTs5 17 Routing: Structured Approaches • Goal: make sure that an item (file) identified is always found in a reasonable # of steps • Abstraction: a distributed hash-table (DHT) data structure • insert(id, item); • item = query(id); • Note: item can be anything: a data object, document, file, pointer to a file… • Proposals • CAN (ICIR/Berkeley) • Chord (MIT/Berkeley) • Pastry (Rice) • Tapestry (Berkeley) • … 18 Routing: Chord •Associate to each node and item a unique id in an uni-dimensional space •Properties •Routing table size O(log(N)) , where N is the total number of nodes •Guarantees that a file is found in O(log(N)) steps 19 Aside: Hashing • Advantages • Let nodes be numbered 1..m • Client uses a good hash function to map a URL to 1..m • Say hash (url) = x, so, client fetches content from node x • No duplication – not being fault tolerant. • One hop access • Any problems? • What happens if a node goes down? • What happens if a node comes back up? • What if different nodes have different views? 20 Robust hashing • Let 90 documents, node 1..9, node 10 which was dead is alive again • % of documents in the wrong node? • 10, 19-20, 28-30, 37-40, 46-50, 55-60, 64-70, 73-80, 82-90 • Disruption coefficient =  • Unacceptable, use consistent hashing – idea behind Akamai!6 21 Consistent Hash • “view” = subset of all hash buckets that are visible • Desired features • Balanced – in any one view, load is equal across buckets • Smoothness – little impact on hash bucket contents when buckets are added/removed • Spread – small set of hash buckets that may hold an object regardless of views • Load – across all views # of objects assigned to hash bucket is small 22 Consistent Hash – Example • Smoothness  addition of bucket does not cause much movement between existing buckets • Spread & Load  small set of buckets that lie near object • Balance  no

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-4 out of 13 pages.

CMU CS 15744 - Lecture

Sign up for free to view:

Please select your school