USF CS 682 - Distributed Hash Tables - D375926

Home> Schools> University of San Francisco> Computer Science (CS) > CS 682> Distributed Hash Tables

DOC PREVIEW

USF CS 682 - Distributed Hash Tables

School name University of San Francisco

Course Cs 682- Distributed Software Development

Pages 32

This preview shows page 1-2-15-16-31-32 out of 32 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 32 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 32 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 32 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 32 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 32 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 32 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 32 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Network Characteristics Network Characteristics Network Characteristics Network Characteristics Applications Structured P2P networks Distributed Hash Tables DHT - design goals Hash tables - review Consistent Hashing Chord Chord nodes Example Simple Routing Actual Chord Routing Example Routing Routing Routing Node Joining Node Joining Node Joining Node joining Example Node Joining Stabilizing Stabilizing Failure Discussion Summary Next TimeDistributed SoftwareDevelopmentDistributed Hash TablesChris BrooksDepartment of Computer ScienceUniversity of San FranciscoDepartment of Computer Science — University of San Francisco – p. 1/??Network Characteristics•What are some qualities of unstructured P2P networks?Department of Computer Science — University of San Francisco – p. 2/??Network Characteristics•What are some qualities of unstructured P2P networks?•Peers can connect to any o ther pee r•Peers can choose what data to store•No fixed peer addressing scheme•Peers can easily come and g oDepartment of Computer Science — University of San Francisco – p. 3/??Network Characteristics•What are the implications of this?Department of Computer Science — University of San Francisco – p. 4/??Network Characteristics•What are the implications of this?•Search may be slow or incomplete•Data may not always be available•Data may be difficult to locate•It may be difficult to find a particular peerDepartment of Computer Science — University of San Francisco – p. 5/??Applications•What sorts of applications are a good fit for this sort ofnetwork?•What sorts of applications are a poor fit?Department of Computer Science — University of San Francisco – p. 6/??Structured P2P networks•So far, we’ve focused on P2P apps in which users may jointhe network at any point, and may store whatever data theywant.•Network has no particular topology.•No guarantees that all nodes in the network can reacheach other.•These are referred to as unstructured P2P networks.•We can also talk about structured P2P networks•Topolog y, number of connections, file placement is fi xed.Department of Computer Science — University of San Francisco – p. 7/??Distributed Hash Tables•The most common structured P2P application is thedistributed hash table•We can build a file system or storage scheme o n top ofthis.•Supports a single operation: map key s to node IDs.•Key is a unique data identifier.Department of Computer Science — University of San Francisco – p. 8/??DHT - design goals•We would like for DHTs to have the fo llowing prope rties:•Data can always be found if it is in the network.•Each node only has to know about a small number ofother nodes.•Number o f messages needed to find data is small•Can tolerate crash failure•Can handle nodes entering and leavingDepartment of Computer Science — University of San Francisco – p. 9/??Hash tables - review•Hash tables provide a mapping from keys to values.•They do this by means of a hash function•For example, modulo can be used as a hash function•Good hash functions distribute hashed values evenly.•Problem: What if we add buckets to our hash table?•Might need to re-hash all our data!•How can we av oid this?Department of Computer Science — University of San Francisco – p. 10/??Consistent Hashing•Consistent hashing is a particular type of hash function.•It has the property that when the numbe r of buckets (ornodes in the network) is changed, at most O(lgN ) itemswill need to be re-hashed.•This means that, in a distributed hash table, we canefficiently add or subtract nodes without large-scaleupdating.Department of Computer Science — University of San Francisco – p. 11/??Chord•Our discussion of DHTs will f ocus on Chord.•Developed at MIT•One of four DHT schemes (Can, Pastry, Tapestry arethe others) developed co ntemperaneously.•Designed to serve as the basis of a cooperative file system.Department of Computer Science — University of San Francisco – p. 12/??Chord nodes•Each Chord node begins with an m bit identifer obtained byhashing its IP address.•Nodes are arranged in a circle modulo 2m.•Data is hashed with the same function (SHA-1) to producea key•A node is responsible for storing all data whose key isbetween its identifier and its successor’s identifier.Department of Computer Science — University of San Francisco – p. 13/??Example•Three nodes in thisexample, with identifiers0,1,3.•0 is responsible for datathat hashes to 0.•1 is responsible for datathat hashes to 1 or 2.•3 is responsible for datathat hashes to 3-7 .Department of Computer Science — University of San Francisco – p. 14/??Simple Routing•The simplest way to routing in Chord is as follows:•Each node keeps track of one other node: its successor.•To find a piece of data with key k, if a node does nothave k, it asks its successor.•Eventually, all nodes in the ring are queried.•Very simple, but requires N messages.Department of Computer Science — University of San Francisco – p. 15/??Actual Chord Routing•What Chord really does is have each node keep a fingertable•m entries in the table.•For node n, the ith entry is the first node that succeeds nby 2i−1modulo n.•Contains the Chord identifier and the actual IP address ofthe node.Department of Computer Science — University of San Francisco – p. 16/??Example01234567n +2i 1235s uccessor330n+2i1457s uccessor000n+2i 1s uccessor130124•In this case, each fingertable has three entries.•Indexes the nodes storingdata 1,2 and 4 keys away.•Each node only needs tocontain O(logN) entries.Department of Computer Science — University of San Francisco – p. 17/??Routing•To find a piece of data, a node computes the data’s hash,yielding its key.•Three possibilities:•The node is storing the data itself.•The key is in the seeking node’s finger table.•The key is not in the seeking node’s finger table.•If the key is not in the seeking node’s finge r table, weexecute find-predecessor.Department of Computer Science — University of San Francisco – p. 18/??Routing•Find-predecessor is meant to find the node that is beforethe data we are trying to find.•In the example, the predecessor of 5 is 3.•Each node sends a message to the node in its finger tablewho is closest to the data being so ug ht without going over.•If a message is received by a node n to find data with key k,and

View Full Document