Berkeley COMPSCI 294 - Lecture Notes - D2883942

Home> Schools> University of California, Berkeley> Computer Science (COMPSCI) > COMPSCI 294> Lecture Notes

DOC PREVIEW

Berkeley COMPSCI 294 - Lecture Notes

School name University of California, Berkeley

Course Compsci 294- Special Topics

Pages 9

This preview shows page 1-2-3 out of 9 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 9 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 9 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 9 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 9 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Efficient Search - OverviewImproving Search In Peer-to-Peer SystemsPresented By Jon Hesscs294-4 Fall 2003Efficient Search - Overview• Goals– Present alternative searching methods for systems with loose integrity constraints• Probabilistically optimize over loose constraints • Must not sacrifice user satisfaction– Measure hypothesized methods– Analyze the measurementsEfficient Search - Overview• Current peer-to-peer systems – Offer efficient search, but• Restrict data placement• Restrict search breadth– Provide data consistency – Provide data availabilityEfficient Search - Overview• Current file-sharing systems do not require– Top notch data availability or consistency– Search by identifier• So they do not need– Restrictive data, and pointer, placement• So they may– Make probabilistic optimizationsEfficient Search - Overview• Problem Overview– System should • Accept a query • Respond with a set of pointers to files– System is composed of an overlay of peers• Upon receiving a query, a peer – Processes the query– Responds with matches– Forwards query to neighborsEfficient Search - Overview• Cost Measurement– Processing Cost• Deciding which neighbors to route to• Checking local metadata for query matches.– Bandwidth Cost• Forwarding messages• Responding to queries– Calculate over a set of queries – Measure the aggregate network costEfficient Search - Overview• Quality Measurement– Size of result set• Maximum of the fraction of number of results with relation to some goal Z– Min ( Results / Z , 1.0 )• Latency from query to Zth, or final result– Min ( time( Z ) , time( final ) )Efficient Search - Methods• Current Technique– Gnutella - breadth first flood• Efficiencies– Maximizes number results– Minimizes latency• Inefficiencies– The same node may process the same query twice– Many more than Z results may be achievedEfficient Search - Methods• Current Technique– Freenet – serial depth first search• Efficiencies– Achieves exactly Z matches• Inefficiencies– Maximum Latency– No parallel searchingEfficient Search - Methods• Suggestions– Iterative deepening• Try not to locate more than Z matches– Directed breadth first flood• Don’t traverse wasteful links– Nearest neighbor • Amortize processing cost across memory and neighbors• Does not need to make final hop[s] to search furthest neighborsEfficient Search - Methods• Iterative Deepening– Successively send deeper reaching queries until Z is satisfied– Policy: set of depths {d0 … dn} and a time t– Send query to depth di • Nodes at depth difreezes query after processing• If Z is satisfied stop• If not, send a re-query message to nodes at diinstructing them to forward the query to di+1 Efficient Search - Methods• Iterative Deepening – {0, 2, 4, 5}, 3Holding Frozen Query Already Processed Query Unaware of QueryEfficient Search - Methods• Iterative Deepening – {0, 2, 4, 5}, 3Holding Frozen Query Already Processed Query Unaware of QueryEfficient Search - Methods• Iterative Deepening – {0, 2, 4, 5}, 3Holding Frozen Query Already Processed Query Unaware of QueryEfficient Search - Methods• Iterative Deepening – {0, 2, 4, 5}, 3Holding Frozen Query Already Processed Query Unaware of QueryEfficient Search - Methods• Directed Breadth First Search– Source Only sends queries to good neighbors– Good neighbors might have• Produced results in the past• Low latency• Lowest hop count for results– They have good neighbors• Highest traffic neighbors– They’re stable• Shortest message queue– Routed as normal BFS after first hopEfficient Search - Methods• Directed Breadth First SearchEfficient Search - Methods• Directed Breadth First SearchEfficient Search - Methods• Directed Breadth First SearchEfficient Search - Methods• Directed Breadth First SearchEfficient Search - Methods• Directed Breadth First SearchEfficient Search - Methods• Neighbor Index– Policy: Set of query depths {d0, …, dn} and index depth r.– Each node holds an index of neighbor’s metadata within n hops– Nodes propagate local metadata upon joining– Only nodes at depths diprocess queries• Other nodes simply forward until dnEfficient Search - Methods• Neighbor Index – {0, 3, 5}, 1Efficient Search - Methods• Neighbor Index – {0, 3, 5}, 1Efficient Search - Methods• Neighbor Index – {0, 3, 5}, 1Efficient Search - Methods• Neighbor Index – {0, 3, 5}, 1Efficient Search - Methods• Neighbor Index – {0, 3, 5}, 1Efficient Search - Methods• Neighbor Index – {0, 3, 5}, 1 Processor For Query Forwarder of Query Unaware of QueryEfficient Search - Experiment• Data Collection– Inject a custom client into the Gnutella network for 1 month– Collect statistics• Query relationships• Average files per user• Average length of messages• Many other statistics– Calculate the expected cost of the proposed methods from the harvested dataEfficient Search - Experiment• Iterative Deepening & Neighbor Index– For each Query q and Depth d• Send Q with TTL d– See how many matches nodes at each depth possess• Directed Breadth First Flood– Queries are sent to each neighbor once at a time, with a different query ID.Efficient Search - Results• Iterative deepening– Study the policies•P1 = {1, 2, 3, 4, 5, 6, 7}•P2 = {2, 3, 4, 5, 6, 7}•…•P6 ={6, 7}•P7 ={7}• For W = {1, 2, 4, 6, 150}Efficient Search - ResultsAs freeze time decreases, or policy number increases, Z is overshot.As freeze time decreases or policy number increases, we achieve Z faster.If all nodes used this policy would average satisfaction time decrease?Efficient Search - ResultsShortest average time to satisfactionobviously wins. Surprisingly, the greatest number of results method does very well.Greatest number of results, and shortest time to satisfaction do the best on the other benchmarks, but cost the most – obviously, results require bandwidth. This cost is still less than BFS.Efficient Search - ResultsQuery radius has exponential relationship to bandwidth cost.Only a 5MB Index for peers 5 hops away.If all nodes used this policy would average satisfaction time decrease?Efficient Search -

View Full Document

Berkeley COMPSCI 294 - Lecture Notes

Sign up for free to view:

This document and 3 million+ documents and flashcards
High quality study guides, lecture notes, practice exams
Course Packets handpicked by editors offering a comprehensive review of your courses
Better Grades Guaranteed


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3 out of 9 pages.

Berkeley COMPSCI 294 - Lecture Notes

Sign up for free to view:

Please select your school