Unformatted text preview:

Networks: Lecture 76.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search Daron Acemoglu and Asu Ozdaglar MIT September 30, 2009 1Networks: Lecture 7 Outline Navigation (or decentralized search) in networks Web search Hubs and authorities: HITS algorithm PageRank algorithm Reading: EK, Chapter 20.3-20.6 (navigation) EK, Chapter 14 (web search) (Optional reading:) “The Small-World Phenomenon: An Algorithmic Perspective,” by Jon Kleinberg, in Proc. of ACM Symposium on Theory of Computing, pp./ 163-170, 2000. 2Networks: Lecture 7 Introduction We have studied random graph models of network structure. Static random graph models (Erd¨os-Renyi model, configuration model, small-world model) Dynamic random graph models (preferential attachment model) In the next two lectures, we will study processes taking place on networks. In particular, we will focus on: Navigation or decentralized search in networks Web search: ranking web pages Spread of epidemics 3Networks: Lecture 7 Decentralized Search Recall Milgram’s small-world experiment, where the goal was to find short chains of acquaintances (short paths) linking arbitrary pairs of people in the US. A source person in Nebraska is asked to deliver a letter to a target person in Massachusetts. This will be done through a chain where each person forwards the letter to someone he knows on a first-name basis. Over many trials, the average number of intermediate steps in successful chains was found to lie between 5 and 6, leading to six degrees of separation principle. Milgram’s experiment has two fundamentally surprising discoveries. First is that such short paths exist in networks of acquaintances. The small-world model proposed by Watts and Strogatz (WS) was aimed at capturing two fundamental properties of networks: short paths and high clustering. 4Networks: Lecture 7 Decentralized Search Second, people are able to find the short paths to the designated target with only local information about the network. If everybody knows the global network structure or if we can “flood the network” (i.e., everyone will send the letter to all their friends), we would be able to find the short paths efficiently. With local information, even if the social network has short paths, it is not clear that such decentralized search will be able to find them efficiently. a g j l m n o i h f e d c bp Image by MIT OpenCourseWare. Adapted from Easley, David, and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning about a Highly Connected World. New York, NY: Cambridge University Press, 2010. ISBN: 9780521195331. Figure: In myopic search, the current message-holder chooses the contact that is closest to the target and forwards the message to it. 5Networks: Lecture 7 A Model for Decentralized Search—1 Kleinberg introduces a simple framework that encapsulates the paradigm of WS – rich in local connections with a few long range links. The starting point is an n × n two-dimensional grid with directed edges (instead of an undirected ring). The nodes are identified with the lattice points, i.e., a node v is identified with the lattice point (i, j) with i, j ∈ {1, . . . , n}. For any two nodes v and w, we define the distance between them d(v, w ) as the number of grid steps between them, d((i, j), (k, l)) = |k − i| + |l − j|. Each node is connected to its 4 local neighbors directly – his local contacts. Image by MIT OpenCourseWare. Adapted from Easley, David, and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning about a Highly Connected World. New York, NY: Cambridge University Press, 2010. ISBN: 9780521195331. Each node also has a random edge to another node – his long range contact. Figure: A grid network with n = 6 and the contacts of a node u. 6Networks: Lecture 7 A Model for Decentralized Search—2 The model has a parameter that controls the “scales spanned by the long-range weak ties.” The random edge is generated in a way that decays with distance, controlled by a clustering exponent α: In generating a random edge out of v, we have this edge link to w with probability proportional to d(v, w )−α . When α = 0, we have the uniform distribution over long-range contacts – the distribution used in the model of WS. As α increases, the long-range contact of a node becomes more clustered in its vicinity on the grid. Image by MIT OpenCourseWare. Adapted from Easley, David, and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning about a Highly Connected World. New York, NY: Cambridge University Press, 2010. ISBN: 9780521195331. Figure: (left) Small clustering exponent, (right) large clustering exponent 7Networks: Lecture 7 Decentralized Search in this Model We evaluate different search procedures according to their expected delivery time– the expected number of steps required to reach the target (where the expectation is over a randomly generated set of long-range contacts and randomly chosen starting and target nodes). Given this set-up, we will prove that decentralized search in WS model will necessarily require a large number of steps to reach the target (much larger than the true length of the shortest path). As a model, WS network is effective in capturing clustering and existence of short paths, but not the ability of people to actually find those paths. The problem here is that the weak ties that make the world small are “too random” in this model. The parameter α captures a tradeoff between how uniform the random links are. Question: Is there an optimal α (or network structure) that allows for rapid decentralized search? 8Networks: Lecture 7 Efficiency of Decentralized Search Theorem (Kleinberg 2000) Assume that each node only knows his set of local contacts, the location of his long-range contact, and the location of the target (crucially, he does not know the long-range contacts of the subsequent nodes). Then: (a) For 0 ≤ α < 2, the expected delivery time of any decentralized algorithm is at least βαn(2−α)/3 for some constant βα. (b) For α = 2, there is a decentralized algorithm so that the expected delivery time is at most β2(log ( n))2 for some constant β2. (c) For 2 < α < 3, the expected delivery time of any decentralized algorithm is at least βαn(α−2) for some constant βα. Decentralized search with α = 2 requires time that grows like a polynomial in log(n), while any other exponent requires time that grows like a


View Full Document

MIT 6 207 - Search on Networks

Download Search on Networks
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Search on Networks and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Search on Networks 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?