BU CS 565 - More on Rankings - D2982187

Home> Schools> Boston University> Computer Science (CS) > CS 565> More on Rankings

BU CS 565 - More on Rankings

Course Cs 565- Advanced Java Programming

Pages 98

Download Save

Unformatted text preview:

Slide 1Query-independent LARInDegree algorithmPageRank algorithm [BP98]Markov chainsRandom walksAn exampleState probability vectorAn exampleStationary distributionComputing the stationary distributionThe PageRank random walkThe PageRank random walkThe PageRank random walkThe PageRank random walkEffects of random jumpA PageRank algorithmRandom walks on undirected graphsResearch on PageRankTopic-sensitive pagerankTopic-sensitive PageRankTopic-sensitive PageRankPersonalization vectorTopic-sensitive PageRank: Overall approachTopic-sensitive PageRank: PreprocessingTopic-sensitive PageRank: Query-time processingSlide 27Comparing LAR vectorsDistance between LAR vectorsDistance between LAR vectorsOutlineRank AggregationExamplesExamplesExamplesVariants of the problemCombining scoresCombining scoresCombining scoresCombining scoresTop-kCost functionExampleFagin’s AlgorithmFagin’s AlgorithmFagin’s AlgorithmFagin’s AlgorithmFagin’s AlgorithmFagin’s AlgorithmFagin’s AlgorithmFagin’s AlgorithmFagin’s AlgorithmThreshold algorithmThreshold algorithmThreshold algorithmThreshold algorithmThreshold algorithmThreshold algorithmThreshold algorithmThreshold algorithmCombining rankingsThe problemVoting theoryWhat is a good voting system?Pairwise majority comparisonsPairwise majority comparisonsPairwise majority comparisonsPairwise majority comparisonsPairwise majority comparisonsPairwise majority comparisonsPairwise majority comparisonsPairwise majority comparisonsPlurality votePlurality with runoffPlurality with runoffPositive Association axiomBorda CountBorda CountBorda CountIndependence of Irrelevant AlternativesBorda CountVoting TheoryArrow’s Impossibility TheoremKemeny Optimal AggregationLocally Kemeny optimal aggregationLocally Kemeny optimal aggregationRank Aggregation algorithm [DKNS01]Spearman’s footrule distanceSpearman’s footrule aggregationExampleThe MedRank algorithmThe MedRank algorithmThe MedRank algorithmThe MedRank algorithmThe MedRank algorithmThe Spearman’s rank correlationExtensions and ApplicationsReferencesMore on RankingsQuery-independent LAR•Have an a-priori ordering of the web pages•Q: Set of pages that contain the keywords in the query q•Present the pages in Q ordered according to order π•What are the advantages of such an approach?InDegree algorithm•Rank pages according to in-degree–wi = |B(i)|1. Red Page2. Yellow Page3. Blue Page4. Purple Page5. Green Pagew=1w=1w=2w=3w=2PageRank algorithm [BP98]•Good authorities should be pointed by good authorities•Random walk on the web graph–pick a page at random–with probability 1- α jump to a random page–with probability α follow a random outgoing link•Rank according to the stationary distribution• 1. Red Page2. Purple Page 3. Yellow Page4. Blue Page5. Green Page nqFqPRpPRpq11)()()(Markov chains•A Markov chain describes a discrete time stochastic process over a set of states according to a transition probability matrix–Pij = probability of moving to state j when at state i•∑jPij = 1 (stochastic matrix)•Memorylessness property: The next state of the chain depends only at the current state and not on the past of the process (first order MC)–higher order MCs are also possibleS = {s1, s2, … sn}P = {Pij}Random walks•Random walks on graphs correspond to Markov Chains–The set of states S is the set of nodes of the graph G–The transition probability matrix is the probability that we follow an edge from one node to anotherAn examplev1v2v3v4v521000210031313100010100000021210P1000100111000101000000110AState probability vector•The vector qt = (qt1,qt2, … ,qtn) that stores the probability of being at state i at time t–q0i = the probability of starting from state iqt = qt-1 PAn example02100210031313100010100000021210Pv1v2v3v4v5qt+11 = 1/3 qt4 + 1/2 qt5qt+12 = 1/2 qt1 + qt3 + 1/3 qt4qt+13 = 1/2 qt1 + 1/3 qt4qt+14 = 1/2 qt5qt+15 = qt2Stationary distribution•A stationary distribution for a MC with transition matrix P, is a probability distribution π, such that π = πP•A MC has a unique stationary distribution if –it is irreducible•the underlying graph is strongly connected–it is aperiodic•for random walks, the underlying graph is not bipartite•The probability πi is the fraction of times that we visited state i as t → ∞•The stationary distribution is an eigenvector of matrix P–the principal left eigenvector of P – stochastic matrices have maximum eigenvalue 1Computing the stationary distribution•The Power Method–Initialize to some distribution q0–Iteratively compute qt = qt-1P–After enough iterations qt ≈ π–Power method because it computes qt = q0Pt•Rate of convergence–determined by λ2The PageRank random walk•Vanilla random walk–make the adjacency matrix stochastic and run a random walk02100210031313100010100000021210PThe PageRank random walk•What about sink nodes?–what happens when the random walk moves to a node without any outgoing inks?02100210031313100010000000021210P0210021003131310001051515151510021210P'The PageRank random walk•Replace these row vectors with a vector v–typically, the uniform vectorP’ = P + dvTotherwise0sink is i if1d515151515151515151515151515151515151515151515151512100021003131310001051515151510021210'P' )1(The PageRank random walk•How do we guarantee irreducibility?–add a random jump to vector v with prob α•typically, to a uniform vectorP’’ = αP’ + (1-α)uvT, where u is the vector of all 1sEffects of random jump•Guarantees irreducibility•Motivated by the concept of random surfer•Offers additional flexibility –personalization–anti-spam•Controls the rate of convergence–the second eigenvalue of matrix P’’ is αA PageRank algorithm•Performing vanilla power method is now too expensive – the matrix is not sparseq0 = vt = 1repeat t = t +1until δ < ε 1tTtq'P'q1ttqqδEfficient computation of y = (P’’)T xβvyyyx βxαPy11TRandom walks on undirected graphs•In the

View Full Document


School:
Email:
New Password:
Confirm Password:

BU CS 565 - More on Rankings

Sign up for free to view:

Please select your school