CU-Boulder CSCI 5417 - Lecture 20




CSCI 5417 Information Retrieval Systems
Jim Martin
Lecture 20
11/3/2011

Today
- Finish PageRank
- HITS
- Start ML-based ranking

11/11/11 CSCI 5417 - IR

PageRank Sketch
- The PageRank of a page is based on the PageRank of the pages that point at it.
- Roughly:
    $$ Pr(P) = \sum_{in \in P} \frac{Pr(in)}{V(in)} $$

PageRank scoring
- Imagine a browser doing a random walk on web pages:
  - Start at a random page
  - At each step, go out of the current page along one of the links on that page, equiprobably
- “In the steady state” each page has a long-term visit rate; use this as the page’s score
- Pages with low rank are pages rarely visited during a random walk
[Figure: outgoing links, each labeled 1/3]

Not quite enough
- The web is full of dead ends: pages that are pointed to but have no outgoing links
- A random walk can get stuck in such dead ends
- It makes no sense to talk about long-term visit rates in the presence of dead ends

Teleporting
- At a dead end, jump to a random web page
- At any non-dead end, with probability 10%, jump to a random web page
- With the remaining probability (90%), go out on a random link
- 10% is a parameter (call it alpha)

Result of teleporting
- Now you can’t get stuck locally
- There is a long-term rate at which any page is visited
- How do we compute this visit rate?
  - Can’t directly use the random walk metaphor

State Transition Probabilities
We’re going to use the notion of a transition probability: if we’re in some particular state, what is the probability of going to some other particular state from there? If there are n states (pages), then we need an n x n table of probabilities.

Markov Chains
- So if I’m in a particular state (say the start of a random walk)
- And I know the whole n x n table
- Then I can compute the probability distribution over all the next states I might be in at the next step of the walk...
- And in the step after that
- And the step after that

Example
Say alpha = .5
[Figure: three pages labeled 1, 2, 3]
- What is P(3 -> 2)? 2/3
- P(3 -> *) = (1/6, 2/3, 1/6)
- Full table of transition probabilities (row = current page, column = next page):

            to 1    to 2    to 3
    from 1  1/6     2/3     1/6
    from 2  5/12    1/6     5/12
    from 3  1/6     2/3     1/6

Assume we start a walk in 1 at time T0. Then what should we believe about the state of affairs in T1? What should we believe about things at T2?

Example
[Figure: PageRank values]

More Formally
- A probability (row) vector x = (x1, ..., xN) tells us where the random walk is at any point.
- Example:
    (  0     0     0    …    1    …    0     0     0  )
       1     2     3    …    i    …   N-2   N-1    N
- More generally: the random walk is on page i with probability xi.
- Example:
    ( 0.05  0.01  0.0   …   0.2   …   0.01  0.05  0.03 )
       1     2     3    …    i    …   N-2   N-1    N
- Σ xi = 1

Change in probability vector
- If the probability vector is x = (x1, ..., xN) at this step, what is it at the next step?
- Recall that row i of the transition probability matrix P tells us where we go next from state i.
- So from x, our next state is distributed as xP.
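
The teleporting rules and the update x -> xP can be tried out on the three-page example. The sketch below is only an illustration: the link structure (1 -> 2, 2 -> 1, 2 -> 3, 3 -> 2) is an assumption inferred from the probabilities on the slides, the helper names `transition_matrix` and `step` are hypothetical, and teleporting is assumed to pick uniformly among all pages, including the current one.

```python
from fractions import Fraction

def transition_matrix(links, alpha):
    """Build the n x n transition table for a random walk with teleporting.

    links[i] lists the pages that page i links to (0-based indices).
    alpha is the teleport probability.
    """
    n = len(links)
    P = []
    for out in links:
        if not out:
            # Dead end: always jump to a uniformly random page.
            row = [Fraction(1, n)] * n
        else:
            # Teleport with probability alpha (assumed uniform over all pages)...
            row = [alpha * Fraction(1, n)] * n
            # ...otherwise follow one of the outgoing links, equiprobably.
            for j in out:
                row[j] += (1 - alpha) / len(out)
        P.append(row)
    return P

def step(x, P):
    """One step of the walk: the row vector x becomes xP."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

# Assumed link structure for the example: 1 -> 2, 2 -> 1, 2 -> 3, 3 -> 2
links = [[1], [0, 2], [1]]
P = transition_matrix(links, Fraction(1, 2))  # alpha = .5

x0 = [Fraction(1), Fraction(0), Fraction(0)]  # start in page 1 at T0
x1 = step(x0, P)  # distribution at T1: (1/6, 2/3, 1/6)
x2 = step(x1, P)  # distribution at T2: (1/3, 1/3, 1/3)
```

Exact fractions are used here so the rows can be compared directly with the 1/6, 2/3, and 5/12 values in the example; floats would work equally well for larger graphs.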

Steady state in vector notation
- The steady state in vector notation is simply a vector π = (π1, π2, …, πN) of probabilities.
- (Use π to distinguish it from the notation for the probability vector x.)
- πi is the long-term visit rate (or PageRank) of page i.
- So we can think of PageRank as a very long vector, one entry per page.

Steady-state distribution: Example
- What is the PageRank / steady state in this example?

    P11 = 0.25    P12 = 0.75
    P21 = 0.25    P22 = 0.75

          x1 = Pt(d1)    x2 = Pt(d2)
    t0    0.25           0.75
    t1    0.25           0.75          (convergence)

    Pt(d1) = Pt-1(d1) * P11 + Pt-1(d2) * P21
    Pt(d2) = Pt-1(d1) * P12 + Pt-1(d2) * P22

PageRank vector = π = (π1, π2) = (0.25, 0.75)
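
The two-page example can be checked mechanically: a vector π is a steady state exactly when one more step of the walk leaves it unchanged, that is, π = πP. A minimal sketch of that check, with the hypothetical helper name `step` and exact fractions standing in for 0.25 and 0.75:

```python
from fractions import Fraction

# Transition probabilities from the two-page example: P[i][j] = P(d_(i+1) -> d_(j+1))
P = [[Fraction(1, 4), Fraction(3, 4)],
     [Fraction(1, 4), Fraction(3, 4)]]

def step(x, P):
    """One step of the walk: Pt(dj) = sum over i of Pt-1(di) * Pij."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

pi = [Fraction(1, 4), Fraction(3, 4)]

# Steady state: applying one more step changes nothing.
is_steady = step(pi, P) == pi

# Because both rows of P are identical, a walk that starts
# entirely in d1 already reaches pi after a single step.
from_d1 = step([Fraction(1), Fraction(0)], P)
```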

How do we compute the steady state vector?
- In other words: how do we compute PageRank?
- Recall: π = (π1, π2, …, πN) is the PageRank vector, the vector of steady-state probabilities ...
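
The text stops before the lecture gives its answer. One common approach, shown here only as a sketch and not necessarily the method the lecture goes on to present, is power iteration: start from any probability vector and keep applying x -> xP until the vector stops changing. The function names, the tolerance, and the iteration cap are all assumptions made for illustration; the matrix is the three-page table with alpha = .5 from earlier in the lecture.

```python
def step(x, P):
    """One step of the walk: the row vector x becomes xP."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

def power_iteration(P, tol=1e-12, max_iter=1000):
    """Approximate the steady-state vector by repeatedly applying x -> xP."""
    n = len(P)
    x = [1.0 / n] * n  # any starting distribution works; uniform is a neutral choice
    for _ in range(max_iter):
        nxt = step(x, P)
        # Stop once no entry moves by more than tol between steps.
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x

# Three-page example with alpha = .5
P = [[1/6, 2/3, 1/6],
     [5/12, 1/6, 5/12],
     [1/6, 2/3, 1/6]]

pi = power_iteration(P)  # approximately (5/18, 4/9, 5/18)
```

Teleporting is what makes this loop safe to run: with it, every page can reach every other page, so the iterates settle on a single vector regardless of the starting distribution.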

