� Markov Chains Markov chains: • Powerful tool for sampling from complicated distributions • rely only on local moves to explore state space. • Many use Markov chains to model events that arise in nature. • We create Markov chains to explore and sample from problems. 2SAT: • Fix some assignment A • let f(k) be expected time to get all n variables to match A if n currently match. • Then f(n) = 0, f(0) = 1 + f(1), and f(k) = 1 + 1 2 (f(k + 1) + f(k − 1). • Rewrite: f(0) − f(1) = 1 and f(k) − f(k + 1) = 2 + f(k − 1) − f(k) • So f(k) − f (k + 1) = 2k + 1 • deduce f(0) = 1 + 3 + · · · + (2n − 1) = n2 • so, find with probability 1/2 in 2n2 time. • With high probability, find in O(n2 log n) time. More general formulation: Markov chain • State space S • markov chain begins in a start state X0, moves from state to state, so output of chain is a sequence of states X0, X1, . . . = {Xt}∞t=0 • movement controlled by matrix of transition probabilities pij = probability next state will be j given current is i. • thus, j pij = 1 for every i ∈ S • implicit in definition is memorylessness property: Pr[Xt+1 = j | X0 = i0, X1 = i1, . . . , Xt = i] = Pr[Xt+1 = j Xt = i] = pij .| • Initial state X0 can come from any probability distribution, or might be fixed (trivial prob. dist.) • Dist for X0 leads to dist over sequences {Xt} • Suppose Xt has distribution q (vector, qi is prob. of state i). Then Xt+1 has dist qP . Why? 1• Observe Pr[Xt+r = j | Xt = i] = P r ij Graph of MC: • Vertex for every state • Edge (i, j) if pij > 0 • Edge weight pij • weighted outdegree 1 • Possible state sequences are paths through the graph Stationary distribution: a π such that πP = π• • left eigenvector, eigenvalue 1 • steady state behavior of chain: if in stationary, stay there. • note stationary distribution is a sample from state space, so if can get right stationary distribution, can sample lots of chains have them. • • to say which, need definitions. Things to rule out: • infinite directed line (no stationary) • 2-cycle (no stationary) • disconnected graph (multiple) Irreducibility • any state can each any other state • i.e. path between any two states • i.e. single strong component in graph Persistence/Transience: (t) • is probability first hit state j at t, given start state i.rij • fij is probability eventually reach j from i, so � r(t) ij � (t) • expected time to reach is hitting time hij = trij 2• If fij < 1 then hij = ∞ since might never reach. Converse not always true. • If fii < 1, state is transient. Else persistent. If hii = ∞, null persistent. Persistence in finite graphs: • graph has strong components • final strong component has no outgoing edges • Nonfinal components: – once leave nonfinal component, cannot return – if nonfinal, nonzero probability of leaving in n steps. – so guaranteed to leave eventually – so, vertices in nonfinal components are transient • Final components – if final, will stay in that component – If two vertices in same strong component, have path between them – so nonzero probability of reaching in (say) n steps. – so, vertices in final components are persistent – geometric distribution on time to reach, so expected time finite. Not null-persistent Conclusion: – In finite chain, no null-persistent states – In finite irreducible chain, all states non-null persistent (no transient states) Periodicity: • Periodicity of a state is max T such that some state only has nonzero probability at times a + Ti for integer i • Chain periodic if some state has periodicity > 1 • In graph, all cycles containing state have length multiple of T • Easy to eliminate: add self loops • slows down chain, otherwise same Ergodic: • aperiodic and non-null persistent • means might be in state at any time in (sufficiently far) future 3Fundamental Theorem of Markov chains: Any irreducible, finite, aperiodic Markov chain satisfies: • All states ergodic (reachable at any time in future) • unique stationary distribution π, with all πi > 0 • fii = 1 and hii = 1/πi • number of times visit i in t steps approaches tπi in limit of t. Justify all except uniqueness here. Finite irreducible aperiodic implies ergodic (since finite irreducible implies non-null persis- tent) Intuitions for quantities: • hii is expected return time • So hit every 1/hii steps on average • So hii = 1/πi • If in stationary dist, tπi visits follows from linearity of expectation Random walks on undirected graphs: • general Markov chains are directed graphs. But undirected have some very nice prop-erties. • take a connected, non-bipartite undirected graph on n vertices states are vertices. • • move to uniformly chosen neighber. • So puv = 1/d(u) for every neighbor v • stationary distribution: πv = d(v)/2m • unqiqueness says this is only one • deduce hvv = 2m/d(v) Definitions: • Hitting time huv is expected time to reach u from v • commute time is huv + hvu • Cu(G) is expected time to visit all vertices of G, starting at u • cover time is maxu Cu(G) (so in fact is max over any starting distribution). 4• let’s analyze max cover time Examples: • clique: commute time n, cover time Θ(n log n) • line: commute time between ends is Θ(n2) • lollipop: huv = Θ(n3) while hvu = Θ(n2) (big difference!) • also note: lollipop has edges added to line, but higher cover time: adding edges can increase cover time even though improves connectivity. general graphs: adjacent vertices: • lemma: for adjcaent (u, v), huv + hvu ≤ 2m • proof: new markov chain on edge traversed following vertex MC – transition matrix is doubly stochastic: column sums are 1 (exactly d(v) edges can transit to edge (v, w), each does so with probability 1/d(v)) – In homework, show such matrices have uniform stationary distribution. – Deduce πe = 1/2m. Thus hee = 2m. • So consider suppose original chain on vertex v. – suppose arrived via (u, v) – expected to traverse (u, v) again in 2m steps – at this point will have commuted u to v and back. – so conditioning on arrival method, commute time 2m (thanks to memorylessness) General graph cover time: • theorem: cover time O(mn) • proof: find a
View Full Document