
LECTURE 4

Last time:
• Types of convergence
• Weak Law of Large Numbers
• Strong Law of Large Numbers
• Asymptotic Equipartition Property

Lecture outline:
• Stochastic processes
• Markov chains
• Entropy rate
• Random walks on graphs
• Hidden Markov models

Reading: Chapter 4.

Stochastic processes

A stochastic process is an indexed sequence of r.v.s X_0, X_1, ... characterized by the joint PMF

  P_{X_0, X_1, ..., X_n}(x_0, x_1, ..., x_n),  (x_0, x_1, ..., x_n) ∈ X^n

for n = 0, 1, ....

A stochastic process is stationary if

  P_{X_0, X_1, ..., X_n}(x_0, x_1, ..., x_n) = P_{X_l, X_{l+1}, ..., X_{l+n}}(x_0, x_1, ..., x_n)

for every shift l and all (x_0, x_1, ..., x_n) ∈ X^n.

A discrete stochastic process is a Markov chain if

  P_{X_n | X_0, ..., X_{n-1}}(x_n | x_0, ..., x_{n-1}) = P_{X_n | X_{n-1}}(x_n | x_{n-1})

for n = 1, 2, ... and all (x_0, x_1, ..., x_n) ∈ X^n.

We deal with time-invariant Markov chains:
– X_n: state after n transitions; belongs to a finite set, e.g., {1, ..., m}
– X_0 is either given or random
– given the current state, the past does not matter:

  p_{i,j} = P(X_{n+1} = j | X_n = i) = P(X_{n+1} = j | X_n = i, X_{n-1}, ..., X_0)

A Markov chain is characterized by its probability transition matrix P = [p_{i,j}].

Review of Markov chains

State occupancy probabilities, given initial state i:

  r_{i,j}(n) = P(X_n = j | X_0 = i)

Key recursion:

  r_{i,j}(n) = Σ_{k=1}^{m} r_{i,k}(n−1) p_{k,j}

With random initial state:

  P(X_n = j) = Σ_{i=1}^{m} P(X_0 = i) r_{i,j}(n)

Does r_{i,j}(n) converge to something? Does the limit depend on the initial state?

Recurrent and transient states. State i is recurrent if: starting from i, and from wherever you can go, there is a way of returning to i. If not recurrent, it is called transient. A recurrent class is a collection of recurrent states that "communicate" with each other and with no other state. A recurrent state is periodic if there is an integer d > 1 such that r_{i,i}(k) = 0 when k is not an integer multiple of d.

Assume a single class of recurrent states, aperiodic.
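The key recursion above can be sketched numerically. A minimal pure-Python demo, using a made-up two-state chain (the transition probabilities are an invented example, not from the lecture), showing that r_{i,j}(n) approaches the same limit from either initial state:

```python
# Hypothetical two-state chain: p[i][j] = P(X_{n+1} = j | X_n = i)
p = [[0.9, 0.1],
     [0.5, 0.5]]
m = len(p)

def occupancy(i, n):
    """r_{i,j}(n) via the key recursion r_{i,j}(n) = sum_k r_{i,k}(n-1) p_{k,j}."""
    r = [1.0 if j == i else 0.0 for j in range(m)]  # r_{i,j}(0): start in state i
    for _ in range(n):
        r = [sum(r[k] * p[k][j] for k in range(m)) for j in range(m)]
    return r

# With a single aperiodic recurrent class, the rows converge to the same
# limit regardless of the initial state i:
print(occupancy(0, 50))  # approx [0.8333, 0.1667]
print(occupancy(1, 50))  # same limit, different starting state
```

For this example the limit works out to (5/6, 1/6), which one can confirm against the balance equations stated next in the lecture.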
Then

  lim_{n→∞} r_{i,j}(n) = π_j

where π_j does not depend on the initial conditions:

  lim_{n→∞} P(X_n = j | X_0) = π_j

π_1, ..., π_m can be found as the unique solution of the balance equations

  π_j = Σ_k π_k p_{k,j}

together with Σ_j π_j = 1.

Entropy rate

The entropy rate of a stochastic process is

  lim_{n→∞} (1/n) H(X_1, ..., X_n)

if it exists.

For a stationary stochastic process, the entropy rate exists and is equal to

  lim_{n→∞} H(X_n | X_{n−1}, ..., X_1)

since conditioning decreases entropy and, by stationarity,

  H(X_{n+1} | X_n, ..., X_1) ≤ H(X_{n+1} | X_n, ..., X_2) = H(X_n | X_{n−1}, ..., X_1)

so the sequence reaches a limit (it is decreasing and non-negative). By the chain rule,

  (1/n) H(X_1, ..., X_n) = (1/n) Σ_{i=1}^{n} H(X_i | X_{i−1}, ..., X_1)

and since the terms in the sum on the RHS reach a limit, the Cesàro mean on the LHS converges to that same limit.

Markov chain entropy rate:

  lim_{n→∞} H(X_n | X_{n−1}, ..., X_1) = lim_{n→∞} H(X_n | X_{n−1}) = H(X_2 | X_1) = − Σ_{i,j} π_i p_{i,j} log p_{i,j}

Random walk on graph

Consider an undirected graph G = (N, E, W), where N, E, W are the nodes, edges, and weights. With each edge there is an associated edge weight:

  W_{i,j} = W_{j,i}
  W_i = Σ_j W_{i,j}
  W = Σ_{i,j: j>i} W_{i,j}
  2W = Σ_i W_i

We call a random walk the Markov chain in which the states are the nodes of the graph:

  p_{i,j} = W_{i,j} / W_i
  π_i = W_i / 2W

Check: Σ_i π_i = 1 and

  Σ_i π_i p_{i,j} = Σ_i (W_i / 2W)(W_{i,j} / W_i) = Σ_i W_{i,j} / 2W = W_j / 2W = π_j

The entropy rate of the random walk:

  H(X_2 | X_1) = − Σ_i π_i Σ_j p_{i,j} log p_{i,j}
             = − Σ_i (W_i / 2W) Σ_j (W_{i,j} / W_i) log(W_{i,j} / W_i)
             = − Σ_{i,j} (W_{i,j} / 2W) log(W_{i,j} / W_i)
             = − Σ_{i,j} (W_{i,j} / 2W) log(W_{i,j} / 2W) + Σ_i (W_i / 2W) log(W_i / 2W)

The entropy rate is the difference of two entropies: the entropy of the edge distribution {W_{i,j}/2W} minus the entropy of the node distribution {W_i/2W}.

Note: time reversibility holds for any Markov chain that can be represented as a random walk on an undirected weighted graph.

Hidden Markov models

Consider an ALOHA wireless model: M users share the same radio channel to transmit packets to a base station. During each time slot, a packet arrives to a user's queue with probability p, independently of the other M − 1 users. Also, at the beginning of each time slot, if a user has at least one packet in its queue, it will
transmit a packet with probability q, independently of all other users. If two packets collide at the receiver, they are not successfully transmitted and remain in their respective queues.

Let X_i = (N[1]_i, N[2]_i, ..., N[M]_i) denote the random vector at time i, where N[m]_i is the number of packets in user m's queue at time i. X_i is a Markov chain. Consider the random vector Y_i = (Z[1]_i, Z[2]_i, ..., Z[M]_i), where Z[m]_i = 1 if user m transmits during time slot i and Z[m]_i = 0 otherwise. Is Y_i Markov?

Hidden Markov processes

Let X_1, X_2, ... be a stationary Markov chain and let Y_i = φ(X_i) be a process, each term of which is a function of the corresponding state in the Markov chain. Y_1, Y_2, ... form a hidden Markov chain, which is not always a Markov chain, but is still stationary. What is its entropy rate?

We suspect that the effect of initial information should decay:

  H(Y_n | Y_{n−1}, ..., Y_1) − H(Y_n | Y_{n−1}, ..., Y_1, X_1) = I(X_1; Y_n | Y_{n−1}, ..., Y_1)

should go to 0. Indeed,

  lim_{n→∞} I(X_1; Y_1, ..., Y_n) = lim_{n→∞} Σ_{i=1}^{n} I(X_1; Y_i | Y_{i−1}, ..., Y_1) = Σ_{i=1}^{∞} I(X_1; Y_i | Y_{i−1}, ..., Y_1)

Since we have an infinite sum in which the terms are non-negative and which is upper bounded by H(X_1), the terms must tend to 0.

MIT OpenCourseWare
http://ocw.mit.edu
6.441 Information Theory, Spring 2010
For information about citing these materials or our Terms of Use, visit:
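Returning to the random walk on a weighted graph earlier in the lecture: the stationarity check and the "difference of two entropies" identity can be verified numerically. The three-node weight matrix below is an invented example, not from the lecture:

```python
import math

# Symmetric edge weights W[i][j] for a small undirected graph (hypothetical example)
W = [[0, 1, 2],
     [1, 0, 3],
     [2, 3, 0]]
m = len(W)

Wi = [sum(W[i]) for i in range(m)]            # W_i = sum_j W_{i,j}
W2 = sum(Wi)                                  # 2W = sum_i W_i
pi = [Wi[i] / W2 for i in range(m)]           # pi_i = W_i / 2W
p = [[W[i][j] / Wi[i] for j in range(m)] for i in range(m)]  # p_{i,j} = W_{i,j} / W_i

# pi is stationary: sum_i pi_i p_{i,j} = W_j / 2W = pi_j
for j in range(m):
    assert abs(sum(pi[i] * p[i][j] for i in range(m)) - pi[j]) < 1e-12

def H(dist):
    """Entropy in bits of a list of probabilities (zeros contribute 0)."""
    return -sum(q * math.log2(q) for q in dist if q > 0)

# Entropy rate H(X2|X1) = -sum_{i,j} pi_i p_{i,j} log p_{i,j}
rate = -sum(pi[i] * p[i][j] * math.log2(p[i][j])
            for i in range(m) for j in range(m) if p[i][j] > 0)

# ... which equals H(edge distribution {W_{i,j}/2W}) - H(node distribution {W_i/2W})
edge_dist = [W[i][j] / W2 for i in range(m) for j in range(m)]
assert abs(rate - (H(edge_dist) - H(pi))) < 1e-12
print(rate)
```

Note that the edge distribution runs over ordered pairs (i, j), so each undirected edge is counted twice; this is consistent with 2W = Σ_i W_i in the lecture's notation.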

