Probabilistic Graphical Models, Spring 2007
Homework 2
Due at the beginning of class on 10/17/07

Instructions

There are six questions in this homework. The last question involves some programming, which should be done in MATLAB. Do not attach your code to the writeup. Make a tarball of your code and output, name it <userid>.tgz where userid is your CS or Andrew id, and copy it to /afs/cs/academic/class/10708-f07/hw2/<userid>. If you are not submitting this from an SCS machine, you might need to authenticate yourself first; see http://www.cs.cmu.edu/~help/afs/cross realm.html for instructions. If you are not a CMU student and don't have a CS or Andrew id, email your submission to [email protected].

You are allowed to use any of (and only) the material distributed in class for the homeworks. This includes the slides and the handouts given in the class.[1] Refer to the web page for policies regarding collaboration, due dates, and extensions.

1 [20 pts] Markov Networks

We define the following properties for a set of conditional independencies:

• Strong Union:
    (X ⊥ Y | Z) ⇒ (X ⊥ Y | Z, W)
  In other words, additional evidence cannot induce dependence.

• Transitivity: For all disjoint X, Y, Z and a variable A,
    ¬(X ⊥ A | Z) & ¬(A ⊥ Y | Z) ⇒ ¬(X ⊥ Y | Z)
  Intuitively, this statement asserts that if X and Y are both correlated with A (given Z), then they are also correlated with each other (given Z).

1. Construct a simple BN G such that I(G) does not satisfy both of the above properties.

2. Prove that if I(H) is the set of independencies of a Markov network H, then I(H) satisfies the above properties.

3. Let Il(H) = {(X ⊥ X − {X} − NH(X) | NH(X)) : X ∈ X}, where NH(X) is the set of neighbors of X in H and X is the set of all variables, be the set of local Markov independencies associated with H, and let Ip(H) = {(X ⊥ Y | X − {X, Y}) : X−Y ∉ H} be the set of pairwise Markov independencies associated with H.
Prove the following assertion about Markov networks: Il(H) ⇒ Ip(H).

[1] Please contact Monica Hopes (meh@cs) if you need a copy of a handout.

2 [10 pts] Constructing a Junction Tree

To construct a junction tree from a chordal graph H, we build an undirected graph whose nodes are the maximal cliques in H and in which every pair of nodes Ci, Cj is connected by an edge whose weight is |Ci ∩ Cj|. We then use the maximum-weight spanning tree algorithm to find a tree in this graph whose weight, i.e. the sum of the weights of the edges in the tree, is maximal. (See Section 10.4.2 of the handout from Koller and Friedman for more details.)

Show that the junction tree constructed using the maximum-weight spanning tree procedure outlined above satisfies the running intersection property.

3 [15 pts] Modifying a Junction Tree

Let T be a calibrated junction tree representing the unnormalized distribution PF = ∏φ∈F φ. Let φ0 be a new factor and let PF′ = PF × φ0. Let Ci be some clique such that Scope[φ0] ⊆ Ci. Show that we can obtain the PF′ marginal for any clique Cj by multiplying φ0 into πi and then propagating messages from Ci to Cj along the path between them.

4 [10 pts] Distributive Laws

We saw in class that the sum-product and max-product versions of BP (and of Shafer–Shenoy) are virtually identical, with the exception of a summation in sum-product being replaced by a maximization in max-product (leading to their names).

Now suppose we wish to compute the least probable estimates instead of the most probable estimates; i.e., we wish to compute min_x p(x). Show that a similar modification lets you compute this value. You need to prove the correctness of your algorithm (you may use the correctness of max-product in your proof).

5 [15 pts] Dirichlet-Multinomial Prediction

Let θ ∼ Dir(α). Consider multinomial random variables (X1, X2, ..., XN), where Xn ∼ Mult(θ) for each n, and where the Xn are assumed conditionally independent given θ.
Now consider a random variable Xnew ∼ Mult(θ) that is assumed conditionally independent of (X1, X2, ..., XN) given θ. Compute P(xnew | x1, x2, ..., xN) by integrating over θ.

6 [30 pts] Linear Regression

The data file lms.dat contains twenty rows, each with three columns of numbers. The first two columns are the components of an input vector x and the last column is an output value y.

1. Solve the normal equations for this data to find the optimal value of the parameter vector.

2. Find the eigenvectors and eigenvalues of the covariance matrix of the input vectors and plot the contours of the cost function J in the parameter space.

3. Initializing the LMS algorithm at θ = 0, plot the path taken in the parameter space by the algorithm for three different values of the step size ρ. In particular, let ρ equal the inverse of the maximum eigenvalue of the covariance matrix, one-twentieth of that value, and one-fortieth of that value.

Submit whatever code you have written for this question, along with a README briefly describing your code, to the homework directory.
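The assignment asks for MATLAB; purely as an illustrative sketch, the computations for this question can also be expressed in NumPy as below. The data here is synthetic stand-in data (lms.dat is not reproduced), and the generating parameters, the random seed, the number of passes, and the per-sample Widrow–Hoff form of the LMS update are all assumptions for illustration, not part of the assignment.

```python
import numpy as np

# Hypothetical stand-in for lms.dat: 20 rows of [x1, x2] inputs and y outputs,
# synthesized from an assumed parameter vector just to exercise the code.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
theta_true = np.array([1.5, -0.5])
y = X @ theta_true + 0.1 * rng.normal(size=20)

# 1. Normal equations: theta* solves (X^T X) theta = X^T y.
theta_star = np.linalg.solve(X.T @ X, X.T @ y)

# 2. Eigenvalues/eigenvectors of the input covariance matrix; these give the
#    axes and curvatures of the elliptical contours of the quadratic cost J.
cov = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(cov)

# 3. LMS from theta = 0, recording the path taken through parameter space.
def lms_path(rho, n_passes=100):
    theta = np.zeros(2)
    path = [theta.copy()]
    for _ in range(n_passes):
        for xn, yn in zip(X, y):
            theta = theta + rho * (yn - xn @ theta) * xn  # Widrow–Hoff update
            path.append(theta.copy())
    return np.array(path)

rho_max = 1.0 / eigvals.max()
paths = {rho: lms_path(rho) for rho in (rho_max, rho_max / 20, rho_max / 40)}
```

The contours of J and the three recorded paths can then be overlaid with any plotting library; with the largest step size the path may oscillate across the narrow axis of the contours, while the smaller step sizes descend more smoothly toward theta_star.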