CS 188Spring 2011Introduction toArtificial Intelligence Final ExamINSTRUCTIONS• You have 3 hours.• The exam is closed book, closed notes except a two-page crib sheet. Non-programmable calculators only.• Mark your answers ON THE EXAM ITSELF. If not sure of your answer you may wish to provide a briefexplanation.First nameLast nameSIDLoginName of the person to your leftName of the person to your rightFor staff use only:Q1. The OMNIBUS /24Q2. HMM: Where is the key? /21Q3. Sampling /9Q4. Worst-Case Markov Decision Processes /12Q5. Tree-Augmented Naive Bayes /19Q6. Finding Working Kernels /10Q7. Learning a Ranking for Twoogle Hiring /10Total /1051Q1. [24 pts] The OMNIBUSEach question is worth 1 point. Leaving a question blank is worth 0 points. Answering a multiple choicequestion with k possible choices incorrectly is worth −1/(k − 1) points (so −1 points for true/falsequestions, −1/2 for questions with three options, etc.). This gives you an expected value of 0 forrandom guessing.(a) [1 pt] CS 188Circle the best motto for AI.1. Maximize your expected utilities.(b) [5 pts] Search(i) [true or false] Uniform-cost search will never expand more nodes than A*-search.(ii) [true or false] Depth-first search will always expand more nodes than breadth-first search.(iii) [true or false] The heuristic h(n) = 0 is admissible for every search problem.(iv) [true or false] The heuristic h(n) = 1 is admissible for every search problem.(v) [true or false] The heuristic h(n) = c(n), where c(n) is the true cheapest cost to get from the node n to agoal state, is admissible for every search problem.(c) [2 pts] CSPs(i) [true or false] The most-constrained variable heuristic provides a way to select the next variable to assignin a backtracking search for solving a CSP.(ii) [true or false] By using the most-constrained variable heuristic and the least-constraining value heuristicwe can solve every CSP in time linear in the number of variables.(d) [3 pts] Games(i) [true or false] When using alpha-beta pruning, it is possible to get an incorrect value at the root node bychoosing a bad ordering when expanding children.(ii) [true or false] When using alpha-beta pruning, the computational savings are independent of the order inwhich children are expanded.(iii) [true or false] When using expectimax to compute a policy, re-scaling the values of all the leaf nodes bymultiplying them all with 10 can result in a different policy being optimal.(e) [3 pts] MDPs For this question, assume that the MDP has a finite number of states.(i) [true or false] For an MDP (S, A, T, γ, R) if we only change the reward function R the optimal policy isguaranteed to remain the same.(ii) [true or false] Value iteration is guaranteed to converge if the discount factor (γ) satisfies 0 < γ < 1.(iii) [true or false] Policies found by value iteration are superior to policies found by policy iteration.(f) [2 pts] Reinforcement Learning(i) [true or false] Q-learning can learn the optimal Q-function Q∗without ever executing the optimal policy.(ii) [true or false] If an MDP has a transition model T that assigns non-zero probability for all triples T(s, a, s0)then Q-learning will fail.2(g) [8 pts] Bayes’ Nets For each of the conditional independence assertions given below, circle whether they areguaranteed to be true, guaranteed to be false, or cannot be determined for the given Bayes’ net.ABCHGEFDB ⊥⊥ C Guaranteed true Guaranteed false Cannot be determinedB ⊥⊥ C | G Guaranteed true Guaranteed false Cannot be determinedB ⊥⊥ C | H Guaranteed true Guaranteed false Cannot be determinedA ⊥⊥ D | G Guaranteed true Guaranteed false Cannot be determinedA ⊥⊥ D | H Guaranteed true Guaranteed false Cannot be determinedB ⊥⊥ C | A, F Guaranteed true Guaranteed false Cannot be determinedF ⊥⊥ B | D, A Guaranteed true Guaranteed false Cannot be determinedF ⊥⊥ B | D, C Guaranteed true Guaranteed false Cannot be determined3Q2. [21 pts] HMM: Where is the key?The cs188 staff have a key to the homework bin. It is the master key that unlocks the bins to many classes, so wetake special care to protect it.Every day John Duchi goes to the gym, and on the days he has the key, 60% of the time he forgets it next to thebench press. When that happens one of the other three GSIs, equally likely, always finds it since they work out rightafter. Jon Barron likes to hang out at Brewed Awakening and 50% of the time he is there with the key, he forgetsthe key at the coffee shop. Luckily Lubomir always shows up there and finds the key whenever Jon Barron forgets it.Lubomir has a hole in his pocket and ends up losing the key 80% of the time somewhere on Euclid street. However,Arjun takes the same path to Soda and always finds the key. Arjun has a 10% chance to lose the key somewhere inthe AI lab next to the Willow Garage robot, but then Lubomir picks it up.The GSIs lose the key at most once per day, around noon (after losing it they become extra careful for the rest ofthe day), and they always find it the same day in the early afternoon.(a) [2 pts] Draw on the left the Markov chain capturing the location of the key and fill in the transition probabilitytable on the right. In this table, the entry of row JD and column JD corresponds to P (Xt+1= JD|Xt= JD),the entry of row JD and column JB corresponds to P (Xt+1= JB|Xt= JD), and so forth.JD JB LB ASJDJBLBAS 0.10Monday early morning Prof. Abbeel handed the key to Jon Barron. (The initial state distribution assigns probability1 to X0= JB and probability 0 to all other states.)(b) [4 pts] The homework is due Tuesday at midnight so the GSIs need the key to open the bin. What is the prob-ability for each GSI to have the key at that time? Let X0, XMonand XTuebe random variables correspondingto who has the key when Prof. Abbeel hands it out, who has the key on Monday evening, and who has the keyon Tuesday evening, respectively. Fill in the probabilities in the table below.P (X0)P (XMon) P (XTue)JD 0JB 1LB 0AS 04(c) [3 pts] The GSIs like their jobs so much that they decide to be professional GSIs permanently. They assign anextra credit homework (make computers truly understand natural language) due at the end of time. What isthe probability that each GSI holds the key at a point infinitely far in the future. Hint:P∞(x) =Px0P (Xnext day= x | Xcurrent day= x0)P∞(x0)Every evening the GSI who has the key feels obliged to write a short anonymous report on their opinion about thestate of AI.
View Full Document