**Unformatted text preview:**

G1CDG2Dr. EickFundamentals of Artificial IntelligenceCOSC 4368Solution Sketches Midterm1 ExamMonday, March 4, 2019Name:Student id:Point Total (out of 54):Number Grade:1) Best-first Search and A* [10]a) Best-First-Search (using function h only) [3]b) A* (using f=g+h)[4]Dr. EickFundamentals of Artificial IntelligenceCOSC 4368Solution Sketches Midterm1 ExamMonday, March 4, 2019Name:Student id: 1. A* & Best-first Search (14 points)2. Evolutionary Computing (7 points)3. Reinforcement Learning (11 points)4. SA and Hill Climbing (9 points)5. Game Theory (6 points) 6. Games and Adversarial Search (7 points)Point Total (out of 54):Number Grade:The exam is “open books and notes”, but no computers and cell phones allowed; you have 75 minutes to complete the exam. Write all your answers on this document (you canuse back sides!).121) Best-first Search and A* [10]Consider the search space below, where S is the start node and G1 and G2 satisfy the goaltest. Arcs are labeled with the cost of traversing them and the estimated cost to a goal (the h function itself) is reported inside nodes.For each of the following search strategies, indicate which goal state is reached (if any) and list, in order, all the states popped off of the OPEN list. When all else is equal, nodes should be removed from OPEN in alphabetical order.a) Best-First-Search (using function h only) [3]Goal state reached: G2States popped off OPEN: (S, E, G2)b) A* (using f=g+h)[4]Goal state reached: G1 States popped off OPEN: (S, A, B, G1) 1 error: 1 point21721529 B 5A 910G1 0 E7 C8D4G2 0S25841473Problem 1 continuedb) Assume 2 admissible heuristics h1(s) and h2(s) are given for a search problem. Is h3(s)=max(h1(s),h2(s)) also admissible? Would you prefer using h2 or using h3 in conjunction with A*? Give reasons for your answers! [3] Yes, h3 is admissible. If h1 and h2 always underestimate the “true” cost then the max(h1,h2) will also underestimate the true cost; therefore, h3 is admissible. [1]I will prefer h3 is it dominates h2; that is: s h3(s)h2(s) as A* is more efficient (needs less node expansions) for functions h that are closer to the “true” cost. [2]c) Assume you run A* with a state evaluation function h that is precise; that is, it gives the exact cost h(s) of reaching a goal state from s. What can be said about the efficiency of A* in this case? Which nodes are expanded by A* when this particular evaluation function h is used? [4]Very efficient [1]; does not do any unnecessary state expansions as it only expands states along the optimal path from the initial state to a goalstate [3]2) Evolutionary Computing [7]a) Evolution computing relies on Darwinian evolution and survival of the fittest. What does this mean? How is the survival of the fittest in evolutionary computing search strategies accomplished—what do they do to simulate Darwinian evolution? [4]Fitter individuals have a higher probability to participate in the breeding of the next generation. [2]EC systems provide selection operators that select fitter individuals in the spirit of Darwinian Evolution; e.g. roulette wheel1 selection creates mating pools selecting individual fitness proportional: that is, an solutionthat is twice as fit as another solution will occur in the mating pool twice at much in comparison to the lower fitness solution. [2] b) What role do crossover2 operators play in evolutionary computing systems? [3]1 Mentioning tournament selection instead should also be fine!4The crossover operator creates permutations/different combinations of building blocks/properties found in the two parents [2]; it does not introduce anything new [1]; it is an exploitation operator [1] that tries to find the optimal combination of genetic material found in the current population.At most 3 points!3) Reinforcement Learning [11]Consider the following world called DEF World consisting of states 1, 2 and 3 and operators a, b and for visiting states 2 and 3 the agent receives a reward of 2 and 10, respectively. DEF Worlda) Apply temporal difference learning to the DEF World, depicted above, relying on the following assumptions: [6]- The agent starts in state 3 and applies aaaa (applies action ‘a’ 4 times)- is 1.0 and is 0.5- If state 1 is visited a reward of 0 is obtained- Utilities of the 3 states are initialized with 0What are the utilities of states 1, 2, 3 after ‘aaaa’ has been applied? Do not only give the final result but also how you derived the final result including the formulas you used! General Update Formula: U(s):=(1-)*U(s)+ *(R(s)+ *U(s’))U(3)=0+0.5*(10+1*0)=5U(2)=0+0.5*(2+1*0)=1U(1)=0+0.5*(0+1*5)=2.5U(3)=0.5*5+0.5*(10+1*1)=8One major error: 2 pointsTwo major errors: 0 points2 Also sometime called recombination operator in the literature.12 R=3 R=R=+10 ab/0.9ab/0.1ab5b) Give the Bellman equations for states 2 and 3 of the DEF world! [3]U(2)= 2 + *max (U(1), U(3)) [1]U(3)= 10 + *max (U(2), U(2)*0.1+U(1)*0.9) [2]No partial credit!c) What role does the parameter play when updating the utilities in temporal difference learning? [2]It determines how quickly our current beliefs/estimations are updated based on new evidence. 4) SA, and Hill Climbing [9]a) What role does temperature play when using Simulated Annealing (SA)? Why is the temperature quite high early when running SA but quite low near the end of the run of SA? [4]Temperature controls the degree of exploration and exploitation that occurs during the search. [1]At the beginning, high temperature allows for traversing non-promising states to avoid getting stuck in a “bad” local optimal hill. Towards the end, temperature gradually becomes lower, to focus on reaching the actual maximum of the currently explored hill rather than on switching from one hill to a more promising another hill. [3]b) Assume you apply randomized hill climbing to a maximization problem for a function f which has 4 local maxima. Will randomized hill climbing always find the global maximum? Give reasons for your answer! [3]No Randomized hill climbing will terminate at a local maximum, not necessarily the global maxima. [1]The result will depend upon the initial position where the search starts; hill climbing will terminate with the (local) maximum that is associated with this initial position[2]—as it terminates when it reaches a maximum and therefore is not able to go through the “next” valley that is associated with the “next” local

View Full Document