6 01 Fall Semester 2007 Lecture 11 Notes 1 MASSACHVSETTS INSTITVTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6 01 Introduction to EECS I Fall Semester 2007 Lecture 11 Notes Long term decision making and search Long term decision making So far in this course we have taken a couple of different approaches to action selection In the utility functions section we considered a set of possible next actions evaluated the utility of the results and chose the one with the best outcome When we were studying control loops we didn t have the program explicitly consider alternative courses of action but we as designers made models of the controller s behavior in the world and tried to prove whether it would behave in the way we wanted it to taking into account a longer term pattern of behavior Often we will want a system to generate complex long term patterns of behavior but we will not be able to write a simple control rule to generate those behavior patterns In that case we d like the system to evaluate alternatives for itself but instead of evaluating single actions it will have to evaluate whole sequences of actions deciding whether they re a good thing to do given the current state of the world Let s think of the problem of navigating through a city given a road map and knowing where we are We can t usually decide whether to turn left at the next intersection without deciding on a whole path As always the first step in the process will be to come up with a formal model of a real world problem that abstracts away the irrelevant detail So what exactly is a path The car we re driving will actually follow a trajectory through continuous space time but if we tried to plan at that level of detail we would fail miserably Why First because the space of possible trajectories through two dimensional space is just too enormous Second because when we re trying to decide which roads to take we don t have the information about where the other cars will be on those roads which will end up having a huge effect on the detailed trajectory we ll end up taking So we can divide the problem into two levels planning in advance which turns we ll make at which intersections but deciding dynamically exactly how to control the steering wheel and the gas to best move from intersection to intersection given the current circumstances other cars stop lights etc We can make an abstraction of the driving problem to include road intersections and the way they re connected by roads Then given a start and a goal intersection we could consider all possible paths between them and choose the one that is best What criteria might we use to evaluate a path There are all sorts of reasonable ones distance time gas mileage traffic related frustration scenery etc Generally speaking the approach we ll outline below can be extended to any criterion that is additive that is your happiness with the whole path is the sum of your happiness with each of the segments For now though we ll adopt the simple criterion of wanting to find a path with the fewest steps from intersection to intersection 6 01 Fall Semester 2007 Lecture 11 Notes 2 S A C B E D F H G Figure 1 Map of small city 1 Now one algorithm for deciding on the best path through a map of the road intersections in our very small world shown in figure 1 would be to enumerate all the paths evaluate each one according to our criterion and then return the best one The problem is that there are lots of paths Even in our little domain with 9 intersections there are 210 paths from the intersection labeled S to the one labeled G We can get a much better handle on this problem by formulating it as an instance of a graph search or a state space search problem for which there are simple algorithms that perform well State space search We ll model a state space search problem formally as consisting of a set possibly infinite of states the system can be in a starting state which is an element of the set of states a goal test which is a function that can be applied to any state and returns True if that state can serve as a goal1 a successor function which is a function that can be applied to a state yielding a set of states that can be reached from the original state with one primitive step The decision about what constitutes a primitive step is a modeling decision It could be to drive to the next intersection or to drive a meter or a variety of other things depending on the domain The only requirement is that it terminate in a well defined next state and that when it is time to execute the plan we will know how to to execute the primitive step We can think of this model as specifying a graph in the computer scientist s sense in which the states are the nodes and the successor function specifies what arcs exist between the nodes So for the little world in figure 1 we might make a model in which 1 Although in many cases we have a particular goal state such as the intersection in front of my house in other cases we may have the goal of going to any gas station which can be satisfied by many different intersections 6 01 Fall Semester 2007 Lecture 11 Notes 3 S A C A F D A B B S F H A D B A E B F H S B H A B Figure 2 Partially expanded search tree for city 1 The set of states is the intersections S A B C D E F G H The starting state is S The goal test is def g x return x E The successor function is defined using a dictionary succ S A B A S C D B S D E C A F D A B F H E B H F C D G H D E G G F H def successors x return succ x We can think of this structure as defining a search tree as shown in figure 2 It has the starting state S at the root node and then each node has its successor states as children Layer k of this tree contains all possible paths through the graph of length k Some of these paths are completely ridiculous It can never be reasonable if we re trying to find the shortest path between two states to go back to a state we have previously visited on that same path So we can adopt the following rule Pruning rule 1 Don t consider any path that visits the same state twice This rule will let us remove a number of branches from the tree as shown in figure 3 This seems like a …
View Full Document