MIT 6.01 - Study Notes


MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Department of Electrical Engineering and Computer Science
6.01—Introduction to EECS I
Spring Semester, 2008
Course notes for weeks 10 and 11

Long-term decision-making and search

Long-term decision-making

So far in this course, we have taken a couple of different approaches to action selection. In the non-deterministic behaviors section, we considered sets of actions that would satisfy different requirements, and chose an action that satisfied as many requirements as possible. When we were studying control loops, we didn't have the program explicitly consider alternative courses of action; instead we, as designers, made models of the controller's behavior in the world and tried to prove whether it would behave in the way we wanted it to, taking into account a longer-term pattern of behavior.

Often, we will want a system to generate complex long-term patterns of behavior, but we will not be able to write a simple control rule to generate those behavior patterns. In that case, we'd like the system to evaluate alternatives for itself; but instead of evaluating single actions, it will have to evaluate whole sequences of actions, deciding whether they're a good thing to do given the current state of the world.

Let's think of the problem of navigating through a city, given a road map, and knowing where we are. We can't usually decide whether to turn left at the next intersection without deciding on a whole path.

As always, the first step in the process will be to come up with a formal model of a real-world problem that abstracts away the irrelevant detail. So, what, exactly, is a path? The car we're driving will actually follow a trajectory through continuous space(time), but if we tried to plan at that level of detail we would fail miserably. Why? First, because the space of possible trajectories through two-dimensional space is just too enormous.
Second, because when we're trying to decide which roads to take, we don't have the information about where the other cars will be on those roads, which will end up having a huge effect on the detailed trajectory we'll end up taking.

So, we can divide the problem into two levels: planning in advance which turns we'll make at which intersections, but deciding dynamically exactly how to control the steering wheel and the gas to best move from intersection to intersection, given the current circumstances (other cars, stop-lights, etc.).

We can make an abstraction of the driving problem to include road intersections and the way they're connected by roads. Then, given a start and a goal intersection, we could consider all possible paths between them, and choose the one that is best.

What criteria might we use to evaluate a path? There are all sorts of reasonable ones: distance, time, gas mileage, traffic-related frustration, scenery, etc. Generally speaking, the approach we'll outline below can be extended to any criterion that is additive: that is, your happiness with the whole path is the sum of your happiness with each of the segments. For now, though, we'll adopt the simple criterion of wanting to find a path with the fewest “steps” from intersection to intersection.

[Figure 1: Map of small city 1. Intersections S, A, B, C, D, E, F, G, and H, connected by roads labeled sa, sb, ac, ad, bd, be, cf, df, dh, eh, fg, and hg.]

Now, one algorithm for deciding on the best path through a map of the road intersections in our (very small) world (shown in figure 1) would be to enumerate all the paths, evaluate each one according to our criterion, and then return the best one. The problem is that there are lots of paths.
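For an additive criterion, the evaluate-and-pick-best step is just a sum over segments followed by a minimum over candidate paths. Here is a minimal sketch; the segment costs and candidate paths are hypothetical values chosen for illustration, not numbers from the notes:

```python
# Hypothetical per-segment costs (e.g., minutes to drive each road).
segment_cost = {'sa': 2, 'sb': 1, 'ac': 4, 'cf': 1, 'fg': 3,
                'bd': 2, 'df': 2}

def path_cost(actions):
    # An additive criterion: the cost of a whole path is the
    # sum of the costs of its segments.
    return sum(segment_cost[a] for a in actions)

# Two candidate paths from S to G, named by their action sequences.
candidates = [['sa', 'ac', 'cf', 'fg'],   # S - A - C - F - G
              ['sb', 'bd', 'df', 'fg']]   # S - B - D - F - G
best = min(candidates, key=path_cost)
```

Additivity is what makes this tractable to improve on: because a path's cost builds up segment by segment, a search procedure can compare partial paths without waiting to see how they end.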
Even in our little domain, with 9 intersections, there are 210 paths from the intersection labeled S to the one labeled G. We can get a much better handle on this problem by formulating it as an instance of a graph search (or a “state-space search”) problem, for which there are simple algorithms that perform well.

State-space search

We'll model a state-space search problem formally as consisting of

• a set (possibly infinite) of states the system can be in;
• a starting state, which is an element of the set of states;
• a goal test, which is a function that can be applied to any state, and returns True if that state can serve as a goal;¹
• a successor function, which is a function that can be applied to a state, yielding a set of action-state pairs; the actions name the different choices that can be made in the state, and each has associated with it the new state that will result from taking the action in the current state.

The decision about what constitutes an action is a modeling decision. It could be to drive to the next intersection, or to drive a meter, or a variety of other things, depending on the domain.
The only requirement is that it terminate in a well-defined next state (and that, when it is time to execute the plan, we will know how to execute the action).

We can think of this model as specifying a labeled graph (in the computer scientist's sense), in which the states are the nodes and the successor function specifies what arcs exist between the nodes (where the labels are the names of the actions).

¹ Although in many cases we have a particular goal state (such as the intersection in front of my house), in other cases we may have the goal of going to any gas station, which can be satisfied by many different intersections.

[Figure 2: Partially expanded search tree for city 1, rooted at S, with arcs labeled by actions such as sa and sb.]

So, for the little world in figure 1, we might make a model in which

• The set of states is the intersections {S, A, B, C, D, E, F, G, H}.
• The starting state is S.
• The goal test is

    def g(x):
        return x == 'E'

• The successor function is defined using a dictionary:

    map1 = {'S': [('sa', 'A'), ('sb', 'B')],
            'A': [('sa', 'S'), ('ac', 'C'), ('ad', 'D')],
            'B': [('sb', 'S'), ('bd', 'D'), ('be', 'E')],
            'C': [('ac', 'A'), ('cf', 'F')],
            'D': [('ad', 'A'), ('bd', 'B'), ('df', 'F'), ('dh', 'H')],
            'E': [('be', 'B'), ('eh', 'H')],
            'F': [('cf', 'C'), ('df', 'D'), ('fg', 'G')],
            # The preview cuts off mid-dictionary here; the 'G' and 'H'
            # entries below are reconstructed from the symmetric entries
            # above and the edge labels in figure 1, not from the notes.
            'G': [('fg', 'F'), ('hg', 'H')],
            'H': [('dh', 'D'), ('eh', 'E'), ('hg', 'G')]}
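The preview ends before the notes' search algorithm appears. Since the chosen criterion is fewest steps, a breadth-first search over this kind of successor dictionary is the natural fit; the sketch below is an illustrative assumption, not the procedure from the notes. The dictionary is restated here so the example is self-contained, with the 'G' and 'H' entries reconstructed by symmetry because the preview truncates the original:

```python
from collections import deque

# Successor dictionary in the style of map1 from the notes; the 'G'
# and 'H' entries are reconstructed by symmetry from the others.
map1 = {'S': [('sa', 'A'), ('sb', 'B')],
        'A': [('sa', 'S'), ('ac', 'C'), ('ad', 'D')],
        'B': [('sb', 'S'), ('bd', 'D'), ('be', 'E')],
        'C': [('ac', 'A'), ('cf', 'F')],
        'D': [('ad', 'A'), ('bd', 'B'), ('df', 'F'), ('dh', 'H')],
        'E': [('be', 'B'), ('eh', 'H')],
        'F': [('cf', 'C'), ('df', 'D'), ('fg', 'G')],
        'G': [('fg', 'F'), ('hg', 'H')],
        'H': [('dh', 'D'), ('eh', 'E'), ('hg', 'G')]}

def bfs(successors, start, goal_test):
    """Return a list of (action, state) pairs from start to a goal
    state, or None if no goal is reachable."""
    if goal_test(start):
        return []
    agenda = deque([(start, [])])      # FIFO queue of (state, path)
    visited = {start}                  # states already put on the agenda
    while agenda:
        state, path = agenda.popleft()
        for action, next_state in successors[state]:
            if next_state in visited:
                continue
            new_path = path + [(action, next_state)]
            if goal_test(next_state):
                return new_path        # first goal found has fewest steps
            visited.add(next_state)
            agenda.append((next_state, new_path))
    return None

plan = bfs(map1, 'S', lambda x: x == 'E')
```

Because the agenda is first-in, first-out, paths are expanded in order of length, so the first plan returned uses the fewest steps, matching the criterion adopted above.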

