CMU CS 15780 - Lecture - D882841

Home> Schools> Carnegie Mellon University> Computer Science (CS) > CS 15780> Lecture

CMU CS 15780 - Lecture

Course Cs 15780- Graduate Artificial Intelligence

Pages 105

Download Save

Unformatted text preview:

15-780: Grad AILecture 15: PlanningGeoff Gordon (this lecture)Tuomas Sandholm TAs Erik Zawadzki, Abe OthmanReviewPlanning algorithms‣reduce to FOL (complications)‣or use subset of FOL (e.g., STRIPS)‣linear planner: add op to end of plan‣partial-order planner (operators, bindings, partial order, guards, open preconditions): resolve open precondSTRIPS: (world) state, operator = { preconditions } + { effects }, variable binding, goalsPlan GraphsPlanning & model searchFor a long time, it was thought that SAT-style model search was a non-starter as a planning algorithmMore recently, people have written fast planners that‣propositionalize the domain‣turn it into a CSP or SAT problem‣search for a modelPlan graphTool for making good CSPs: plan graphEncodes a subset of the constraints that plans must satisfyRemaining constraints are handled ‣during search (reject solutions that violate them)—needs special-purpose code‣or by adding extra clauses/constraintsExampleStart state: have(Cake)Goal: have(Cake) ∧ eaten(Cake)Operators: bake, eatEat‣pre: have(Cake)‣post: ¬have(Cake), eaten(Cake)Bake‣pre: ¬have(Cake)‣post: have(Cake)PropositionalizingNote: this domain is fully propositionalIf we had a general STRIPS domain, would have to pick a universe and propositionalizeE.g., eat(x) would become eat(Banana), eat(Cake), eat(Fred), …Plan graphAlternating levels: states and actionsFirst level: initial statehave¬eatenPlan graphFirst action level: all applicable actionsLinked to their preconditionshave¬eateneatPlan graphSecond state level: add effects of actions to get literals that could hold at step 2have¬eateneateaten¬havePlan graphAlso add maintenance actions to represent effect of doing nothinghave¬eateneathave¬eateneaten¬havePlan graphExtend another pair of levels: now bake is a possible actionhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan graphCan extend as far right as we wantPlan = subset of the actions at each action levelOrdering unspecified within a levelPlan graphIn addition to the above links, add mutex links to indicate mutually exclusive actions or literalshave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan graphLiterals are mutex if they are contradictoryhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan graphhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakeActions which assert contradictory literals are mutex (inconsistent effects)Plan graphLiterals are also mutex if there is no action or non-mutex pair of actions that could achieve both (inconsistent support)have¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan graphActions are also mutex if one deletes a precondition of other (interference), or if preconditions are mutex (competition)have¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakeMutex summaryFor each action level, left to right, check pairs of actions A, B (each check linear in rep’n size):‣inconsistent effects: check each effect of A vs. effects of B‣interference: effects of A vs. preconds of B‣competing preconditions: check mutex links on preconditions of A, BResults at action level L tell us (in)consistent support at proposition level L+1Getting a planBuild the plan graph out to some length kSearch:‣directly on the graph‣or by translating to SAT or CSPIf search succeeds, read off the planIf not, increment k and try againThere is a test to see if k is “big enough”Plan searchDFS w/ variable ordering based on plan graphStart from last level, fill in last action set, compute necessary preconditions, fill in 2nd-to-last action set, etc.If at some level there is no way to do any actions, or no way to fill in consistent preconditions, backtrackPlan searchhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan searchhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan searchhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan searchhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan searchhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan searchhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakePlan searchhave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebakeTranslation to SATOne variable for each pair of literals in state levelsOne variable per action in action levelsConstraints implement STRIPS semantics plus “hints”Solution tells us which actions are performed at each action level, which literals are true at each state levelAction constraintsEach action can only be executed if all of its preconditions are present:actt+1 ⇒ pre1t ∧ pre2t ∧ …If executed, action asserts its postconditions:actt+1 ⇒ post1t+2 ∧ post2t+2 ∧ …Literal constraintsIn order to achieve a literal, we must execute an action that achieves it‣postt+2 ⇒ act1t+1 ∨ act2t+1 ∨ …Might be a maintenance actionInitial & goal constraintsGoals must be satisfied at end: goal1T ∧ goal2T ∧ …And initial state holds at beginning:init11 ∧ init21 ∧ …Mutex constraintsMutex constraints between actions or literals: add clause (¬x ∨ ¬y)Mutexes are redundant, but help anywayTranslation to SAT: examplehave¬eateneathave¬eateneaten¬haveeathave¬eateneaten¬havebake1 2 3 4 5note: haven’t drawn all mutexes at levels 4 & 5Spatial PlanningPlans in Space…A* can be used for many thingsHere, A* for spatial planning (in contrast to, e.g., jobshop scheduling)Optimal Solution End-effector Trajectory Probability of Obstacle Appearing Probability of Obstacle AppearingSolution CostState ExpansionsFigure 10: Environment used in our second experiment, along with the optimal solution and the end-effector trajectory (withoutany dynamic obstacles). Also shown are the solution cost of the path traversed and the number of states expanded by each ofthe three algorithms compared.other words, by adding a fixed value to the key of each newstate placed on the queue, the old states are given a rela-tive advantage in their queue placement. When a state ispopped off the queue whose key value is not in line withthe current bias term, it is placed back on the queue with anupdated key value. The intuition is that only a small num-ber of the states previously on the queue may ever makeit to the top, so it can be much more efficient to only re-order the ones that do. We can use the same idea when �decreases (from �oto

View Full Document


School:
Email:
New Password:
Confirm Password:

CMU CS 15780 - Lecture

Sign up for free to view:

Please select your school