UNCC ITCS 3153 - Making Complex Decisions

Contents:
• Robot Example
• Similar to 15-puzzle problem
• How about other search techniques
• Markov decision processes (MDP)
• Building a policy
• Slide 7
• Using a policy
• Example solutions
• Striking a balance
• Attributes of optimality
• Time horizon
• Evaluating state sequences
• Evaluating infinite horizons
• Slide 15
• Evaluating a policy
• Building an optimal policy
• Utility of states
• Example
• Restating the policy
• Putting pieces together
• What a deal
• Example of Bellman Equation
• Using Bellman Equations to solve MDPs
• Iterative solution of Bellman equations
• Bellman Update
• Convergence of value iteration
• Slide 28
• Policy Iteration
• Policy iteration
• Slide 31
• Slide 32
• Slide 33

ITCS 3153 Artificial Intelligence
Lecture 19: Making Complex Decisions (Chapter 17)

Robot Example
A sequential decision problem: imagine a robot with only local sensing.
• Traveling from A to B
• Actions have uncertain results – the robot might move at a right angle to the desired direction
• We want the robot to "learn" how to navigate in this room

Similar to 15-puzzle problem
How is this similar to, and different from, the 15-puzzle?
• Let the robot's position be the blank tile
• Keep issuing movement commands
• Eventually a sequence of commands will cause the robot to reach the goal
The difference: our model of the world is incomplete.

How about other search techniques
Genetic algorithms:
• Let each "gene" be a sequence of L, R, U, D moves
  – Length unknown
  – Poor feedback
Simulated annealing?

Markov decision processes (MDP)
Initial state
• S0
Transition model
• T(s, a, s')
  – How does Markov apply here? The probability of reaching s' depends only on the current state s and the action a, not on the history of earlier states.
  – Uncertainty is possible: an action can have several possible outcomes.
Reward function
• R(s)
  – Defined for each state
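To make the three pieces concrete, here is a minimal Python sketch of the robot's grid world as an MDP: an initial state S0, a transition model T(s, a, s'), and a per-state reward R(s). The class name, grid layout, reward values, and the 0.8/0.1/0.1 split between the intended move and its two right-angle slips are illustrative assumptions; the preview gives only the structure, not the numbers.

```python
# Hypothetical grid-world MDP for the robot example.
# Rewards and slip probabilities below are assumed, not from the slides.

ACTIONS = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

# Each action's two right-angle neighbors (the directions a slip can take).
PERPENDICULAR = {"U": ("L", "R"), "D": ("L", "R"),
                 "L": ("U", "D"), "R": ("U", "D")}

class GridMDP:
    def __init__(self, width, height, goal, s0=(0, 0), slip=0.1):
        self.states = {(x, y) for x in range(width) for y in range(height)}
        self.s0 = s0            # initial state S0
        self.goal = goal
        self.slip = slip        # probability of each right-angle slip

    def R(self, s):
        """Reward function R(s), defined for each state (values assumed)."""
        return 1.0 if s == self.goal else -0.04

    def _move(self, s, a):
        """Deterministic effect of action a; bumping a wall leaves s unchanged."""
        dx, dy = ACTIONS[a]
        nxt = (s[0] + dx, s[1] + dy)
        return nxt if nxt in self.states else s

    def T(self, s, a):
        """Transition model T(s, a, s') as (probability, successor) pairs.
        The distribution depends only on s and a -- the Markov property."""
        left, right = PERPENDICULAR[a]
        return [(1 - 2 * self.slip, self._move(s, a)),
                (self.slip, self._move(s, left)),
                (self.slip, self._move(s, right))]
```

Representing T(s, a) as a short probability-weighted list of successors is what lets the Bellman machinery later in the deck compute expected utilities by plain summation.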
Building a policy
How might we acquire and store a solution?
• Is this a search problem?
  – Isn't everything?
• Avoid local minima
• Avoid dead ends
• Avoid needless repetition
Key observation: if the number of states is small, consider evaluating states rather than evaluating action sequences.

Building a policy
Specify a solution for any initial state:
• Construct a policy that outputs the best action for any state
  – policy = π
  – policy in state s = π(s)
• A complete policy covers all potential input states
• The optimal policy, π*, yields the highest expected utility
  – Why expected? Transitions are stochastic.

Using a policy
An agent in state s:
• s is the percept available to the agent
• π*(s) outputs an action that maximizes expected utility
The policy is a description of a simple reflex agent.
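Building on the GridMDP sketch above, a policy can be stored as a plain table from states to actions, and "using" it is exactly the reflex loop just described. The helper names (expected_utility, greedy_policy, run) are hypothetical, and U, a table of state utilities, is assumed to be given; the Bellman-equation slides later in the deck show how to compute it.

```python
import random

def expected_utility(mdp, U, s, a):
    """Expected utility of taking action a in state s, given state utilities U.
    The expectation is what 'highest expected utility' refers to: each
    stochastic outcome is weighted by its probability."""
    return sum(p * U[s2] for p, s2 in mdp.T(s, a))

def greedy_policy(mdp, U):
    """A complete policy: the utility-maximizing action for every state."""
    return {s: max(ACTIONS, key=lambda a: expected_utility(mdp, U, s, a))
            for s in mdp.states}

def run(mdp, policy, max_steps=100):
    """Simple reflex loop: the percept is the state s, the response is policy[s]."""
    s = mdp.s0
    while s != mdp.goal and max_steps > 0:
        outcomes = mdp.T(s, policy[s])                  # stochastic transition
        s = random.choices([s2 for _, s2 in outcomes],
                           weights=[p for p, _ in outcomes])[0]
        max_steps -= 1
    return s
```

With utilities produced by value iteration plugged in for U, greedy_policy returns π*; note that the agent never searches at run time, it just looks up π*(s) for whatever state it perceives.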

