CS 188: Artificial Intelligence, Fall 2008

Contents:
- Recap: Resource Limits
- Evaluation for Pacman
- Iterative Deepening
- α-β Pruning Example
- α-β Pruning
- α-β Pruning Pseudocode
- α-β Pruning Properties
- Expectimax Search Trees
- Maximum Expected Utility
- Reminder: Probabilities
- What are Probabilities?
- Uncertainty Everywhere
- Expectations
- Utilities
- Expectimax Search
- Expectimax Pseudocode
- Expectimax for Pacman
- Expectimax Pruning?
- Expectimax Evaluation
- Mixed Layer Types
- Stochastic Two-Player
- Non-Zero-Sum Games
- Preferences
- Rational Preferences
- Slide 27
- MEU Principle
- Human Utilities
- Utility Scales
- Example: Insurance
- Money
- Example: Human Rationality?

CS 188: Artificial Intelligence, Fall 2008
Lecture 7: Expectimax Search
9/18/2008
Dan Klein – UC Berkeley
Many slides over the course adapted from either Stuart Russell or Andrew Moore

Recap: Resource Limits
- Cannot search to leaves
- Depth-limited search: instead, search a limited depth of the tree
- Replace terminal utilities with an eval function for non-terminal positions
- Guarantee of optimal play is gone
- Replanning agents: search to choose the next action; replan each new turn in response to the new state
[Figure: depth-limited tree, max over min nodes, with leaf values -1, -2, 4, 9]

Evaluation for Pacman
[DEMO: thrashing, smart ghosts]

Iterative Deepening
Iterative deepening uses DFS as a subroutine:
1. Do a DFS which only searches for paths of length 1 or less. (DFS gives up on any path of length 2)
2. If "1" failed, do a DFS which only searches paths of length 2 or less.
3.
If "2" failed, do a DFS which only searches paths of length 3 or less.
...and so on.
This works for single-agent search as well!
Why do we want to do this for multiplayer games?

α-β Pruning Example
[Figure: worked pruning example]

α-β Pruning
General configuration:
- α is the best value that MAX can get at any choice point along the current path
- If n becomes worse than α, MAX will avoid it, so we can stop considering n's other children
- Define β similarly for MIN
[Figure: alternating Player/Opponent layers with node n]

α-β Pruning Pseudocode
[Figure: pseudocode for computing the value v]

α-β Pruning Properties
- Pruning has no effect on the final result
- Good move ordering improves the effectiveness of pruning
- With "perfect ordering": time complexity drops to O(b^(m/2)), which doubles the solvable depth
- Full search of, e.g., chess is still hopeless!
- A simple example of metareasoning: here, reasoning about which computations are relevant

Expectimax Search Trees
- What if we don't know what the result of an action will be? E.g.:
  - In solitaire, the next card is unknown
  - In minesweeper, the mine locations
  - In Pacman, the ghosts act randomly
- Can do expectimax search: chance nodes, like min nodes, except the outcome is uncertain
- Calculate expected utilities
- Max nodes as in minimax search; chance nodes take the average (expectation) of the values of their children
- Later, we'll learn how to formalize the underlying problem as a Markov Decision Process
[Figure: max node over chance nodes, with leaf values 10, 4, 5, 7]
[DEMO: minVsExp]

Maximum Expected Utility
- Why should we average utilities?
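The α-β pruning procedure described above (whose pseudocode figure did not survive extraction) can be sketched as follows; the nested-list game-tree representation, with MAX to move at the root, is an assumption for illustration, not the lecture's own code:

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    """Minimax with alpha-beta pruning over a toy game tree.

    A leaf is a number; an internal node is a list of children.
    alpha is the best value MAX can guarantee on the current path,
    beta the best value MIN can guarantee; once beta <= alpha, the
    remaining children cannot affect the result and are pruned.
    """
    if not isinstance(node, list):      # terminal: return its utility
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, v)
            if beta <= alpha:           # MIN will never allow this branch
                break
        return v
    else:
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, alpha, beta, True))
            beta = min(beta, v)
            if beta <= alpha:           # MAX already has something better
                break
        return v

# 2-ply example: MAX over three MIN nodes
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, -math.inf, math.inf, True))  # -> 3
```

Note how the second MIN node is abandoned after its first child (2), since MAX already has 3 guaranteed from the first branch; this is exactly why pruning never changes the final result.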
- Why not minimax?
- Principle of maximum expected utility: an agent should choose the action which maximizes its expected utility, given its knowledge
- General principle for decision making
- Often taken as the definition of rationality
- We'll see this idea over and over in this course!
- Let's decompress this definition…

Reminder: Probabilities
- A random variable represents an event whose outcome is unknown
- A probability distribution is an assignment of weights to outcomes
- Example: traffic on the freeway?
  - Random variable: T = whether there's traffic
  - Outcomes: T in {none, light, heavy}
  - Distribution: P(T=none) = 0.25, P(T=light) = 0.55, P(T=heavy) = 0.20
- Some laws of probability (more later):
  - Probabilities are always non-negative
  - Probabilities over all possible outcomes sum to one
- As we get more evidence, probabilities may change: P(T=heavy) = 0.20, but P(T=heavy | Hour=8am) = 0.60
- We'll talk about methods for reasoning about and updating probabilities later

What are Probabilities?
- Objectivist / frequentist answer:
  - Averages over repeated experiments
  - E.g., empirically estimating P(rain) from historical observation
  - An assertion about how future experiments will go (in the limit)
  - New evidence changes the reference class
  - Makes one think of inherently random events, like rolling dice
- Subjectivist / Bayesian answer:
  - Degrees of belief about unobserved variables
  - E.g., an agent's belief that it's raining, given the temperature
  - E.g., Pacman's belief that the ghost will turn left, given the state
  - Often learn probabilities from past experiences (more later)
  - New evidence updates beliefs (more later)

Uncertainty Everywhere
- Not just for games of chance!
  - I'm snuffling: am I sick?
  - Email contains "FREE!": is it spam?
  - Tooth hurts: have a cavity?
  - Is 60 min enough to get to the airport?
  - Robot rotated its wheel three times; how far did it advance?
  - Safe to cross the street?
(Look both ways!)
- Why can a random variable have uncertainty?
  - Inherently random process (dice, etc.)
  - Insufficient or weak evidence
  - Ignorance of underlying processes
  - Unmodeled variables
  - The world's just noisy!
- Compare to fuzzy logic, which has degrees of truth, rather than just degrees of belief

Expectations
- Real-valued functions of random variables have expectations
- Expectation of a function of a random variable: E[f(X)] = Σ_x P(X=x) f(x)
- Example: expected value of a fair die roll

  X   P     f
  1   1/6   1
  2   1/6   2
  3   1/6   3
  4   1/6   4
  5   1/6   5
  6   1/6   6

Utilities
- Utilities are functions from outcomes (states of the world) to real numbers that describe an agent's preferences
- Where do utilities come from?
  - In a game, they may be simple (+1/-1)
  - Utilities summarize the agent's goals
- Theorem: any set of preferences between outcomes can be summarized as a utility function (provided the preferences meet certain conditions)
- In general, we hard-wire utilities and let actions emerge (why don't we let agents decide their own utilities?)
- More on utilities soon…

Expectimax Search
- In expectimax search, we have a probabilistic model of how the opponent (or environment) will behave in any state
  - The model could be a simple uniform distribution (roll a die)
  - The model could be sophisticated and require a great deal of computation
  - We have a node for every outcome out of our control: opponent or environment
  - The model might say that adversarial actions are likely!
- For now, assume that for any state we magically have a distribution to assign probabilities to opponent actions / environment outcomes
- Having a probabilistic belief about an agent's action does not mean that the agent is flipping any coins!
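The die-roll expectation in the table above can be checked numerically; this small sketch uses Python's `fractions` module to keep the 1/6 weights exact:

```python
from fractions import Fraction

# Fair die: outcomes x in {1..6}, each with P(x) = 1/6, and f(x) = x,
# matching the X / P / f table above.
dist = {x: Fraction(1, 6) for x in range(1, 7)}

# E[f(X)] = sum over outcomes of P(x) * f(x)
expected = sum(p * x for x, p in dist.items())
print(expected)         # -> 7/2
print(float(expected))  # -> 3.5
```

So the expected value of a fair die roll is 7/2 = 3.5, even though 3.5 is not itself a possible outcome; an expectation is a probability-weighted average, not a prediction of any single roll.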
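The expectimax recursion described above can be sketched as follows; the tuple-based tree encoding and the uniform 0.5/0.5 outcome probabilities are assumptions for illustration, with leaf values taken from the 10, 4, 5, 7 example tree:

```python
def expectimax(node):
    """Expectimax value of a node.

    A node is either a number (terminal utility), a ("max", children)
    tuple, or a ("chance", [(prob, child), ...]) tuple.  Max nodes take
    the best child value; chance nodes take the probability-weighted
    average (the expectation) of their children's values.
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectimax(c) for c in children)
    # chance node: expected value over outcomes
    return sum(p * expectimax(c) for p, c in children)

# Max node over two chance nodes with uniform outcomes
tree = ("max", [
    ("chance", [(0.5, 10), (0.5, 4)]),   # expectation 7.0
    ("chance", [(0.5, 5), (0.5, 7)]),    # expectation 6.0
])
print(expectimax(tree))  # -> 7.0
```

With uniform probabilities, the left chance node averages (10 + 4) / 2 = 7 and the right averages (5 + 7) / 2 = 6, so the max root chooses 7; a minimax player would instead see mins of 4 and 5 and choose the other branch, which is exactly the min-vs-exp contrast the lecture's demo illustrates.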