CS 188: Artificial Intelligence, Fall 2008

Contents:
- Recap: Resource Limits
- Evaluation for Pacman
- Iterative Deepening
- α-β Pruning Example
- α-β Pruning
- α-β Pruning Pseudocode
- α-β Pruning Properties
- Expectimax Search Trees
- Maximum Expected Utility
- Reminder: Probabilities
- What are Probabilities?
- Uncertainty Everywhere
- Expectations
- Utilities
- Expectimax Search
- Expectimax Pseudocode
- Expectimax for Pacman
- Expectimax Pruning?
- Expectimax Evaluation
- Mixed Layer Types
- Stochastic Two-Player
- Non-Zero-Sum Games
- Preferences
- Rational Preferences
- Slide 27
- MEU Principle
- Human Utilities
- Utility Scales
- Example: Insurance
- Money
- Example: Human Rationality?

CS 188: Artificial Intelligence, Fall 2008
Lecture 7: Expectimax Search
9/18/2008
Dan Klein – UC Berkeley
Many slides over the course adapted from either Stuart Russell or Andrew Moore

Recap: Resource Limits
- Cannot search to leaves
- Depth-limited search: instead, search a limited depth of the tree
- Replace terminal utilities with an eval function for non-terminal positions
- Guarantee of optimal play is gone
- Replanning agents: search to choose the next action; replan each new turn in response to the new state
[Figure: depth-limited tree, max over min nodes, with leaf values -1, -2, 4, 9]

Evaluation for Pacman
[DEMO: thrashing, smart ghosts]

Iterative Deepening
Iterative deepening uses DFS as a subroutine:
1. Do a DFS which only searches for paths of length 1 or less. (DFS gives up on any path of length 2)
2. If "1" failed, do a DFS which only searches paths of length 2 or less.
3.
If "2" failed, do a DFS which only searches paths of length 3 or less.
...and so on.
This works for single-agent search as well!
Why do we want to do this for multiplayer games?

α-β Pruning Example
[Figure: worked pruning example]

α-β Pruning
General configuration:
- α is the best value that MAX can get at any choice point along the current path
- If n becomes worse than α, MAX will avoid it, so we can stop considering n's other children
- Define β similarly for MIN
[Figure: alternating Player/Opponent layers with node n]

α-β Pruning Pseudocode
[Figure: pseudocode for computing the value v]

α-β Pruning Properties
- Pruning has no effect on the final result
- Good move ordering improves the effectiveness of pruning
- With "perfect ordering": time complexity drops to O(b^(m/2)), which doubles the solvable depth
- Full search of, e.g., chess is still hopeless!
- A simple example of metareasoning: here, reasoning about which computations are relevant

Expectimax Search Trees
- What if we don't know what the result of an action will be? E.g.:
  - In solitaire, the next card is unknown
  - In minesweeper, the mine locations
  - In Pacman, the ghosts act randomly
- Can do expectimax search: chance nodes, like min nodes, except the outcome is uncertain
- Calculate expected utilities
- Max nodes as in minimax search; chance nodes take the average (expectation) of the values of their children
- Later, we'll learn how to formalize the underlying problem as a Markov Decision Process
[Figure: max node over chance nodes, with leaf values 10, 4, 5, 7]
[DEMO: minVsExp]

Maximum Expected Utility
- Why should we average utilities?
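The α-β pruning procedure described above (whose pseudocode figure did not survive extraction) can be sketched as follows; the nested-list game-tree representation, with MAX to move at the root, is an assumption for illustration, not the lecture's own code:

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    """Minimax with alpha-beta pruning over a toy game tree.

    A leaf is a number; an internal node is a list of children.
    alpha is the best value MAX can guarantee on the current path,
    beta the best value MIN can guarantee; once beta <= alpha, the
    remaining children cannot affect the result and are pruned.
    """
    if not isinstance(node, list):      # terminal: return its utility
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, v)
            if beta <= alpha:           # MIN will never allow this branch
                break
        return v
    else:
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, alpha, beta, True))
            beta = min(beta, v)
            if beta <= alpha:           # MAX already has something better
                break
        return v

# 2-ply example: MAX over three MIN nodes
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, -math.inf, math.inf, True))  # -> 3
```

Note how the second MIN node is abandoned after its first child (2), since MAX already has 3 guaranteed from the first branch; this is exactly why pruning never changes the final result.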
- Why not minimax?
- Principle of maximum expected utility: an agent should choose the action which maximizes its expected utility, given its knowledge
- General principle for decision making
- Often taken as the definition of rationality
- We'll see this idea over and over in this course!
- Let's decompress this definition…

Reminder: Probabilities
- A random variable represents an event whose outcome is unknown
- A probability distribution is an assignment of weights to outcomes
- Example: traffic on the freeway?
  - Random variable: T = whether there's traffic
  - Outcomes: T in {none, light, heavy}
  - Distribution: P(T=none) = 0.25, P(T=light) = 0.55, P(T=heavy) = 0.20
- Some laws of probability (more later):
  - Probabilities are always non-negative
  - Probabilities over all possible outcomes sum to one
- As we get more evidence, probabilities may change: P(T=heavy) = 0.20, but P(T=heavy | Hour=8am) = 0.60
- We'll talk about methods for reasoning about and updating probabilities later

What are Probabilities?
- Objectivist / frequentist answer:
  - Averages over repeated experiments
  - E.g., empirically estimating P(rain) from historical observation
  - An assertion about how future experiments will go (in the limit)
  - New evidence changes the reference class
  - Makes one think of inherently random events, like rolling dice
- Subjectivist / Bayesian answer:
  - Degrees of belief about unobserved variables
  - E.g., an agent's belief that it's raining, given the temperature
  - E.g., Pacman's belief that the ghost will turn left, given the state
  - Often learn probabilities from past experiences (more later)
  - New evidence updates beliefs (more later)

Uncertainty Everywhere
- Not just for games of chance!
  - I'm snuffling: am I sick?
  - Email contains "FREE!": is it spam?
  - Tooth hurts: have a cavity?
  - Is 60 min enough to get to the airport?
  - Robot rotated its wheel three times; how far did it advance?
  - Safe to cross the street?
(Look both ways!)
- Why can a random variable have uncertainty?
  - Inherently random process (dice, etc.)
  - Insufficient or weak evidence
  - Ignorance of underlying processes
  - Unmodeled variables
  - The world's just noisy!
- Compare to fuzzy logic, which has degrees of truth, rather than just degrees of belief

Expectations
- Real-valued functions of random variables have expectations
- Expectation of a function of a random variable: E[f(X)] = Σ_x P(X=x) f(x)
- Example: expected value of a fair die roll

  X   P     f
  1   1/6   1
  2   1/6   2
  3   1/6   3
  4   1/6   4
  5   1/6   5
  6   1/6   6

Utilities
- Utilities are functions from outcomes (states of the world) to real numbers that describe an agent's preferences
- Where do utilities come from?
  - In a game, they may be simple (+1/-1)
  - Utilities summarize the agent's goals
- Theorem: any set of preferences between outcomes can be summarized as a utility function (provided the preferences meet certain conditions)
- In general, we hard-wire utilities and let actions emerge (why don't we let agents decide their own utilities?)
- More on utilities soon…

Expectimax Search
- In expectimax search, we have a probabilistic model of how the opponent (or environment) will behave in any state
  - The model could be a simple uniform distribution (roll a die)
  - The model could be sophisticated and require a great deal of computation
  - We have a node for every outcome out of our control: opponent or environment
  - The model might say that adversarial actions are likely!
- For now, assume that for any state we magically have a distribution to assign probabilities to opponent actions / environment outcomes
- Having a probabilistic belief about an agent's action does not mean that the agent is flipping any coins!
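The die-roll expectation in the table above can be checked numerically; this small sketch uses Python's `fractions` module to keep the 1/6 weights exact:

```python
from fractions import Fraction

# Fair die: outcomes x in {1..6}, each with P(x) = 1/6, and f(x) = x,
# matching the X / P / f table above.
dist = {x: Fraction(1, 6) for x in range(1, 7)}

# E[f(X)] = sum over outcomes of P(x) * f(x)
expected = sum(p * x for x, p in dist.items())
print(expected)         # -> 7/2
print(float(expected))  # -> 3.5
```

So the expected value of a fair die roll is 7/2 = 3.5, even though 3.5 is not itself a possible outcome; an expectation is a probability-weighted average, not a prediction of any single roll.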
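The expectimax recursion described above can be sketched as follows; the tuple-based tree encoding and the uniform 0.5/0.5 outcome probabilities are assumptions for illustration, with leaf values taken from the 10, 4, 5, 7 example tree:

```python
def expectimax(node):
    """Expectimax value of a node.

    A node is either a number (terminal utility), a ("max", children)
    tuple, or a ("chance", [(prob, child), ...]) tuple.  Max nodes take
    the best child value; chance nodes take the probability-weighted
    average (the expectation) of their children's values.
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectimax(c) for c in children)
    # chance node: expected value over outcomes
    return sum(p * expectimax(c) for p, c in children)

# Max node over two chance nodes with uniform outcomes
tree = ("max", [
    ("chance", [(0.5, 10), (0.5, 4)]),   # expectation 7.0
    ("chance", [(0.5, 5), (0.5, 7)]),    # expectation 6.0
])
print(expectimax(tree))  # -> 7.0
```

With uniform probabilities, the left chance node averages (10 + 4) / 2 = 7 and the right averages (5 + 7) / 2 = 6, so the max root chooses 7; a minimax player would instead see mins of 4 and 5 and choose the other branch, which is exactly the min-vs-exp contrast the lecture's demo illustrates.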