Adversarial Search and Game PlayingTopicsWhat & WhyGame-Playing AgentRelation of Games to SearchTypes of gamesSlide 7Typical caseNimNIM(2,2)Game Search FormulationPartial Game Tree for Tic-Tac-ToeGame Tree for NIM(2,2)Optimal strategiesOptimal PlayMinimax TreeTwo-Ply Game TreeSlide 18Slide 19Slide 20Min-Max Game Tree for NIM(2,2)What if MIN does not play optimally?Minimax AlgorithmProperties of MinimaxSlide 26Problem of minimax searchAlpha-beta pruningAlpha-Beta ExampleAlpha-Beta Example (continued)Slide 31Slide 32Slide 33Slide 34Slide 35Slide 36Slide 37Alpha-Beta AlgorithmSlide 39General alpha-beta pruningSlide 41Slide 42Comments: Alpha-Beta PruningImperfect Real-time DecisionsCutting off searchEvaluation functionEvaluation function examplesIssuesAdaptive SearchGames that include chanceSlide 51Slide 52Slide 53Expected minimax valueSome examples….CheckersChinook vs. TinsleyChinookBackgammonChessMan vs. MachineSlide 63Several versions of the story…Reversi/OthelloOthelloGo: On the One SideGo: And on the OtherPerspective on Games: ProPerspective on Games: ConSummaryAdditional ResourcesAdversarial Search and Adversarial Search and Game PlayingGame PlayingRussell and Norvig: Chapter 6CMSC 421 – Fall 2006TopicsGame playingGame treesMinimaxAlpha-beta pruningExamplesWhat & WhyGames are a form of multi-agent environmentWhat do other agents do and how do they affect our success?Cooperative vs. competitive Competitive multi-agent environments give rise to adversarial search a.k.a. g amesWhy study games?Fun; historically entertainingInteresting subject of study because they are hardEasy to represent and agents restricted to small number of actionsGame-Playing AgentGame-Playing Agentenvironmentagent?sensorsactuatorsEnvironmentRelation of Games to SearchSearch – no adversarySolution is (heuristic) method for finding goalHeuristics and CSP techniques can find optimal solutionEvaluation function: estimate of cost from start to goal through given nodeExamples: path planning, scheduling activitiesGames – adversarySolution is strategy (strategy specifies move for every possible opponent reply).Time limits force an approximate solutionEvaluation function: evaluate “goodness” of game positionExamples: chess, checkers, Othello, backgammonTypes of gamesperfect informationimperfect informationdeterministicchanceTypes of gameschess, checkers, go, othellobackgammon, monopolybridge, poker, scrabbleperfect informationimperfect informationdeterministicchanceTypical case2-person gamePlayers alternate moves Zero-sum: one player’s loss is the other’s gainPerfect information: both players have access to complete information about the state of the game. No information is hidden from either player.No chance (e.g., using dice) involved Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim, OthelloNot: Bridge, Solitaire, Backgammon, ...NimGame: start with a punch of piles of tokensAt each turn, a player choose one of the piles, and may pick up one or two tokens from the pileThe player who picks up the last token LOSESNIM(2,2)Try the following case:pile 1 pile 2Game Search FormulationGame Search Formulation Two players MAX and MIN take turns (with MAX playing first) State space Initial state Successor function Terminal test Utility function, that tells whether a terminal state is a win (for MAX), a loss, or a draw MAX uses search tree to determine next move.Partial Game Tree for Tic-Tac-ToeGame Tree for NIM(2,2)Optimal strategiesFind the contingent strategy for MAX assuming an infallible MIN opponent.Assumption: Both players play optimally !!Given a game tree, the optimal strategy can be determined by using the minimax value of each node:MINIMAX-VALUE(n)=UTILITY(n) If n is a terminalmaxs successors(n) MINIMAX-VALUE(s) If n is a max nodemins successors(n) MINIMAX-VALUE(s) If n is a min nodeOptimal PlayMAXMINThis is the optimal play2 7 182 7 1822 7 182 12 7 182 122 7 18Minimax TreeMAX nodeMIN nodef valuevalue computed by minimaxTwo-Ply Game TreeTwo-Ply Game TreeTwo-Ply Game TreeTwo-Ply Game TreeThe minimax decisionMinimax maximizes the worst-case outcome for max.Min-Max Game Tree for NIM(2,2)What if MIN does not play optimally?Definition of optimal play for MAX assumes MIN plays optimally: maximizes worst-case outcome for MAX.But if MIN does not play optimally, MAX can do even better.Minimax Algorithmfunction MINIMAX-DECISION(state) returns an action inputs: state, current state in game vMAX-VALUE(state) return the action in SUCCESSORS(state) with value vfunction MIN-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v ∞ for a,s in SUCCESSORS(state) do v MIN(v,MAX-VALUE(s)) return vfunction MAX-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v ∞ for a,s in SUCCESSORS(state) do v MAX(v,MIN-VALUE(s)) return vProperties of MinimaxCriterion MinimaxProperties of MinimaxCriterion MinimaxComplete?Optimal?TimeSpaceyes, O(bm), O(bm), yes, in theory…Problem of minimax searchNumber of games states is exponential to the number of moves.Solution: Do not examine every node ==> Alpha-beta pruningRemove branches that do not influence final decisionAlpha-beta pruningWe can improve on the performance of the minimax algorithm through alpha-beta pruningBasic idea: “If you have an idea that is surely bad, don't take the time to see how truly awful it is.” -- Pat Winston 2 7 1=2>=2<=1?•We don’t need to compute the value at this node.•No matter what it is, it can’t affect the value of the root node.MAXMAXMINAlpha-Beta Example[-∞, +∞][-∞,+∞]Range of possible valuesDo DF-search until first leafAlpha-Beta Example (continued)[-∞,3][-∞,+∞]Alpha-Beta Example (continued)[-∞,3][-∞,+∞]Alpha-Beta Example (continued)[3,+∞][3,3]Alpha-Beta Example (continued)[-∞,2][3,+∞][3,3]This node is worse for MAXAlpha-Beta Example (continued)[-∞,2][3,14][3,3] [-∞,14],Alpha-Beta Example (continued)[−∞,2][3,5][3,3] [-∞,5],Alpha-Beta Example (continued)[2,2][−∞,2][3,3][3,3]Alpha-Beta Example (continued)[2,2][-∞,2][3,3][3,3]Alpha-Beta Algorithmfunction ALPHA-BETA-SEARCH(state) returns an action inputs: state, current state in game vMAX-VALUE(state, - ∞ , +∞) return the action in SUCCESSORS(state) with value vfunction MAX-VALUE(state, , ) returns a utility value if TERMINAL-TEST(state)
View Full Document