CS421: Intro to AI1 Hal Daumé III ([email protected])Adversarial SearchHal Daumé IIIComputer ScienceUniversity of [email protected] 421: Introduction to Artificial Intelligence9 Feb 2012Many slides courtesy ofDan Klein, Stuart Russell,or Andrew MooreCS421: Intro to AI2 Hal Daumé III ([email protected]) Announcements➢NoneCS421: Intro to AI3 Hal Daumé III ([email protected])Adversarial Search[DEMO: mystery pacman]CS421: Intro to AI4 Hal Daumé III ([email protected])Game Playing State-of-the-Art➢Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions. Checkers is now solved!➢Chess: Deep Blue defeated human world champion Gary Kasparov in a six-game match in 1997. Deep Blue examined 200 million positions per second, used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply.➢Othello: human champions refuse to compete against computers, which are too good.➢Go: human champions refuse to compete against computers, which are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.➢Pacman: unknownCS421: Intro to AI5 Hal Daumé III ([email protected])GamesCraftershttp://gamescrafters.berkeley.edu/CS421: Intro to AI6 Hal Daumé III ([email protected])Game Playing➢Many different kinds of games!➢Axes:➢Deterministic or stochastic?➢One, two or more players?➢Perfect information (can you see the state)?➢Want algorithms for calculating a strategy (policy) which recommends a move in each state Examples? Deterministic, 1 player, perfect information? Deterministic, 1 player, imperfect information? Deterministic, >1 player, perfect information? Deterministic, >1 player, imperfect information? Stochastic, 1 player, perfect information? Stochastic, 1 player, imperfect information? Stochastic, >1 player, perfect information? Stochastic, >1 player, imperfect information? http://u.hal3.name/ic.pl?q=gameCS421: Intro to AI7 Hal Daumé III ([email protected])Deterministic Games➢Many possible formalizations, one is:➢States: S (start at s0)➢Players: P={1...N} (usually take turns)➢Actions: A (may depend on player / state)➢Transition Function: SxA → S➢Terminal Test: S → {t,f}➢Terminal Utilities: SxP → R➢Solution for a player is a policy: S → ACS421: Intro to AI8 Hal Daumé III ([email protected])Deterministic Single-Player?➢Deterministic, single player, perfect information:➢Know the rules➢Know what actions do➢Know when you win➢E.g. Freecell, 8-Puzzle, Rubik’s cube➢… it’s just search!➢Slight reinterpretation:➢Each node stores a value: the best outcome it can reach➢This is the maximal outcome of its children➢Note that we don’t have path sums as before (utilities at end)➢After search, can pick move that leads to best nodewin loseloseCS421: Intro to AI9 Hal Daumé III ([email protected])Deterministic Two-Player➢E.g. tic-tac-toe, chess, checkers➢Minimax search➢A state-space search tree➢Players alternate➢Each layer, or ply, consists of a round of moves➢Choose move to position with highest minimax value = best achievable utility against best play➢Zero-sum games➢One player maximizes result➢The other minimizes result8 2 5 6maxminCS421: Intro to AI10 Hal Daumé III ([email protected])Tic-tac-toe Game TreeCS421: Intro to AI11 Hal Daumé III ([email protected])Minimax ExampleCS421: Intro to AI12 Hal Daumé III ([email protected])Minimax SearchCS421: Intro to AI13 Hal Daumé III ([email protected])Minimax Properties➢Optimal against a perfect player. Otherwise?➢Time complexity?➢O(bm)➢Space complexity?➢O(bm)➢For chess, b ≈ 35, m ≈ 100➢Exact solution is completely infeasible➢But, do we need to explore the whole tree?10 10 9 100maxminCS421: Intro to AI14 Hal Daumé III ([email protected])Resource Limits➢Cannot search to leaves➢Depth-limited search➢Instead, search a limited depth of tree➢Replace terminal utilities with an eval function for non-terminal positions➢Guarantee of optimal play is gone➢More plies makes a BIG difference➢[DEMO: limitedDepth]➢Example:➢Suppose we have 100 seconds, can explore 10K nodes / sec➢So can check 1M nodes per move➢α-β reaches about depth 8 – decent chess program? ? ? ?-1 -2 4 94min minmax-2 4CS421: Intro to AI15 Hal Daumé III ([email protected])Evaluation Functions➢Function which scores non-terminals➢Ideal function: returns the utility of the position➢In practice: typically weighted linear sum of features:➢e.g. f1(s) = (num white queens – num black queens), etc.CS421: Intro to AI16 Hal Daumé III ([email protected])Evaluation for Pacman[DEMO: thrashing, smart ghosts]CS421: Intro to AI17 Hal Daumé III ([email protected])Iterative DeepeningIterative deepening uses DFS as a subroutine:1. Do a DFS which only searches for paths of length 1 or less. (DFS gives up on any path of length 2)2. If “1” failed, do a DFS which only searches paths of length 2 or less.3. If “2” failed, do a DFS which only searches paths of length 3 or less.….and so on.This works for single-agent search as well!Why do we want to do this for multiplayer games?…bCS421: Intro to AI18 Hal Daumé III ([email protected])Pruning in Minimax Search[-∞,+∞]3 12 8 2 14 5 2[-∞,3] [-∞,2] [-∞,14][3,3][-∞,5][2,2][3,+∞][3,14][3,5][3,3]CS421: Intro to AI19 Hal Daumé III ([email protected])α-β Pruning ExampleCS421: Intro to AI20 Hal Daumé III ([email protected])α-β Pruning➢General configuration➢α is the best value that MAX can get at any choice point along the current path➢If n becomes worse than α, MAX will avoid it, so can stop considering n’s other children➢Define β similarly for MINPlayerOpponentPlayerOpponentαnCS421: Intro to AI21 Hal Daumé III ([email protected])α-β Pruning PseudocodeβvCS421: Intro to AI22 Hal Daumé III ([email protected])α-β Pruning Properties➢Pruning has no effect on final result➢Good move ordering improves effectiveness of pruning➢With “perfect ordering”:➢Time complexity drops to O(bm/2)➢Doubles solvable depth➢Full search of, e.g. chess, is still hopeless!➢A simple example of metareasoning, here reasoning about which computations are relevantCS421: Intro to AI23 Hal Daumé III ([email protected])Non-Zero-Sum Games➢Similar to minimax:➢Utilities are now tuples➢Each player maximizes their own entry at each
View Full Document