Contents
- CS 188: Artificial Intelligence Fall 2008
- Announcements
- Local Search
- Hill Climbing
- Hill Climbing Diagram
- Simulated Annealing
- Slide 7
- Beam Search
- Genetic Algorithms
- Example: N-Queens
- Adversarial Search
- Game Playing State-of-the-Art
- GamesCrafters
- Game Playing
- Deterministic Games
- Deterministic Single-Player?
- Deterministic Two-Player
- Tic-tac-toe Game Tree
- Minimax Example
- Minimax Search
- Minimax Properties
- Resource Limits
- Evaluation Functions
- Evaluation for Pacman
- Iterative Deepening
- α-β Pruning Example
- α-β Pruning
- α-β Pruning Pseudocode
- α-β Pruning Properties
- Non-Zero-Sum Games
- Stochastic Single-Player
- Stochastic Two-Player
- Slide 34
- What's Next?

CS 188: Artificial Intelligence
Fall 2008
Lecture 6: Adversarial Search
9/16/2008
Dan Klein – UC Berkeley
Many slides over the course adapted from either Stuart Russell or Andrew Moore

Announcements
- Project 2 is up (Multi-Agent Pacman)
- Other announcements:
  - No more Friday project deadlines (they make it hard to use slip days)
  - After this week, we'll use section and the drop box for written assignments rather than lecture
  - Sanity checker issues – informal poll
- Looking for partners? Workload is balanced for pairs.

Local Search
- Queue-based algorithms keep fallback options (backtracking)
- Local search: improve what you have until you can't make it better
- Generally much more efficient (but incomplete)

Hill Climbing
- Simple, general idea:
  - Start wherever
  - Always choose the best neighbor
  - If no neighbor has a better score than the current state, quit
- Why can this be a terrible idea?
- Complete? Optimal?
- What's good about it?

Hill Climbing Diagram
- Random restarts?
- Random sideways steps?

Simulated Annealing
- Idea: escape local maxima by allowing downhill moves
- But make them rarer as time goes on

Simulated Annealing
- Theoretical guarantee:
  - Stationary distribution: p(x) ∝ e^(E(x)/kT) (the Boltzmann distribution)
  - If T is decreased slowly enough, the process will converge to the optimal state!
- Is this an interesting guarantee?
- Sounds like magic, but reality is reality:
  - The more downhill steps you need to escape, the less likely you are to ever make them all in a row
  - People think hard about ridge operators which let you jump around the space in better ways

Beam Search
- Like hill-climbing search, but keep K states at all times
- Variables: beam size; encourage diversity?
- The best choice in MANY practical settings
- Complete? Optimal?
- Why do we still need optimal methods?
(diagram: greedy search vs. beam search)

Genetic Algorithms
- Genetic algorithms use a natural-selection metaphor
- Like beam search (selection), but also have pairwise crossover operators, with optional mutation
- Probably the most misunderstood, misapplied (and even maligned) technique around!

Example: N-Queens
- Why does crossover make sense here?
- When wouldn't it make sense?
- What would mutation be?
- What would a good fitness function be?

Adversarial Search
[DEMO: mystery pacman]

Game Playing State-of-the-Art
- Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions.
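The simulated-annealing loop sketched on the earlier slides can be written out as follows; the `neighbors`, `score`, and `schedule` callbacks are hypothetical stand-ins, not names from the lecture:

```python
import math
import random

def simulated_annealing(start, neighbors, score, schedule, steps=10_000):
    """Maximize score(state), escaping local maxima via occasional
    downhill moves. neighbors(s) -> candidate states, schedule(t) ->
    temperature T (all hypothetical interfaces for illustration)."""
    current = start
    for t in range(1, steps):
        T = schedule(t)          # temperature: high early, low late
        if T <= 0:
            break
        nxt = random.choice(neighbors(current))
        delta = score(nxt) - score(current)
        # Uphill moves are always taken; downhill moves are taken with
        # probability exp(delta / T), which shrinks as T cools.
        if delta > 0 or random.random() < math.exp(delta / T):
            current = nxt
    return current
```

For example, maximizing -(x - 3)² over the integers with neighbors x ± 1 and schedule T = 0.99^t settles at x = 3 with high probability, since late in the run downhill moves are almost never accepted.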
  Checkers is now solved!
- Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue examined 200 million positions per second and used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply.
- Othello: human champions refuse to compete against computers, which are too good.
- Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
- Pacman: unknown

GamesCrafters
http://gamescrafters.berkeley.edu/

Game Playing
- Many different kinds of games!
- Axes:
  - Deterministic or stochastic?
  - One, two, or more players?
  - Perfect information (can you see the state)?
- Want algorithms for calculating a strategy (policy) which recommends a move in each state

Deterministic Games
- Many possible formalizations; one is:
  - States: S (start at s0)
  - Players: P = {1...N} (usually take turns)
  - Actions: A (may depend on player / state)
  - Transition function: S × A → S
  - Terminal test: S → {t, f}
  - Terminal utilities: S × P → R
- A solution for a player is a policy: S → A

Deterministic Single-Player?
- Deterministic, single player, perfect information:
  - Know the rules
  - Know what actions do
  - Know when you win
  - E.g. Freecell, 8-Puzzle, Rubik's cube
- … it's just search!
- Slight reinterpretation:
  - Each node stores a value: the best outcome it can reach
  - This is the maximal outcome of its children
  - Note that we don't have path sums as before (utilities are at the end)
- After the search, we can pick the move that leads to the best node
(figure: leaf outcomes win, lose, lose)

Deterministic Two-Player
- E.g. tic-tac-toe, chess, checkers
- Minimax search:
  - A state-space search tree
  - Players alternate
  - Each layer, or ply, consists of a round of moves
  - Choose the move to the position with the highest minimax value = best achievable utility against best play
- Zero-sum games:
  - One player maximizes the result
  - The other minimizes the result
(figure: max layer over min layer, leaf values 8 2 5 6)

Tic-tac-toe Game Tree

Minimax Example

Minimax Search

Minimax Properties
- Optimal against a perfect player.
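The minimax recursion from the Minimax Search slide can be sketched directly on a small game tree. Here a tree is encoded as a nested list whose integer leaves are terminal utilities; this toy encoding is an illustration, not the lecture's code:

```python
def minimax(state, maximizing):
    """Minimax value of a game tree. An int leaf is a terminal
    utility; an inner list is a node whose children belong to the
    other player (players alternate each ply)."""
    if isinstance(state, int):
        return state
    values = [minimax(child, not maximizing) for child in state]
    return max(values) if maximizing else min(values)

# The slide's example tree: a max root over two min nodes,
# so the value is max(min(8, 2), min(5, 6)) = 5.
print(minimax([[8, 2], [5, 6]], True))  # → 5
```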
- Otherwise?
- Time complexity? O(b^m)
- Space complexity? O(bm)
- For chess, b ≈ 35, m ≈ 100:
  - An exact solution is completely infeasible
  - But do we need to explore the whole tree?
(figure: max over min, leaf values 10 10 9 100)
[DEMO: minVsExp]

Resource Limits
- Cannot search to the leaves
- Depth-limited search:
  - Instead, search a limited depth of the tree
  - Replace terminal utilities with an evaluation function for non-terminal positions
- The guarantee of optimal play is gone
- More plies make a BIG difference
[DEMO: limitedDepth]
- Example:
  - Suppose we have 100 seconds and can explore 10K nodes / sec
  - So we can check 1M nodes per move
  - α-β reaches about depth 8 – a decent chess program
(figure: depth-limited tree with unknown leaves; estimated values -1 -2 4 9; min values -2 and 4)

Evaluation Functions
- A function which scores non-terminals
- Ideal function: returns the utility of the position
- In practice: typically a weighted linear sum of features:
  Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
  - e.g. f1(s) = (num white queens – num black queens), etc.

Evaluation for Pacman
[DEMO: thrashing, smart ghosts]

Iterative Deepening
- Iterative deepening uses DFS as a subroutine:
  1. Do a DFS which only searches for paths of length 1 or less. (DFS gives up on any path of length 2.)
  2. If "1" failed, do a DFS which only searches paths of length 2 or less.
  3. If "2" failed, do a DFS which only searches paths of length 3 or less.
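The iterative-deepening recipe above can be sketched as a depth-limited DFS wrapped in a loop over increasing limits; `is_goal` and `successors` are hypothetical callbacks, not names from the lecture:

```python
def depth_limited_dfs(state, is_goal, successors, limit):
    """DFS that gives up on any path longer than `limit` steps.
    Returns the path to a goal as a list of states, or None."""
    if is_goal(state):
        return [state]
    if limit == 0:
        return None
    for nxt in successors(state):
        rest = depth_limited_dfs(nxt, is_goal, successors, limit - 1)
        if rest is not None:
            return [state] + rest
    return None

def iterative_deepening(start, is_goal, successors, max_depth=50):
    # Step k of the slide's recipe: a DFS restricted to paths of
    # length k or less. Re-exploring shallow nodes is cheap, since
    # the cost of the deepest pass dominates.
    for limit in range(max_depth + 1):
        path = depth_limited_dfs(start, is_goal, successors, limit)
        if path is not None:
            return path
    return None
```

On a toy chain graph where each state n has the single successor n + 1, searching for state 3 from 0 returns the path [0, 1, 2, 3] on the pass with limit 3.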
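Tying the Resource Limits and Evaluation Functions slides together, depth-limited minimax cuts the search off at a fixed ply and scores non-terminal cutoff positions with an evaluation function. The tree encoding, weights, and features below are illustrative assumptions, not the lecture's:

```python
def evaluation(state, weights, features):
    """Weighted linear sum of features, as on the Evaluation
    Functions slide; the particular weights and feature functions
    are up to the designer (hypothetical here)."""
    return sum(w * f(state) for w, f in zip(weights, features))

def depth_limited_minimax(state, depth, maximizing, successors, evaluate):
    """Minimax cut off at `depth` plies. Positions at the cutoff are
    scored by `evaluate` instead of true terminal utilities, so the
    guarantee of optimal play is gone."""
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)
    values = [depth_limited_minimax(c, depth - 1, not maximizing,
                                    successors, evaluate)
              for c in children]
    return max(values) if maximizing else min(values)
```

On the nested-list toy tree from earlier, with internal nodes at the cutoff arbitrarily scored 0, depth 2 reaches the true leaves and recovers the full minimax value, while depth 1 returns only evaluation estimates.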