CS 188: Artificial Intelligence
Fall 2009
Lecture 27: Conclusion
12/3/2009
Dan Klein – UC Berkeley

AI Applications

Pacman Contest
Challenges:
- Long-term strategy
- Multiple agents
- Adversarial utilities
- Uncertainty about other agents' positions, plans, etc.

Pacman Contest
- 60 submissions (over half the class!)
- Creative names: BalsamicVinegarOfJustice, BigUtilusMaximusAgents, ...
- Creative methods: tracking, learning, search, ...
- 45 qualifiers (a third of the class!)
- Amazing work by everyone!

Final Tournament
- Double elimination, seeded by round-robin
- Required 15 CPUs for almost a day
- Final matches: now!

For Third Place

Seed 5: NarwhalAgents (Kevin Brackbill and Tyler Latzke)
Probabilistic tracking. Directed expectimax search from an imputed game state using a linear feature function. MST for search, dead-end modeling. Really good at beating agents ranked ahead of them.

Seed 4: Ducks (Nils Reimers)
Probabilistic tracking. One offense agent that uses minimax in tactical situations. One defense agent that waits in the center of the board and predicts enemy crossing points.

For First / Second Place

Seed 4: Ducks (Nils Reimers), described above.

Seed 2: BallerAgents (Nick Fraioli and Haotian Bai)
Blitz! One takes the top, one takes the bottom. Ignores ghosts they can't see, goes after enemies if convenient. Sets food targets with search. Who needs tracking?

But... Wait!

Seed 1: Chris Berner and Zelam Ngo
Central planner moves agents between offense reflex and defense reflex based on how the game is going. Probabilistic tracking with a learned transition model for enemies (plan recognition); uses dots in addition to sonar.

Qualifying Agents
Balanced offense / defense agents with basic tracking and avoidance.

Seed 3: Andrea Goh (Skynet)
Central planner that switches agent roles, assigns target dots, etc.
Computes when going into dead ends is guaranteed to be safe. Scary AI from the future.

Results

Double Elimination
- 1st place: Nils Reimers
- 2nd place: Nick / Haotian
- 3rd place: Kevin / Tyler

Round Robin
- 1st place: Chris Berner / Zelam Ngo
- 2nd place: Nick / Haotian
- 3rd place: Andrea Goh

Combined Results
- 1st place: Nils, Chris / Zelam
- 2nd place: Nick / Haotian
- 3rd place: Kevin / Tyler, Andrea
... and congratulations to all!

...and Congratulations to All!
- Amazing work by everyone
- Record number of entries (60 teams)
- Record number of qualifications (45!)
- Lots of mutual support on newsgroup / office hours...
- You should all be proud of what you've accomplished!

Example: Stratagus
[DEMO]
- Stratagus (similar to Starcraft, etc.): example of a large RL task
- Stratagus is hard for reinforcement learning algorithms
  - > 10^100 states
  - > 10^30 actions at each point
  - Time horizon ≈ 10^4 steps
- Stratagus is hard for human programmers
  - Typically takes several person-months for game companies to write a computer opponent
  - Still no match for experienced human players, and very fragile
  - Programming involves much trial and error

Hierarchical RL
- Humans break up the task into a multi-level sketch using a partial program
- The learning algorithm fills in the details

Stratagus [From Bhaskara Marthi's thesis, Berkeley]

    (defun top ()
      (loop (choose (gather-wood)
                    (gather-gold))))

    (defun gather-wood ()
      (with-choice (dest *forest-list*)
        (nav dest)
        (action 'get-wood)
        (nav *base-loc*)
        (action 'dropoff)))

    (defun gather-gold ()
      (with-choice (dest *goldmine-list*)
        (nav dest)
        (action 'get-gold)
        (nav *base-loc*)
        (action 'dropoff)))

    (defun nav (dest)
      (until (= (pos (get-state)) dest)
        (with-choice (move '(N S E W NOOP))
          (action move))))

Solution: Hierarchical Learning

Hierarchical RL
- Solution: hierarchical planning and learning
- Define a hierarchical MDP
- Each level has Q-functions (one per choice)
- Learning happens at all levels, all at once

State of the art
- Still not very good at the strategic elements (high-level strategy)
- Very good at balancing resources (mid-level
allocation)
- Excellent at the lowest levels of control
[DEMO]

Pacman: Beyond Simulation?
Students at Colorado University: http://pacman.elstonj.com
[DEMO]

Bugman? AI = Animal Intelligence?
- Wim van Eck at Leiden University
- Pacman controlled by a human
- Ghosts controlled by crickets
- Vibrations drive crickets toward or away from Pacman's location
http://pong.hku.nl/~wim/bugman.htm
[DEMO]

Where to go next?
- Congratulations, you've seen the basics of modern AI
- ... and done some amazing work putting it to use!
- How to continue:
  - Robotics / vision / IR / language: cs189
  - Machine learning: cs281a / cs281b
  - Cognitive modeling: cog sci 131
  - Vision: cs280
  - Robotics: cs287
  - NLP: cs288
  - Starcraft competition
  - ... and more; ask if you're interested

That's It!
- Help us out with some course evaluations
- Have a good break, and always maximize your expected utility!
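The "probabilistic tracking" that runs through nearly every contest writeup above is, at its core, an HMM forward update over the opponent's possible board positions: push the belief through a motion model, then reweight it by the sonar reading. The sketch below is a toy illustration under assumed models (a stay-put transition and an exact-distance sensor in the test), not any team's actual code.

```python
# Sketch: probabilistic tracking of a hidden opponent from noisy distance
# readings, as an HMM forward update over board positions. The motion and
# sensor models are toy assumptions for illustration.

def normalize(belief):
    """Rescale a {position: weight} dict into a probability distribution."""
    total = sum(belief.values())
    return {pos: p / total for pos, p in belief.items()}

def elapse_time(belief, transition):
    """Push the belief through the opponent's (assumed) motion model.
    transition(pos) returns {next_pos: prob}; here next_pos is assumed
    to stay within the positions already in the belief."""
    new_belief = {pos: 0.0 for pos in belief}
    for pos, p in belief.items():
        for nxt, t in transition(pos).items():
            new_belief[nxt] += p * t
    return new_belief

def observe(belief, reading, sensor):
    """Reweight each position by how likely it makes the sonar reading."""
    return normalize({pos: p * sensor(reading, pos)
                      for pos, p in belief.items()})
```

With an exact sensor the belief collapses onto the true position in one observation; with a noisy one it merely sharpens, which is why agents combined tracking with search rather than trusting a single estimate.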
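The NarwhalAgents recipe — directed expectimax search from an imputed game state using a linear feature function — can be sketched as depth-limited expectimax whose leaves are scored by a weighted feature sum. The `ToyState` game, the feature function, and the weights below are hypothetical stand-ins for the contest framework, not the team's implementation.

```python
# Sketch: depth-limited expectimax with a linear evaluation function.
# ToyState, the features, and the weights are illustrative assumptions.

class ToyState:
    """Tiny two-agent game: agents alternately add +1 or -1 to a counter."""
    def __init__(self, counter=0, moves_left=2):
        self.counter = counter
        self.moves_left = moves_left
        self.num_agents = 2

    def is_terminal(self):
        return self.moves_left == 0

    def legal_actions(self, agent):
        return [+1, -1]

    def successor(self, agent, action):
        return ToyState(self.counter + action, self.moves_left - 1)

def linear_eval(state, weights, feature_fn):
    """Score a state as a weighted sum of hand-designed features."""
    return sum(weights.get(name, 0.0) * value
               for name, value in feature_fn(state).items())

def expectimax(state, depth, agent, weights, feature_fn):
    """Agent 0 maximizes; opponents are modeled as uniformly random.
    Depth counts full rounds of all agents moving once."""
    if depth == 0 or state.is_terminal():
        return linear_eval(state, weights, feature_fn)
    actions = state.legal_actions(agent)
    next_agent = (agent + 1) % state.num_agents
    next_depth = depth - 1 if next_agent == 0 else depth
    values = [expectimax(state.successor(agent, a), next_depth,
                         next_agent, weights, feature_fn)
              for a in actions]
    if agent == 0:
        return max(values)                    # our move: maximize
    return sum(values) / len(values)          # opponent: expected value
```

"Imputed" game state means the hidden opponent positions are filled in from the tracker's belief before search begins; expectimax then handles the remaining uncertainty by averaging over opponent moves instead of assuming worst-case play as minimax would.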
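The hierarchical RL setup described for Stratagus — a partial program whose choice points each carry their own Q-function, learned all at once — can be sketched as Q-learning keyed on (choice point, state, option). The class, names, and the simplified SMDP-style update below are illustrative assumptions, not Marthi's actual system.

```python
# Sketch: learning at the choice points of a partial program (hierarchical
# RL in the style of ALisp). All names here are hypothetical illustrations.

import random
from collections import defaultdict

class ChoicePointLearner:
    """One Q-value per (choice point, state, option); the partial program
    decides which options are even available at each point."""

    def __init__(self, alpha=0.5, epsilon=0.1):
        self.q = defaultdict(float)   # (point, state, option) -> value
        self.alpha = alpha            # learning rate
        self.epsilon = epsilon        # exploration rate

    def choose(self, point, state, options):
        """Epsilon-greedy selection among the options the program allows."""
        if random.random() < self.epsilon:
            return random.choice(options)
        return max(options, key=lambda o: self.q[(point, state, o)])

    def update(self, point, state, option, total_reward, next_value=0.0):
        """Simplified SMDP-style update: total_reward is the reward
        accumulated while the chosen option ran to completion, and
        next_value bootstraps from the next choice point reached."""
        key = (point, state, option)
        target = total_reward + next_value
        self.q[key] += self.alpha * (target - self.q[key])
```

Because each level's choice points learn simultaneously, `top` can learn when to prefer `gather-gold` over `gather-wood` while `nav` is still learning low-level movement, which matches the slide's point that learning happens at all levels at once.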