Final Exam Reminder
CS 416 Artificial Intelligence
  - Final Exam is Tuesday, May 6th at 7 p.m.
  - Let me know if you have a legitimate conflict

Lecture 23: Making Complex Decisions (Chapter 17)

Zero-sum games
  - Payoffs in each cell sum to zero
  - von Neumann (1928) developed the optimal mixed strategy for two-player zero-sum games
  - Because what one player wins the other loses, just keep track of one player's payoff in each cell (Even); assume this player wishes to maximize

Optimal strategy: Morra
  - Two players (Odd and Even)
  - Action: each player simultaneously displays one or two fingers
  - Evaluation: $f$ = total number of fingers
      - if $f$ is odd, Even gives $f$ dollars to Odd
      - if $f$ is even, Odd gives $f$ dollars to Even
  - Maximin technique: make the game a turn-taking game and analyze

Maximin (Even first)
  - Change the rules of Morra for analysis
  - Force Even to reveal strategy first
  - Apply the minimax algorithm
  - Odd has an advantage, so the outcome of this game is Even's worst case; Even might do better in the real game
  - The utility of this game to Even is $-3$

Maximin (Odd first)
  - Change the rules of Morra for analysis
  - Force Odd to reveal strategy first
  - Apply the minimax algorithm
  - Odd would always select one to minimize Odd's loss
  - Even would then select one to maximize Even's gain
  - This game favors Even
  - The utility of this game to Even is $2$

Combining two games
  - Even's combined utility: EvenFirst utility $\le$ Even's utility $\le$ OddFirst utility
  - $-3 \le$ Even's utility $\le 2$

Considering mixed strategies
  - Mixed strategy: select one finger with probability $p$, select two fingers with probability $1 - p$
  - If one player reveals strategy first, the second player will always use a pure strategy
      - expected utility of a mixed strategy: $U_1 = p \cdot u_{one} + (1 - p) \cdot u_{two}$
      - expected utility of a pure strategy: $U_2 = \max(u_{one}, u_{two})$
      - $U_2$ is always at least as large as $U_1$

Modeling as a game tree
  - Because the second player will always use a fixed strategy, the game can be drawn as a game tree

What is the outcome of this game?
  - Player Odd has a choice: always pick the option that minimizes utility to Even
  - Represent the two choices as functions of $p$
      - Odd shows one: $2p - 3(1 - p) = 5p - 3$
      - Odd shows two: $-3p + 4(1 - p) = 4 - 7p$
  - Odd picks the line that is lowest (the dark part of the figure)
  - Even
    maximizes utility by choosing $p$ where the lines cross:
        $5p - 3 = 4 - 7p$
        $p = 7/12$, so $E_{utility} = -1/12$
  - (Still pretending Even goes first)

Pretend Odd must go first
  - Even's outcome is decided by a pure strategy (dependent on $q$)
  - Even will always pick the maximum of the two choices
  - Odd will minimize the maximum of the two choices
  - Odd chooses the intersection point:
        $5q - 3 = 4 - 7q$
        $q = 7/12$, so $E_{utility} = -1/12$

Final results
  - Both players use the same mixed strategy: $p_{one} = 7/12$, $p_{two} = 5/12$
  - Outcome of the game is $-1/12$ to Even

Generalization
  - Two players with $n$ action choices
  - The mixed strategy is not as simple as $(p, 1 - p)$; it is $(p_1, p_2, \ldots, p_{n-1}, 1 - (p_1 + p_2 + \cdots + p_{n-1}))$
  - Solving for the optimal $p$ vector requires finding the optimal point in $(n-1)$-dimensional space
      - lines become hyperplanes
      - some hyperplanes will be clearly worse for all $p$
      - find the intersection among the remaining hyperplanes
      - linear programming can solve this problem

Repeated games
  - Imagine the same game played multiple times
      - payoffs accumulate for each player
      - optimal strategy is a function of game history
      - must select the optimal action for each possible game history
  - Strategies
      - perpetual punishment: cross me once and I'll take us both down forever
      - tit for tat: cross me once and I'll cross you on the subsequent move

The design of games
  - Let's invert the strategy selection process to design fair/effective games
  - Tragedy of the commons
      - individual farmers bring their livestock to the town commons to graze
      - the commons is destroyed and all experience negative utility
      - all behaved rationally: refraining would not have saved the commons, as someone else would eat it
  - Externalities are a way to place a value on changes in global utility
      - power utilities pay for the utility they deprive neighboring communities of (yet another Nobel prize in Econ for this: Coase)

Auctions: English auction
  - auctioneer incrementally raises the bid price until one bidder remains
  - bidder gets the item at the highest price of another bidder plus the increment (perhaps the highest bidder would have spent more?)
  - strategy is simple: keep bidding until the price is higher than utility
  - strategy of other bidders is irrelevant

Auctions: sealed-bid auction
  - place your bid in an envelope and the highest bid is selected
  - say your highest bid is $v$
  - say you believe the highest competing bid is $b$
  - bid $\min(v, b + \epsilon)$: just above $b$, but never above $v$
  - the player with the highest value on the good may not win the good, and players must contemplate the other players' values

Auctions: Vickrey auction
  - Winner pays the price of the next-highest bid
  - Dominant strategy is to bid what the item is worth to you

Auctions
  - These auction algorithms can find their way into computer-controlled systems
      - Networking: routers, Ethernet
      - Thermostat control in offices (Xerox PARC)

Next Topic: Statistical Learning (Chapter 20)
  - Running example: candy (urns and balls, candy bags)
  - Data and hypotheses
  - Maximum likelihood
  - Bayes learning
  - Expectation maximization
  - Hidden Markov models (HMMs)

Running example: Surprise Candy
  - Comes in two flavors: cherry (yum) and lime (yuk)
  - All candy is wrapped in the same opaque wrapper
  - Candy is packaged in large bags containing five different allocations of cherry and lime

Statistics
  - Given a bag of candy, what distribution of flavors will it have?
  - Let $H$ be the random variable corresponding to your hypothesis
  - As you open pieces of candy, let each observation of data $D_1, D_2, D_3, \ldots$ be either cherry or lime
  - Predict the flavor of the next piece of candy

Bayesian Learning
  - Use the available data to calculate the probability of each hypothesis and make a prediction
  - Because each hypothesis has an independent likelihood, we use all their relative likelihoods when making a prediction
  - Probabilistic inference using Bayes' rule ($\alpha$ is a normalizing constant):
        $P(h_i \mid d) = \alpha \, P(d \mid h_i) \, P(h_i)$
  - Prediction of an unknown quantity $X$:
        $P(X \mid d) = \sum_i P(X \mid d, h_i) \, P(h_i \mid d) = \sum_i P(X \mid h_i) \, P(h_i \mid d)$
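
The Bayesian learning rules above can be tried on the candy example with a few lines of Python. This is only a sketch under stated assumptions: the slides say there are five allocations of cherry and lime but do not list them, so the cherry fractions and priors below are the values commonly used for this textbook example and should be treated as assumed. The function names are hypothetical, and observations are assumed to be independent draws from one bag.

```python
from fractions import Fraction

# ASSUMPTION: the five bag types and their priors are not given in the notes.
# CHERRY_PROB[i] is P(cherry | h_i); PRIOR[i] is P(h_i).
CHERRY_PROB = [Fraction(1), Fraction(3, 4), Fraction(1, 2), Fraction(1, 4), Fraction(0)]
PRIOR = [Fraction(1, 10), Fraction(2, 10), Fraction(4, 10), Fraction(2, 10), Fraction(1, 10)]


def likelihood(theta, observations):
    """P(d | h_i) for independent observations, where theta = P(cherry | h_i)."""
    result = Fraction(1)
    for flavor in observations:
        result *= theta if flavor == "cherry" else 1 - theta
    return result


def posterior(observations):
    """P(h_i | d) = alpha * P(d | h_i) * P(h_i) for every hypothesis h_i."""
    unnormalized = [likelihood(t, observations) * p for t, p in zip(CHERRY_PROB, PRIOR)]
    total = sum(unnormalized)  # this is 1 / alpha
    if total == 0:
        raise ValueError("observations are impossible under every hypothesis")
    return [u / total for u in unnormalized]


def predict_cherry(observations):
    """P(next is cherry | d) = sum_i P(cherry | h_i) * P(h_i | d)."""
    return sum(t * w for t, w in zip(CHERRY_PROB, posterior(observations)))


if __name__ == "__main__":
    print(predict_cherry([]))        # prediction from the priors alone
    print(posterior(["lime"]))       # beliefs after unwrapping one lime
    print(predict_cherry(["lime"]))  # prediction after unwrapping one lime
```

With these assumed numbers, the prediction for cherry starts at 1/2 and drops to 7/20 after a single lime, because the all-cherry hypothesis is ruled out and the lime-heavy hypotheses gain weight. Exact fractions are used instead of floats so the normalization is easy to check by hand.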