DOC PREVIEW
Duke CPS 296.2 - Minimax strategies, Nash equilibria, correlated equilibria

This preview shows page 1-2-20-21 out of 21 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 21 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Minimax strategies, Nash equilibria, correlated equilibriaZero-sum games revisitedBest-response strategiesMinimax (minmax, maxmin) strategiesComputing a minimax strategy for rock-paper-scissorsMinimax theorem [von Neumann 1927]Solving for minimax strategies using linear programmingGeneral-sum gamesNash equilibrium [Nash 50]Nash equilibria of “chicken”Nash equilibria of “chicken”…The presentation gameThe “equilibrium selection problem”Some properties of Nash equilibriaHow hard is it to compute one (any) Nash equilibrium?What if we want to compute a Nash equilibrium with a specific property?Search-based approaches (for 2 players)Correlated equilibrium [Aumann 74]A correlated equilibrium for “chicken”A nonzero-sum variant of rock-paper-scissors (Shapley’s game [Shapley 64])Solving for a correlated equilibrium using linear programming (n players!)Minimax strategies, Nash equilibria, correlated equilibriaVincent Conitzer [email protected] games revisited0, 0 -1, 1 1, -11, -1 0, 0 -1, 1-1, 1 1, -1 0, 0•Recall: in a zero-sum game, payoffs in each entry sum to zero– … or to a constant: recall that we can subtract a constant from anyone’s utility function without affecting their behavior•What the one player gains, the other player losesBest-response strategies•Suppose you know your opponent’s mixed strategy–E.g. your opponent plays rock 50% of the time and scissors 50%•What is the best strategy for you to play?•Rock gives .5*0 + .5*1 = .5•Paper gives .5*1 + .5*(-1) = 0•Scissors gives .5*(-1) + .5*0 = -.5•So the best response to this opponent strategy is to (always) play rock•There is always some pure strategy that is a best response–Suppose you have a mixed strategy that is a best response; then every one of the pure strategies that that mixed strategy places positive probability on must also be a best responseMinimax (minmax, maxmin) strategies•Let us consider 2-player zero-sum games•Suppose that your opponent can see into your head and thus knows your mixed strategy•But your opponent does not know your random bits–E.g. your opponent knows that you play rock 50% of the time and scissors 50% of the time, but not which one you will actually happen to play this time–I.e. your opponent best-responds to your mixed strategy•What is the best that you (i) can do against such a powerful opponent (-i)?•maxσi mins-i ui(σi, s-i) (= - minσi maxs-i u-i(σi, s-i)) –Here σi is a mixed strategy, s-i is a pure strategy, and utility functions are extended to mixed strategies by taking the expectation of the utility over pure strategiesComputing a minimax strategy for rock-paper-scissors•Need to set: prock, ppaper, pscissors•Utility for other player of playing rock is pscissors - ppaper•Utility for other player of playing paper is prock - pscissors•Utility for other player of playing scissors is ppaper – prock•So, we want to minimize max{pscissors - ppaper, prock - pscissors, ppaper – prock}•Minimax strategy: prock = ppaper = pscissors = 1/3Minimax theorem [von Neumann 1927]•In general, which one is bigger:–maxσi mins-i ui(σi, s-i) (-i gets to look inside i’s head), or–minσ-i maxsi ui(si, σ-i) (i gets to look inside –i’s head)?•Answer: they are always the same!!!–This quantity is called the value of the game (to player i)•Closely related to linear programming duality•Summarizing: if you can look into the other player’s head (but the other player anticipates that), you will do no better than if the roles were reversed•Only true if we allow for mixed strategies–If you know the other player’s pure strategy in rock-paper-scissors, you will always winSolving for minimax strategies using linear programming•maximize ui•subject to –for any s-i, Σsi psi ui(si, s-i) ≥ ui–Σsi psi = 1Note: linear programs can be solved in polynomial timeGeneral-sum games•You could still play a minimax strategy in general-sum games–I.e. pretend that the opponent is only trying to hurt you•But this is not rational:0, 0 3, 11, 0 2, 1•If Column was trying to hurt Row, Column would play Left, so Row should play Down• In reality, Column will play Right (strictly dominant), so Row should play Up• Is there a better generalization of minimax strategies in zero-sum games to general-sum games?Nash equilibrium [Nash 50]•A vector of strategies (one for each player) is called a strategy profile•A strategy profile (σ1, σ2 , …, σn) is a Nash equilibrium if each σi is a best response to σ-i–That is, for any i, for any σi’, ui(σi, σ-i) ≥ ui(σi’, σ-i)•Note that this does not say anything about multiple agents changing their strategies at the same time•In any (finite) game, at least one Nash equilibrium (possibly using mixed strategies) exists [Nash 50]•(Note - singular: equilibrium, plural: equilibria)Nash equilibria of “chicken”0, 0 -1, 11, -1 -5, -5DSD SSDDS•(D, S) and (S, D) are Nash equilibria–They are pure-strategy Nash equilibria: nobody randomizes–They are also strict Nash equilibria: changing your strategy will make you strictly worse off•No other pure-strategy Nash equilibriaNash equilibria of “chicken”…0, 0 -1, 11, -1 -5, -5DSD S•Is there a Nash equilibrium that uses mixed strategies? Say, where player 1 uses a mixed strategy?•Recall: if a mixed strategy is a best response, then all of the pure strategies that it randomizes over must also be best responses•So we need to make player 1 indifferent between D and S•Player 1’s utility for playing D = -pcS•Player 1’s utility for playing S = pcD - 5pcS = 1 - 6pcS•So we need -pcS = 1 - 6pcS which means pcS = 1/5•Then, player 2 needs to be indifferent as well•Mixed-strategy Nash equilibrium: ((4/5 D, 1/5 S), (4/5 D, 1/5 S))–People may die! Expected utility -1/5 for each playerThe presentation gamePay attention (A)Do not pay attention (NA)Put effort into presentation (E) Do not put effort into presentation (NE)4, 4 -16, -140, -2 0, 0PresenterAudience•Pure-strategy Nash equilibria: (A, E), (NA, NE)•Mixed-strategy Nash equilibrium: ((1/10 A, 9/10 NA), (4/5 E, 1/5 NE))–Utility 0 for audience, -14/10 for presenter–Can see that some equilibria are strictly better for both players than other equilibria, i.e. some equilibria Pareto-dominate other equilibriaThe “equilibrium selection problem”•You are about to play a game that you have never played before with a person that


View Full Document

Duke CPS 296.2 - Minimax strategies, Nash equilibria, correlated equilibria

Download Minimax strategies, Nash equilibria, correlated equilibria
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Minimax strategies, Nash equilibria, correlated equilibria and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Minimax strategies, Nash equilibria, correlated equilibria 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?