Unformatted text preview:

MS&E246:Lecture10RepeatedgamesRamesh JohariWhatisarepeatedgame?A repeated game is:A dynamic game constructed by playingthe same game over and over.It is a dynamic game of imperfect information.Thislecture• Finitely repeated games• Infinitely repeated games• Trigger strategies• The folk theoremStagegameAt each stage, the same game is played:the stage game G.Assume:• G is a simultaneous move game•In G, player i has:•Action set Ai•Payoff Pi(ai, a-i)FinitelyrepeatedgamesG(K) : G is repeated K timesInformation sets:All players observe outcome of each stage.What are:strategies? payoffs? equilibria?HistoryandstrategiesPeriod t history ht: ht= (a(0), …, a(t-1)) wherea(τ) = action profile played at stage τStrategy si: Choice of stage t action si(ht) ∈ Aifor each history ht i.e. ai(t) = si(ht)PayoffsAssume payoff = sum of stage game payoffsExample:Prisoner’sdilemmaRecall the Prisoner’s dilemma:(2,2)(0,4)cooperate(4,0)(1,1)defectcooperatedefectPlayer 1Player 2Example:Prisoner’sdilemmaTwo volunteersFive roundsNo communication allowed!511111Player 2511111Player 1Total54321RoundSPNESuppose aNEis a stage game NE.Any such NE gives a SPNE:Player i plays aiNEat every stage, regardless of history.Question: Are there any other SPNE?SPNEHow do we find SPNE of G(K)?Observe:Subgame starting after history htis identical to G(K - t)SPNE:UniquestagegameNESuppose G has a unique NE aNEThen regardless of period K history hK,last stage has unique NE aNE⇒ At SPNE, si(hK) = aiNESPNE:BackwardinductionAt stage K -1, given s-i(·), player i chooses si(hK -1) to maximize:Pi(si(hK -1), s-i(hK -1)) + Pi(s(hK))payoff at stage K -1 payoff at stage KSPNE:BackwardinductionAt stage K -1, given s-i(·), player i chooses si(hK -1) to maximize:Pi(si(hK -1), s-i(hK -1)) + Pi(aNE)payoff at stage K -1 payoff at stage KWe know: at last stage, aNEis played.SPNE:BackwardinductionAt stage K -1, given s-i(·), player i chooses si(hK -1) to maximize:Pi(si(hK -1), s-i(hK -1))payoff at stage K -1⇒ Stage game NE again!SPNE:ConclusionTheorem:If stage game has unique NE aNE,then finitely repeated game hasunique SPNE:si(ht) = aiNEfor all htExample:Prisoner’sdilemmaMoral: “Cooperate” should never be played.Axelrod’s tournament (1980):Winning strategy was “tit for tat”:Cooperate if and only ifyour opponent did so at the last stageSPNE:MultiplestagegameNENote: If multiple NE exist for stage game NE,there may exist SPNE where actions are played that appear inno stage game NE(See Gibbons, 2.3.A)Infinitelyrepeatedgames• History, strategy definitions same as finitely repeated games•Payoffs:Sum might not be finite!DiscountingDefine payoff as:i.e., discounted sum of stage game payoffsThis game is denoted G(δ, ∞)(Note: (1 - δ) is a normalization)DiscountingTwo interpretations:1. Future payoffs worth lessthan today’s payoffs2. Total # of stages is ageometric random variableFolktheorems• Major problem with infinitely repeated games:If players are patient enough,SPNE can achieve “any” reasonable payoffs.Prisoner’sdilemmaConsider the following strategies, (s1, s2):1. Play C at first stage.2. If ht= ( (C,C), …, (C,C) ), then play C at stage t.Otherwise play D.i.e., punish the other player for defectingPrisoner’sdilemmaNote: G(δ, ∞) is stationaryCase 1: Consider any subgame where at least one player has defected in ht.Then (D,D) played forever.This is NE for subgame, since (D,D) is stage game NE.Prisoner’sdilemmaStep 2: Suppose ht= ( (C,C), …, (C,C) ).Player 1’s options:(a) Follow s1⇒ play C forever(b) Deviate at time t ⇒ play D foreverPrisoner’sdilemmaGiven s2:Playing C forever gives payoff:(1- δ) ( P1(C,C) + δ P1(C,C) + … ) = P1(C,C)Playing D forever gives payoff:(1- δ) ( P1(D,C) + δ P1(D,D) + … )= (1-δ) P1(D,C) + δ P1(D,D)Prisoner’sdilemmaSo cooperate if and only if:P1(C,C) ≥ (1 - δ) P1(D,C) + δ P1(D,D)Note: if P1(C,C) > P1(D,D),then this is always true for δ close to 1Conclude:If δ close to 1, then (s1, s2) is an SPNEPrisoner’sdilemmaIn our game: Need 2 ≥ (1 - δ) 4 + δ ⇒ δ ≥ 2/3So cooperation can be sustained if time horizon is finite but uncertain.TriggerstrategiesIn a (Nash) trigger strategy for player i :1. Play aiat first stage.2. If ht= ( a, …, a ), then play aiat stage t.Otherwise play aiNE.TriggerstrategiesIf a Pareto dominates aNE, trigger strategies will be an SPNEfor large enough δFormally: needPi(a) > (1 - δ) Pi(ai’, a-i) + δ Pi(aNE)for all players i and actions ai’.AchievablepayoffsAchievable payoffs:T = Convex hull of { ( P1(a), P2(a) ) : ai∈ Si}e.g., in Prisoner’s Dilemma:(0,4)(4,0)(2,2)(1,1)P1P2AchievablepayoffsandSPNEA key result in repeated games:Any “reasonable” achievable payoff can be realized in an SPNE of the repeated game,if players are patient enough.Simple proof: generalize prisoner’s dilemma.Randomization• To generalize, suppose before stage tall players observe i.i.d. uniform r.v. Ut• History:ht= (a(0), …, a(t-1), U0, …, Ut)•Players can use Utto coordinatestrategies at stage tRandomizationE.g., suppose players want to achieveP = α P(a) + (1 - α) P(a’)If Ut≤ α : Player i plays aiIf Ut> α : Player i plays ai’We’ll call this the P-achieving action for i.(Uniquely defined for all P ∈ T.)RandomizationE.g., Prisoner’s DilemmaLet P = (3,1).P‐achieving actions:Player 1 plays C if Ut≤ ½and D if Ut> ½Player 2 plays C if Ut≤ ½and C if Ut> ½(0,4)(4,0)(2,2)(1,1)P1P2(3,1)RandomizationandtriggeringSo now suppose P ∈ T and:Pi> Pi(aNE) for all iTrigger strategy: Punish forever (by playing aiNE) if opponent deviates from P-achieving actionRandomizationandtriggeringBoth players using this trigger strategyis again an SPNE for large enough δ.Formally: need(1 - δ) Pi(pi, p-i) + δ Pi> (1 - δ) Pi(ai’, p-i) + δ Pi(aNE)for all players i and actions ai’.(Here p is P-achieving action for player i, andp-iis P-achieving action vector for all other players.)RandomizationandtriggeringBoth players using this trigger strategyis again an SPNE for large enough δ.Formally: need(1 - δ) Pi(pi, p-i) + δ Pi> (1 - δ) Pi(ai’, p-i) + δ Pi(aNE)for all players i and actions ai’.(At time t:LHS is payoff if player i does not deviate after seeing Ut; RHS is payoff if player i deviates to


View Full Document

Stanford MS&E 246 - Repeated games

Download Repeated games
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Repeated games and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Repeated games 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?