UMD CMSC 421 - Game Theory - D773155

Home> Schools> University of Maryland, College Park> Computer Science (CMSC) > CMSC 421> Game Theory

DOC PREVIEW

UMD CMSC 421 - Game Theory

School name University of Maryland, College Park

Course Cmsc 421- Introduction to Artificial Intelligence

Pages 51

This preview shows page 1-2-3-24-25-26-27-49-50-51 out of 51 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 51 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Nau: Game Theory 1 Game Theory CMSC 421, Section 17.6Nau: Game Theory 2 Introduction  In Chapter 6 we looked at 2-player perfect-information zero-sum games  We’ll now look at games that might have one or more of the following:  > 2 players  imperfect information  nonzero-sum outcomesNau: Game Theory 3 The Prisoner’s Dilemma  Scenario: the police have arrested two suspects for a crime.  They tell each prisoner they’ll reduce his/her prison sentence if he/she betrays the other prisoner.  Each prisoner must choose between two actions:  cooperate with the other prisoner, i.e., don’t betray him/her  defect (betray the other prisoner).  Payoff = – (years in prison):  Each player has only two strategies, each of which is a single action  Non-zero-sum  Imperfect information: neither player knows the other’s move until after both players have moved Agent 2 Agent 1 C D C –2, –2 –5, 0 D 0, –5 –4, –4 Prisoner’s DilemmaNau: Game Theory 4 The Prisoner’s Dilemma  Add 5 to each payoff, so that the numbers are all ≥ 0  These payoffs encode the same preferences  Note: the book represents payoff matrices in a non-standard way  It puts Agent 1 where I have Agent 2, and vice versa Prisoner’s Dilemma: Agent 2 Agent 1 C D C 3, 3 0, 5 D 5, 0 1, 1 Prisoner’s Dilemma: Agent 2 Agent 1 C D C –2, –2 –5, 0 D 0, –5 –4, –4Nau: Game Theory 5 How to reason about games?  In single-agent decision theory, look at an optimal strategy  Maximize the agent’s expected payoff in its environment  With multiple agents, the best strategy depends on others’ choices  Deal with this by identifying certain subsets of outcomes called solution concepts  Some solution concepts:  Dominant strategy equilibrium  Pareto optimality  Nash equilibriumNau: Game Theory 6 Strategies  Suppose the agents agent 1, agent 2, …, agent n  For each i, let Si = {all possible strategies for agent i}  si will always refer to a strategy in Si  A strategy profile is an n-tuple S = (s1, …, sn), one strategy for each agent  Utility Ui(S) = payoff for agent i if the strategy profile is S  si strongly dominates si' if agent i always does better with si than si'  si weakly dominates si' if agent i never does worse with si than si', and there is at least one case where agent i does better with si than si',Nau: Game Theory 7 Dominant Strategy Equilibrium  si is a (strongly, weakly) dominant strategy if it (strongly, weakly) dominates every si' ∈ Si  Dominant strategy equilibrium:  A set of strategies (s1, …, sn) such that each si is dominant for agent i  Thus agent i will do best by using si rather than a different strategy, regardless of what strategies the other players use  In the prisoner’s dilemma, there is one dominant strategy equilibrium: both players defect Prisoner’s Dilemma: Agent 2 Agent 1 C D C 3, 3 0, 5 D 5, 0 1, 1Nau: Game Theory 8 Pareto Optimality  Strategy profile S Pareto dominates a strategy profile Sʹ′ if  no agent gets a worse payoff with S than with Sʹ′, i.e., Ui(S) ≥ Ui(Sʹ′) for all i ,  at least one agent gets a better payoff with S than with Sʹ′, i.e., Ui(S) > Ui(Sʹ′) for at least one i  Strategy profile s is Pareto optimal, or strictly Pareto efficient, if there’s no strategy s' that Pareto dominates s  Every game has at least one Pareto optimal profile  Always at least one Pareto optimal profile in which the strategies are pureNau: Game Theory 9 Example The Prisoner’s Dilemma  (C,C) is Pareto optimal  No profile gives both players a higher payoff  (D,C) is Pareto optimal  No profile gives player 1 a higher payoff  (D,C) is Pareto optimal - same argument  (D,D) is Pareto dominated by (C,C)  But ironically, (D,D) is the dominant strategy equilibrium Agent 2 Agent 1 C D C 3, 3 0, 5 D 5, 0 1, 1 Prisoner’s DilemmaNau: Game Theory 10 Pure and Mixed Strategies  Pure strategy: select a single action and play it  Each row or column of a payoff matrix represents both an action and a pure strategy  Mixed strategy: randomize over the set of available actions according to some probability distribution  Let Ai = {all possible actions for agent i}, and ai be any action in Ai  si (aj ) = probability that action aj will be played under mixed strategy si  The support of si is  support(si) = {actions in Ai that have probability > 0 under si}  A pure strategy is a special case of a mixed strategy  support consists of a single action  Fully mixed strategy: every action has probability > 0  i.e., support(si) = AiNau: Game Theory 11 Expected Utility  A payoff matrix only gives payoffs for pure-strategy profiles  Generalization to mixed strategies uses expected utility  Let S = (s1, …, sn) be a profile of mixed strategies  For every action profile (a1, a2, …, an), multiply its probability and its utility • Ui (a1, …, an) s1(a1) s2(a2) … sn(an)  The expected utility for agent i is € Uis1,…, sn( )= Uia1,…,an( )(a1,…,an) ∈A∑s1a1( )s2a2( )… snan( )Nau: Game Theory 12 Best Response  Some notation:  If S = (s1, …, sn) is a strategy profile, then S−i = (s1, …, si−1, si+1, …, sn), • i.e., S–i is strategy profile S without agent i’s strategy  If si' is any strategy for agent i, then • (si' , S−i ) = (s1, …, si−1, si', si+1, …, sn)  Hence (si , S−i ) = S  si is a best response to S−i if Ui (si , S−i ) ≥ Ui (si', S−i ) for every strategy si' available to agent i  si is a unique best response to S−i if Ui (si , S−i ) > Ui (si', S−i ) for every si' ≠ siNau: Game Theory 13  A strategy profile s = (s1, …, sn) is a Nash equilibrium if for every i,  si is a best response to S−i , i.e., no agent can do better by unilaterally changing his/her strategy  Theorem (Nash, 1951): Every game with a finite number of agents and action profiles has at least one Nash equilibrium  In the Prisoner’s Dilemma, (D,D) is a Nash equilibrium  If either agent unilaterally switches to a different

View Full Document