Stanford MS&E 316 - Solving Bimatrix Games - D1389877

Home> Schools> Stanford University> Management Science & Engineering (MS&E) > MS&E 316> Solving Bimatrix Games

DOC PREVIEW

Stanford MS&E 316 - Solving Bimatrix Games

School name Stanford University

Course Ms&E 316- Discrete Mathematics and Algorithms

Pages 49

This preview shows page 1-2-3-23-24-25-26-47-48-49 out of 49 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 49 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Solving Bimatrix GamesThe LCP for a bimatrix gameu = −em+ Ay ≥ 0,x≥ 0,xTu =0,v = −en+ BTx ≥ 0,y≥ 0,yTv =0.The corresponding LCP (q, M ) has dataq ="−em−en#and M ="0 ABT0#where we assume that A and B are positive m × n matrices.Notice that the matrix M is copositive, but not copositive-plus. It isalso obviously not sufficient.The Lemke-Howson AlgorithmAlgorithm (Lemke-Howson).Step 0. Initialization. Input the LCP (q, M) of order m+n as above.Select an index k ∈{1,...,m}. Let s ∈ arg min1≤j≤nbkj.Pivot hvs,xki. [This yields an almost complementary, but in-feasible, solution.] Let r ∈ arg min1≤i≤mais. Pivot hur,ysi.[The solution is now almost complementary and feasible.]The basic pair is (xk,uk), and the nonbasic pair is (ur,xr).If r = k, stop. A solution has been found. Otherwise let xrbe the driving variable.Step 1. Determine the blocking variable (if any). Use the minimumratio test to determine whether there is a basic variable thatblocks the increase of the driving variable. If not, stop.Step 2. Pivoting. The driving variable is blocked. Pivoth blocking variable, driving variable i.If the blocking variable belongs to the basic pair, a solutionto (q,M ) is at hand. Otherwise return to Step 1 using thecomplement of the most recent blocking variable as the newdriving variable.There’s Good News and Bad NewsFirst, the good news:Theorem. The Lemke-Howson Algorithm finds a solution of ev-ery (nondegenerate) instance of the LCP corresponding to a bimatrixgame.Hence the Lemke-Howson algorithm can find a Nash Equilibrium Pointfor a finite two-person nonzero-sum game.The proof is rather similar in spirit to that for the Lemke Algorithm forthe LCP. But note that in the bimatrix game case no covering vectoris needed. This is because of the positivity of the (modified) payoffmatrices and the block structure of the problem.The Bad NewsThe LCP with tableau 1 z1z2z3z4w1−1 0 0 10 20w2−1 0 0 30 15w3−1 10 20 0 0w4−1 30 15 0 0corresponds to a bimatrix game Γ(A, B).z1= x1,z2= x2,z3= y1,z4= y2,w1= u1,w2= u2,w3= v1,w4= v2From the four choices of initial pivot column, we obtain two—andonly two—distinct solutions of this LCP: ¯z =(110, 0,110, 0) and ¯z =(0,115, 0,115). But there is an elusive solution: ¯z =(190,245,190,245).More Bad News: The Famous “Prisoner’s Dilemma” ProblemThis problem originated with Melvin Dresher and Merrill Flood (at theRAND Corportation).The “Prisoner’s Dilemma” is a particular type of bimatrix game prob-lem that has been used to illustrate the shortcomings of the Nashequilibrium concept.The name for the problem and the accompanying story are due toA.W. Tucker. In 1950, he used this in a talk before an audience ofpsychologists here at Stanford.Story goes something like this: Two criminals are caught and heldin separate cells, unable to communicate with each other. They areeach offered the same deal, and each knows that the other is offeredthis deal. They can each confess and be a witness against the otherperson, or they can each deny their guilt. If they both confess, theywill each get 5 years in jail. If one confesses and the other denies guilt,then the one who confessed will get 1 year in jail and the other willget 6 years in jail. If they both deny guilt, they will each get 2 yearsin jail.What should they do?Here we have a case whereBT= A =5162Confess DenyConfess 5 1Deny 6 2BT= A =5162Confess DenyConfess 5 1Deny 6 2It can be shown that there is one and only one Nash Equilibrium pointfor this game: Both criminals confess.BT= A =5162Confess DenyConfess 5 1Deny 6 2It can be shown that there is one and only one Nash Equilibrium pointfor this game: Both criminals confess.(Unless, of course, they are both game theorists who trust each other.)Notice that for each criminal, the strategy Deny is dominated by thestrategy Confess.If both criminals use the other strategy, Deny, each will get a lowersentence.Keller’s (Symmetric) Quadratic Programming AlgorithmWe shall discuss the quadratic programming problem (QP)minimize f(x)=cTx +12xTQxsubject to Ax ≥ bx ≥ 0Assume A ∈ Rm×nand Q = QT∈ Rn×n(otherwise general).The KKT conditions give the LCPu = c + Qx − ATy ≥ 0v = −b + Ax ≥ 0x ≥ 0y ≥ 0uTx =0vTy =0Observations1. The above system is an instance of an LCP; finding a solution ofit is the aim of the algorithm to be discussed here.2. If x is a local minimum of (QP), there exist a y such that (x, y)satisfies the conditions of (LCP). In the case where Q ∈ PSD,is the conditions of (LCP) are also sufficient for x to be a globalminimum for (QP).3. The first two equations define the variables u and v.4. There is a “pairing” b etween ujand xj(j =1,...,n) and a“pairing” between viand yi(i =1,...,m). They are calledcomplementary pairs and each is said to be the complement ofthe other.5. The first two equations of (LCP) imply that2f(x)=cTx + bTy + uTx + vTy.Tabular formLetting θ =2f(x) − uTx − vTy, we can represent the problem at handin a tabular (tableau) form as follows:1 xyθ 0 cTbTu c Q −ATv −b A 0Notice how the (basic) variables u and v are written on the left-handside and the constants and nonbasic variables are written on the right-hand side. This makes it easy to understand the effect that the varia-tion of a nonbasic variable has on the basic variables.Terminal sign configurationsIn a tableau such as the one above, the column headed by “1”iscalled the constant column. It would be the “right-hand side” if allthe (nonbasic) variables were written on the left. Now, if the constantcolumn (c, −b) is nonnegative, then (x, y)=(0, 0) solves (LCP), andwe can terminate the process.Pre-processingWe can carry out a Phase I procedure on the constraints of (QP) soas to transform −b into a nonnegative vector and then modify the restof the data accordingly.Hereafter we assume that the linear inequalities Ax ≥ b, x ≥ 0 havebeen pre-processed so that −b ≥ 0. If this cannot be done, then (QP)is infeasible.Accordingly, we consider the case where cj< 0 for some index j.Now, if qjj≤ 0 and aij≥ 0 for all i = i,...,m then by lettingxj−→ +∞ we generate a halfline of feasible solutions over which theobjective function goes to −∞. In this case, we may also terminatethe process.The aim of the algorithm we are going to study here is to achieve anequivalent pivotal transformation of the tableau such that one or theother of these terminal sign configurations is

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3-23-24-25-26-47-48-49 out of 49 pages.

Stanford MS&E 316 - Solving Bimatrix Games

Sign up for free to view:

Please select your school