Last update: April 15, 2010

Rational decisions
CMSC 421: Chapter 16

Outline
♦ Rational preferences
♦ Utilities
♦ Money
♦ Multiattribute utilities
♦ Decision networks
♦ Value of information

Preferences
An agent chooses among prizes (A, B, etc.) and lotteries, i.e., situations with uncertain prizes.
Lottery \(L = [p, A;\ (1 - p), B]\)
[Figure: lottery L branching to prize A with probability p and to prize B with probability 1 − p]
Notation:
\(A \succ B\): A preferred to B
\(A \sim B\): indifference between A and B
\(A \succsim B\): B not preferred to A

Rational preferences
Idea: preferences of a rational agent must obey constraints.
Rational preferences ⇒ behavior describable as maximization of expected utility
Constraints:
Orderability: \((A \succ B) \lor (B \succ A) \lor (A \sim B)\)
Transitivity: \((A \succ B) \land (B \succ C) \Rightarrow (A \succ C)\)
Continuity: \(A \succ B \succ C \Rightarrow \exists p\ [p, A;\ 1 - p, C] \sim B\)
Substitutability: \(A \sim B \Rightarrow [p, A;\ 1 - p, C] \sim [p, B;\ 1 - p, C]\)
Monotonicity: \(A \succ B \Rightarrow (p \ge q \Leftrightarrow [p, A;\ 1 - p, B] \succsim [q, A;\ 1 - q, B])\)

Rational preferences contd.
What happens if an agent's preferences violate the constraints?
It leads to self-evident irrationality.
Example: intransitive preferences
If \(B \succ C\), then an agent who has C would trade C plus some money to get B.
If \(A \succ B\), then an agent who has B would trade B plus some money to get A.
If \(C \succ A\), then an agent who has A would trade A plus some money to get C.
[Figure: cycle of trades among A, B, and C, each trade costing 1c]
An agent with intransitive preferences can be induced to give away all its money.

Maximizing expected utility
Theorem (Ramsey, 1931; von Neumann and Morgenstern, 1944):
Given preferences satisfying the constraints, there exists a real-valued function U such that
\(U(A) \ge U(B) \Leftrightarrow A \succsim B\)
\(U([p_1, S_1; \ldots; p_n, S_n]) = \sum_i p_i U(S_i)\)
MEU principle: Choose the action that maximizes the expected utility.
Note: an agent can maximize the expected utility without ever representing or manipulating utilities and probabilities.
E.g., a lookup table to play tic-tac-toe perfectly

Human utilities
Utilities map states to real numbers. Which numbers?
Standard approach to assessing human utilities:
Compare a given state A to a standard lottery \(L_p\) that has
• "best possible prize" \(u_{\max}\) with probability p
• "worst possible catastrophe" \(u_{\min}\) with probability (1 − p)
Adjust lottery probability p until \(A \sim L_p\)
How much would you pay to avoid a 1/1,000,000 chance of death?
Judging from people's actions, they will pay about $20 to avoid a 1/1,000,000 chance of death.
[Figure: "pay $30" ∼ lottery L with probability 0.999999 of "continue as before" and probability 0.000001 of "instant death"]
One micromort
≈ P(accidental death in 370 km of car travel)
≈ P(accidental death in 9700 km of train travel)

Utility scales
Note: behavior is invariant w.r.t. positive linear transformation
Let \(U'(x) = k_1 U(x) + k_2\) where \(k_1 > 0\)
Then U' models the same preferences that U does.
Normalized utilities: define U' such that \(0 \le U'(x) \le 1\) for all x

The utility of money
For each amount x, adjust p until half the class votes for each option:
Option 1: you win $x.
Option 2: lottery L, in which you win $10,000 with probability p and win nothing with probability 1 − p.
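The indifference point in this classroom experiment follows directly from the expected-utility formula: Option 1 and Option 2 are equally good when U(x) = p U(10,000) + (1 − p) U(0). The sketch below solves that equation for p. The square-root utility and the constants k1 = 3, k2 = 7 are assumptions chosen only for illustration; they do not come from the slides. The sketch also checks that a rescaled utility U' = k1 U + k2 gives the same p, as the utility-scales slide states.

```python
import math
from typing import Callable, Sequence, Tuple

# A lottery is a sequence of (probability, monetary prize) pairs.
Lottery = Sequence[Tuple[float, float]]
Utility = Callable[[float], float]


def expected_utility(lottery: Lottery, utility: Utility) -> float:
    """Return sum_i p_i * U(S_i) for a lottery over monetary prizes."""
    total_p = sum(p for p, _ in lottery)
    if not math.isclose(total_p, 1.0):
        raise ValueError("lottery probabilities must sum to 1")
    return sum(p * utility(prize) for p, prize in lottery)


def indifference_probability(x: float, utility: Utility,
                             best: float = 10_000.0,
                             worst: float = 0.0) -> float:
    """Solve U(x) = p * U(best) + (1 - p) * U(worst) for p."""
    u_best, u_worst = utility(best), utility(worst)
    if u_best == u_worst:
        raise ValueError("best and worst prizes must differ in utility")
    return (utility(x) - u_worst) / (u_best - u_worst)


def sqrt_utility(x: float) -> float:
    # Hypothetical concave (risk-averse) utility; an assumption, not slide data.
    return math.sqrt(x)


def rescaled_utility(x: float) -> float:
    # U'(x) = k1 * U(x) + k2 with assumed k1 = 3 > 0 and k2 = 7.
    return 3.0 * sqrt_utility(x) + 7.0


p = indifference_probability(2_500.0, sqrt_utility)
p_rescaled = indifference_probability(2_500.0, rescaled_utility)
lottery = [(p, 10_000.0), (1.0 - p, 0.0)]

print("indifference p:", p)
print("same p after rescaling:", math.isclose(p, p_rescaled))
print("EU of lottery vs U(x):", expected_utility(lottery, sqrt_utility),
      sqrt_utility(2_500.0))
```

Under the assumed square-root utility, a sure $2,500 is matched by a lottery whose expected monetary value is higher than $2,500, which is the risk-averse pattern the plot is meant to reveal.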
[Figure: plot of p (vertical axis, 0 to 1.0) against x (horizontal axis, 0 to 10,000)]

What the book says
Money does not behave as a utility function.
Given a lottery L with expected monetary value EMV(L), usually \(U(L) < U(EMV(L))\), i.e., people are risk-averse.
Utility curve: for what probability p am I indifferent between a prize x and a lottery [p, $M; (1 − p), $0] for large M?
Typical empirical data, extrapolated with risk-prone behavior:
[Figure: utility U plotted against money ($), with axis marks at −150,000 and 800,000]

Decision networks
Add action nodes and utility nodes to causal networks to enable rational decision making.
[Figure: decision network with action node Airport Site; nodes Air Traffic, Litigation, Construction, Deaths, Noise, Cost; utility node U]
Algorithm:
For every possible value of the action node
compute E(utility node | action, evidence)
Return MEU action

Multiattribute utility
How can we handle utility functions of many variables \(X_1, \ldots, X_n\)?
E.g., what is U(Deaths, Noise, Cost)?
How can complex utility functions be assessed from preference behavior?
Idea 1: identify conditions (e.g., dominance) under which decisions can be made without complete identification of \(U(x_1, \ldots, x_n)\)
Idea 2: identify various types of independence in preferences and derive consequent canonical forms for \(U(x_1, \ldots, x_n)\)

Strict dominance
Typically define attributes such that U is monotonic in each attribute.
Strict dominance: choice B strictly dominates choice A iff
\(\forall i\ X_i(B) \ge X_i(A)\) (and hence \(U(B) \ge U(A)\))
[Figure: two panels with axes \(X_1\) and \(X_2\). Panel "Deterministic attributes": points A, B, C, D, with the region that dominates A marked. Panel "Uncertain attributes": A, B, C.]
Strict dominance seldom holds in practice.

Stochastic dominance
[Figure: two plots of Probability against Negative cost (−6 to −2), each with curves for S1 and S2]
Choices \(S_1\) and \(S_2\) with continuous distributions \(p_1\) and \(p_2\)
\(S_1\) stochastically dominates \(S_2\) iff \(\forall t\ P(S_1 \le t) \le P(S_2 \le t)\), i.e.,
\[ \forall t \quad \int_{-\infty}^{t} p_1(x)\,dx \le \int_{-\infty}^{t} p_2(x)\,dx \]
If \(S_1\) stochastically dominates \(S_2\) and U is monotonic (nondecreasing) in x, then
\[ EU(S_1) = \int_{-\infty}^{\infty} p_1(x)\,U(x)\,dx \ge \int_{-\infty}^{\infty} p_2(x)\,U(x)\,dx = EU(S_2) \]
If \(p_1\), \(p_2\) are discrete, use sums instead of integrals.
Multiattribute case: stochastic dominance on all attributes ⇒ optimal

Stochastic dominance contd.
Stochastic dominance can often be determined without exact distributions using qualitative reasoning.
E.g., construction cost increases with distance from city
\(S_1\) is closer to the city than \(S_2\) ⇒ \(S_1\) stochastically dominates \(S_2\) on cost
E.g., injury increases with collision speed
Can annotate belief networks with stochastic dominance information:
\(X \xrightarrow{+} Y\) (X positively
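For discrete distributions, the stochastic dominance test above reduces to comparing cumulative sums. The following is a minimal sketch under assumed data: the two distributions over negative cost and the utility function are hypothetical values chosen for illustration, not figures from the slides. Because a discrete CDF only changes at its support points, checking the inequality at those points is enough.

```python
import math
from typing import Callable, Dict

# A discrete distribution maps each outcome value to its probability.
Distribution = Dict[float, float]


def cdf(dist: Distribution, t: float) -> float:
    """Return P(S <= t) for a discrete distribution."""
    return sum(p for x, p in dist.items() if x <= t)


def stochastically_dominates(d1: Distribution, d2: Distribution,
                             tol: float = 1e-12) -> bool:
    """True iff P(S1 <= t) <= P(S2 <= t) at every support point t."""
    points = sorted(set(d1) | set(d2))
    return all(cdf(d1, t) <= cdf(d2, t) + tol for t in points)


def expected_utility(dist: Distribution,
                     utility: Callable[[float], float]) -> float:
    """Return sum_x p(x) * U(x), the discrete form of the EU integral."""
    return sum(p * utility(x) for x, p in dist.items())


def increasing_utility(x: float) -> float:
    # Hypothetical utility that is monotonically increasing in x.
    return -math.exp(-x)


# Hypothetical negative-cost distributions for two sites (assumed values).
s1 = {-3.0: 0.5, -2.5: 0.5}
s2 = {-4.0: 0.5, -3.0: 0.5}

print("S1 dominates S2:", stochastically_dominates(s1, s2))
print("S2 dominates S1:", stochastically_dominates(s2, s1))
print("EU(S1) >= EU(S2):",
      expected_utility(s1, increasing_utility)
      >= expected_utility(s2, increasing_utility))
```

The final comparison illustrates the consequence stated on the slide: once dominance holds, any nondecreasing utility ranks the dominating choice at least as high.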
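The decision-network algorithm from the earlier slide (for every value of the action node, compute the expected utility given the action and evidence, then return the MEU action) can be sketched as a simple loop. This sketch assumes the network inference has already been done, so each action comes with a table P(outcome | action, evidence). The site names, outcome labels, probabilities, and utilities are all hypothetical placeholders, not values from the airport example.

```python
from typing import Dict, Mapping, Tuple

# Hypothetical P(outcome | action, evidence) tables, one per action value.
HYPOTHETICAL_OUTCOMES: Dict[str, Dict[str, float]] = {
    "site_1": {"low_cost": 0.7, "high_cost": 0.3},
    "site_2": {"low_cost": 0.4, "high_cost": 0.6},
}

# Hypothetical normalized utilities for each outcome.
HYPOTHETICAL_UTILITY: Dict[str, float] = {"low_cost": 1.0, "high_cost": 0.0}


def meu_action(outcome_dists: Mapping[str, Mapping[str, float]],
               utility: Mapping[str, float]) -> Tuple[str, float]:
    """Return the action with maximum expected utility and that utility."""
    if not outcome_dists:
        raise ValueError("need at least one action")
    best_action, best_eu = None, float("-inf")
    for action, dist in outcome_dists.items():
        # E(utility | action, evidence) = sum over outcomes of p * U(outcome)
        eu = sum(p * utility[outcome] for outcome, p in dist.items())
        if eu > best_eu:
            best_action, best_eu = action, eu
    return best_action, best_eu


best, best_eu = meu_action(HYPOTHETICAL_OUTCOMES, HYPOTHETICAL_UTILITY)
print("MEU action:", best, "with expected utility", best_eu)
```

In a full decision network the per-action tables would come from probabilistic inference over the chance nodes; a lookup table stands in for that step here.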