UMBC CMCS 471 - Making Simple Decisions

Making Simple Decisions
Chapter 16

Some material borrowed from Jean-Claude Latombe and Daphne Koller by way of Marie desJardins.

Topics
• Decision making under uncertainty
  – Utility theory and rationality
  – Expected utility
  – Utility functions
  – Multiattribute utility functions
  – Preference structures
  – Decision networks
  – Value of information

Uncertain Outcomes of Actions
• Some actions may have uncertain outcomes
  – Action: spend $10 on a lottery ticket that pays $1000 to the winner
  – Outcomes: {win, not-win}
• Each outcome is associated with some merit (utility)
  – Win: gain $990
  – Not-win: lose $10
• There is a probability distribution associated with the outcomes of this action: (0.0001, 0.9999)
• Should I take this action?

Expected Utility
• Random variable X with n values x1, …, xn and distribution (p1, …, pn)
  – X is the outcome of performing action A (i.e., the state reached after A is taken)
  – E.g., P(X = win | play) = 0.0001; P(X = not-win | play) = 0.9999
• Function U of X
  – U is a mapping from states to numerical utilities (values)
• The expected utility of performing action A is
  EU[A] = Σ_{i=1..n} p(x_i | A) U(x_i)
  – E.g., EU[play] = 0.0001 × 990 + 0.9999 × (−10) = 0.099 − 9.999 = −9.9

One State/One Action Example
• From state s0, action A1 leads to states s1, s2, s3 with probabilities P(s) = 0.2, 0.7, 0.1 and utilities U(s) = 100, 50, 70
• EU(S0 | A1) = 100 × 0.2 + 50 × 0.7 + 70 × 0.1 = 20 + 35 + 7 = 62

One State/Two Actions Example
• A second action A2 from s0 leads to s2 (U = 50) and a new state s4 (U = 80) with probabilities 0.2 and 0.8
• EU1(S0 | A1) = 62
• EU2(S0 | A2) = 50 × 0.2 + 80 × 0.8 = 74
• EU(S0) = max{EU1(S0 | A1), EU2(S0 | A2)} = 74

MEU Principle
• Decision theory: a rational agent should choose the action that maximizes the agent's expected utility
• Maximizing expected utility (MEU) is a normative criterion for rational choice of actions
• Requires a complete model of:
  – Actions
  – States
  – Utilities
• Even with a complete model, the computation can be intractable

Comparing outcomes
• Which is better?
  A = being rich and sunbathing where it's warm
  B = being rich and sunbathing where it's cool
  C = being poor and sunbathing where it's warm
  D = being poor and sunbathing where it's cool
• Multiattribute utility theory
  – A clearly dominates B: A > B, A > C, B > D, C > D, A > D. What about B vs. C?
  – Simplest case: additive value function (just add the individual attribute utilities)
  – Others use weighted utilities, based on the relative importance of the attributes
  – Learning the combined utility function (similar to a joint probability table)

Decision networks
• Extend Bayesian networks to handle actions and utilities
  – a.k.a.
influence diagrams
• Make use of Bayesian network inference to compute posterior probability distributions
• Useful application: value of information

Decision network representation
• Chance nodes: random variables, as in Bayesian networks
• Decision nodes: actions that the decision maker can take
• Utility/value nodes: the utility of the outcome state

R&N example
[Figure from Russell & Norvig omitted in this preview.]

Evaluating decision networks
• Set the evidence variables for the current state.
• For each possible value of the decision node (assume just one decision node):
  – Set the decision node to that value.
  – Calculate the posterior probabilities for the parent nodes of the utility node, using BN inference.
  – Calculate the resulting utility for the action.
• Return the action with the highest utility.

Exercise: Umbrella network
• Nodes: Weather → Forecast; Umbrella (take/don't take) → Lug umbrella; Happiness depends on Lug umbrella and Weather
• Prior: P(rain) = 0.4
• Sensor model P(f | w):
  f      w        P(f | w)
  sunny  rain     0.3
  rainy  rain     0.7
  sunny  no rain  0.8
  rainy  no rain  0.2
• Decision model: P(lug | take) = 1.0; P(~lug | ~take) = 1.0
• Utilities:
  U(lug, rain)   = −25
  U(lug, ~rain)  = 0
  U(~lug, rain)  = −100
  U(~lug, ~rain) = 100

Value of Perfect Information (VPI)
• How much is it worth to observe (with certainty) a random variable X?
• Suppose the agent's current knowledge is E.
• The value of the current best action α is:
  EU(α | E) = max_A Σ_i U(Result_i(A)) p(Result_i(A) | E, Do(A))
• The value of the new best action α′ after observing the value of X is:
  EU(α′ | E, X) = max_A Σ_i U(Result_i(A)) p(Result_i(A) | E, X, Do(A))
• …But we don't know the value of X yet, so we have to sum over its possible values
• The value of perfect information for X is therefore:
  VPI(X) = ( Σ_k p(x_k | E) EU(α_{x_k} | x_k, E) ) − EU(α | E)
  – p(x_k | E): probability of each value of X
  – EU(α_{x_k} | x_k, E): expected utility of the best action given that value of X
  – EU(α | E): expected utility of the best action if we don't know X (i.e., currently)

VPI exercise: Umbrella network
• Same network as above: P(rain) = 0.4, sensor model P(f | w) as given, and utilities U over (lug, rain) as given
• What's the value of knowing the weather forecast before leaving?
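The expected-utility arithmetic in the one-state examples is easy to check mechanically. Below is a minimal Python sketch (function and variable names are mine); it assumes A2 reaches s2 (U = 50) and s4 (U = 80), which is consistent with the slide's EU2 = 74:

```python
# Expected utility of an action: EU[A] = sum_i p(s_i | A) * U(s_i).
# Probabilities and utilities are taken from the slides' examples.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

actions = {
    "A1": [(0.2, 100), (0.7, 50), (0.1, 70)],   # s1, s2, s3
    "A2": [(0.2, 50), (0.8, 80)],               # s2, s4 (assumed pairing)
}

eus = {name: expected_utility(outs) for name, outs in actions.items()}
best = max(eus, key=eus.get)   # MEU principle: choose the max-EU action

print(eus)   # EU(S0|A1) = 62, EU(S0|A2) = 74
print(best)  # A2
```

The MEU step is just the final `max`: once each action's expected utility is known, the rational choice is mechanical.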

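The evaluation procedure above can be sketched on the umbrella network. This is a hedged Python sketch, not the course's reference solution: the structure and names are mine, while the prior, sensor model, and utilities come from the slides. Since the decision deterministically sets Lug umbrella, trying each decision value reduces to comparing two expectations over Weather.

```python
# Evaluating the umbrella decision network:
#   Weather -> Forecast (sensor model), Umbrella decision -> Lug,
#   Happiness = U(lug?, weather).

P_RAIN = 0.4
P_FORECAST = {  # P(forecast | weather)
    ("sunny", "rain"): 0.3, ("rainy", "rain"): 0.7,
    ("sunny", "no rain"): 0.8, ("rainy", "no rain"): 0.2,
}
UTILITY = {  # U(lug?, weather)
    (True, "rain"): -25, (True, "no rain"): 0,
    (False, "rain"): -100, (False, "no rain"): 100,
}

def p_rain_given(forecast=None):
    """Posterior P(rain | forecast) via Bayes' rule; the prior if no evidence."""
    if forecast is None:
        return P_RAIN
    joint_rain = P_FORECAST[(forecast, "rain")] * P_RAIN
    joint_dry = P_FORECAST[(forecast, "no rain")] * (1 - P_RAIN)
    return joint_rain / (joint_rain + joint_dry)

def best_decision(forecast=None):
    """Try each decision value; return (lug?, expected utility) of the best."""
    pr = p_rain_given(forecast)
    eu = {lug: pr * UTILITY[(lug, "rain")] + (1 - pr) * UTILITY[(lug, "no rain")]
          for lug in (True, False)}
    lug = max(eu, key=eu.get)
    return lug, eu[lug]

print(best_decision())         # no evidence: leave the umbrella, EU = 20
print(best_decision("rainy"))  # P(rain | rainy) = 0.7: take it, EU = -17.5
```

With no forecast observed, EU(take) = 0.4·(−25) + 0.6·0 = −10 and EU(~take) = 0.4·(−100) + 0.6·100 = 20, so the agent leaves the umbrella at home.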

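The VPI formula can likewise be worked through for the umbrella exercise. A self-contained Python sketch (helper names are mine; the numbers are the slide's): average the best achievable EU over the forecast's possible values, then subtract the best EU without the forecast.

```python
# VPI(Forecast) for the umbrella network: how much is seeing the
# forecast worth before deciding whether to take the umbrella?

P_RAIN = 0.4
P_F = {("sunny", True): 0.3, ("rainy", True): 0.7,    # P(forecast | rain?)
       ("sunny", False): 0.8, ("rainy", False): 0.2}
U = {(True, True): -25, (True, False): 0,             # U(lug?, rain?)
     (False, True): -100, (False, False): 100}

def max_eu(p_rain):
    """Expected utility of the best umbrella decision given P(rain)."""
    return max(p_rain * U[(lug, True)] + (1 - p_rain) * U[(lug, False)]
               for lug in (True, False))

baseline = max_eu(P_RAIN)          # best without the forecast: EU = 20

vpi = -baseline
for f in ("sunny", "rainy"):
    p_f = P_F[(f, True)] * P_RAIN + P_F[(f, False)] * (1 - P_RAIN)
    p_rain_f = P_F[(f, True)] * P_RAIN / p_f          # Bayes' rule
    vpi += p_f * max_eu(p_rain_f)   # weight best EU by P(forecast value)

print(vpi)  # ~9: the forecast is worth about 9 utility units
```

By hand: P(sunny) = 0.6 with best EU 60 (leave it), P(rainy) = 0.4 with best EU −17.5 (take it), so VPI = 0.6·60 + 0.4·(−17.5) − 20 = 9.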