Rutgers University CS 440 - Decisions under uncertainty



Slide titles:
1. Decisions under uncertainty
2. Outline
3. Decision making
4. Simple decision problem
5. Value function
6. Preferences
7. Desired properties for preferences over lotteries
8. Properties of (rational) preference
9. Preference & expected utility
10. Properties of utility
11. Utility scales
12. Utility vs Money
13. Attitudes toward risk
14. Human judgment under uncertainty
15. Student group utility
16. Technology forecasting
17. Maximizing expected utility
18. Multi-attribute utilities
19. Decision graphs / Influence diagrams
20. Optimal policy
21. Slide 21
22. Slide 22
23. Value of information
24. Optimal policy with additional evidence
25. Slide 25
26. Value of perfect information
27. VPI (cont'd)
28. Properties of VPI
29. Example
30. Example (cont'd)
31. Sequential Decisions
32. Sequential decisions
33. Partially-observed Markov decision processes (POMDP)
34. POMDP Problems
35. Example (POMDPs)
36. Example #2 (POMDP)
37. Markov decision processes (MDPs)
38. Examples
39. MDP Fundamentals
40. Utility & utility maximization in MDPs
41. Slide 41
42. Bellman equations & value iteration algorithm
43. Proof of Bellman update equation
44. Slide 44
45. Computing MDP state utilities: Value iteration
46. Slide 46
47. Policy iteration

Rutgers CS440, Fall 2003
Decisions under uncertainty
Reading: Ch. 16, AIMA 2nd Ed.

Outline
• Decisions, preferences, utility functions
• Influence diagrams
• Value of information

Decision making
• Decisions: an irrevocable allocation of domain resources
• Decisions should be made so as to maximize expected utility
• Questions:
  - Why make decisions based on average or expected utility?
  - Why can one assume that utility functions exist?
  - Can an agent act rationally by expressing preferences between states without giving them numeric values?
  - Can every preference structure be captured by assigning a single number to every state?

Simple decision problem
• Party decision problem: inside or outside?

  Action   State   Outcome
  IN       Dry     Regret
  IN       Wet     Relief
  OUT      Dry     Perfect!
  OUT      Wet     Disaster

Value function
• Numerical score over all possible states of the world

  Action   Weather   Value
  OUT      Dry       $100
  IN       Wet       $60
  IN       Dry       $50
  OUT      Wet       $0

Preferences
• Agent chooses among prizes (A, B, …) and lotteries (situations with uncertain prizes).

  L1 = (.2, $40,000; .8, $0)
  L2 = (.25, $30,000; .75, $0)

  A ≻ B: A is preferred to B
  A ≺ B: B is preferred to A
  A ~ B: indifference between A and B

Desired properties for preferences over lotteries
• If the agent prefers $100 over $0 and p < q, then it should prefer L2 to L1 (L1 ≺ L2), where

  L1 = (p, $100; 1-p, $0)
  L2 = (q, $100; 1-q, $0)

Properties of (rational) preference
Lead to rational agent behavior
1. Orderability: (A ≻ B) ∨ (A ≺ B) ∨ (A ~ B)
2. Transitivity: (A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)
3. Continuity: A ≻ B ≻ C ⇒ ∃p, (p, A; 1-p, C) ~ B
4. Substitutability: A ~ B ⇒ ∀p, (p, A; 1-p, C) ~ (p, B; 1-p, C)
5. Monotonicity: A ≻ B ⇒ ( p > q ⇔ (p, A; 1-p, B) ≻ (q, A; 1-q, B) )

Preference & expected utility
• Properties of preference lead to existence (Ramsey 1931, von Neumann & Morgenstern 1944) of a utility function U such that, for

  L1 = (p, $100; 1-p, $0)
  L2 = (q, $100; 1-q, $0)

  L1 ≺ L2  IFF  p U($100) + (1-p) U($0) < q U($100) + (1-q) U($0)

  The left-hand side is the expected utility of L1, EU(L1); the right-hand side is the expected utility of L2, EU(L2).

Properties of utility
• Utility is a function that maps states to real numbers
• Standard approach to assessing utilities of states:
1. Compare state A to a standard lottery L = (p, Ubest; 1-p, Uworst)
   Ubest: best possible event
   Uworst: worst possible event
2. Adjust p until A ~ L

  Example: $30 ~ (0.999999, continue as before; 0.000001, instant death)

Utility scales
• Normalized utilities: Ubest = 1.0, Uworst = 0.0
• Micromorts: one-millionth chance of death
  - useful for Russian roulette, paying to reduce product risks, etc.
• QALYs: quality-adjusted life years
  - useful for medical decisions involving substantial risk
• Note: behavior is invariant w.r.t. positive linear transformation
  U'(s) = A U(s) + B, A > 0

Utility vs Money
• Utility is NOT monetary payoff

  L1 = (.8, $40,000; .2, $0)
  L2 = (1, $30,000; 0, $0)

  EMV(L) = Σ_i p_i MV_i
  EMV(L1) = $32,000 > EMV(L2) = $30,000

Attitudes toward risk
[Figure: utility U($ reward) plotted against $ reward for the lottery L = (0.5, $1000; 0.5, $0). Marked on the reward axis: $400, $500, $1000. Marked on the utility axis: U($500), U(L). Labels: certain monetary equivalent, insurance risk premium.]
• U concave: risk averse
• U linear: risk neutral
• U convex: risk seeking

Human judgment under uncertainty
• Is decision theory compatible with human judgment under uncertainty?
• Are people "experts" in reasoning under uncertainty? How well do they perform? What kind of heuristics do they use?

  Choice 1: (.2, $40,000; .8, $0) versus (.25, $30,000; .75, $0)
  Choice 2: (.8, $40,000; .2, $0) versus (1, $30,000; 0, $0)

  Taking U($0) = 0, preferring the first lottery in Choice 1 means .2 U($40k) > .25 U($30k), which is equivalent to .8 U($40k) > U($30k).
  Preferring the sure $30,000 in Choice 2 means .8 U($40k) < U($30k).
  No single utility function satisfies both inequalities.

Student group utility
[Figure: p (0 to 1) plotted against $ amount (500, 2000, 4000, 6000, 8000, 10000).]
• For each $ amount, adjust p until half the class votes for lottery ($10000)

Technology forecasting
• "I think there is a world market for about five computers."
  - Thomas J. Watson, Sr., Chairman of the Board of IBM, 1943
• "There doesn't seem to be any real limit to the growth of the computer industry."
  - Thomas J. Watson, Sr., Chairman of the Board of IBM, 1968

Maximizing expected utility

  P(Dry) = 0.7, P(Wet) = 0.3

  Value   Utility
  $50     U($50) = 0.632
  $60     U($60) = 0.699
  $100    U($100) = 0.865
  $0      U($0) = 0

  EU(IN) = 0.7 * 0.632 + 0.3 * 0.699 = 0.6521
  EU(OUT) = 0.7 * 0.865 + 0.3 * 0 = 0.6055
  Action* = arg MEU(IN, OUT) = arg max{ EU(IN), EU(OUT) } = IN

Multi-attribute utilities
• Many aspects of an outcome combine to determine our preferences:
  - vacation planning: cost, flying time, beach quality, food quality, etc.
  - medical decision making: risk of death (micromort), quality of life (QALY), cost of treatment, etc.
• For rational decision making, all relevant factors must be combined into a single utility function:
  U(a, b, c, …) = f[ f1(a), f2(b), … ], where f is a simple function such as addition
• f = + in the case of mutual preference independence, which occurs when it is always preferable to increase the value of an attribute given that all other attributes are fixed

Decision graphs / Influence diagrams
[Figure: influence diagram with nodes earthquake, burglary, alarm, call, newscast, goods recovered, miss meeting; action node "go home?"; utility node "Utility". File: flood_decision.net]

Optimal policy
earthquake
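
The "Maximizing expected utility" calculation for the party problem can be reproduced with a short Python sketch. It assumes only the probabilities and utilities printed on that slide; the names P_WEATHER, UTILITY, expected_utility and best_action are illustrative and do not come from the course materials.

```python
# Weather probabilities and outcome utilities as given on the slide.
P_WEATHER = {"Dry": 0.7, "Wet": 0.3}
UTILITY = {
    ("IN", "Dry"): 0.632,   # U($50)
    ("IN", "Wet"): 0.699,   # U($60)
    ("OUT", "Dry"): 0.865,  # U($100)
    ("OUT", "Wet"): 0.0,    # U($0)
}


def expected_utility(action):
    """EU(action) = sum over weather states of P(state) * U(action, state)."""
    return sum(p * UTILITY[(action, state)] for state, p in P_WEATHER.items())


def best_action(actions=("IN", "OUT")):
    """Return the action with maximum expected utility (the MEU choice)."""
    return max(actions, key=expected_utility)


if __name__ == "__main__":
    for action in ("IN", "OUT"):
        print(action, round(expected_utility(action), 4))
    print("Action* =", best_action())
```

With these numbers the sketch selects IN, matching the slide, even though OUT has the single best outcome: the 0.3 chance of the zero-utility outcome outweighs it.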

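The two lottery choices on the "Utility vs Money" and "Human judgment under uncertainty" slides can be checked numerically. The sketch below is a hedged illustration: the three candidate utility functions are assumptions chosen only because each has U($0) = 0, and the names emv, expected_utility and prefers_first are hypothetical helpers.

```python
import math

# Each lottery is a list of (probability, monetary payoff) pairs from the slides.
CHOICE_1 = ([(0.2, 40_000), (0.8, 0)], [(0.25, 30_000), (0.75, 0)])
CHOICE_2 = ([(0.8, 40_000), (0.2, 0)], [(1.0, 30_000), (0.0, 0)])


def emv(lottery):
    """Expected monetary value: sum of p_i * MV_i."""
    return sum(p * x for p, x in lottery)


def expected_utility(lottery, u):
    """Expected utility of a lottery under utility function u."""
    return sum(p * u(x) for p, x in lottery)


def prefers_first(pair, u):
    """True if the first lottery of the pair has the higher expected utility."""
    first, second = pair
    return expected_utility(first, u) > expected_utility(second, u)


# Assumed example utilities, all with U(0) = 0 (not taken from the slides).
CANDIDATES = {"linear": lambda x: x, "sqrt": math.sqrt, "log1p": math.log1p}

if __name__ == "__main__":
    print("EMV:", emv(CHOICE_2[0]), emv(CHOICE_2[1]))
    for name, u in CANDIDATES.items():
        print(name, prefers_first(CHOICE_1, u), prefers_first(CHOICE_2, u))
```

For every utility function with U($0) = 0, the two answers agree, because the first comparison is the second one scaled by 1/4. An agent who picks the $40,000 lottery in the first choice and the sure $30,000 in the second therefore cannot be an expected-utility maximizer.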

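The certain monetary equivalent and risk premium from the "Attitudes toward risk" slide can be sketched for the lottery L = (0.5, $1000; 0.5, $0). The utility functions below (square root, identity, square) are assumptions chosen to illustrate concave, linear and convex shapes; they are not the curve drawn on the slide, so the resulting numbers will differ from the figure's.

```python
import math

LOTTERY = [(0.5, 1000.0), (0.5, 0.0)]  # (probability, payoff) pairs


def expected_value(lottery):
    """Expected monetary value of the lottery."""
    return sum(p * x for p, x in lottery)


def expected_utility(lottery, u):
    """Expected utility of the lottery under utility function u."""
    return sum(p * u(x) for p, x in lottery)


def certainty_equivalent(lottery, u, tol=1e-9):
    """Sure amount x with u(x) = EU(lottery), found by bisection.

    Assumes u is continuous and strictly increasing on the payoff range.
    """
    target = expected_utility(lottery, u)
    lo = min(x for _, x in lottery)
    hi = max(x for _, x in lottery)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if u(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def risk_premium(lottery, u):
    """Expected value minus certainty equivalent; positive means risk averse."""
    return expected_value(lottery) - certainty_equivalent(lottery, u)


# Assumed example utilities for the three risk attitudes.
SHAPES = {"concave": math.sqrt, "linear": lambda x: x, "convex": lambda x: x * x}

if __name__ == "__main__":
    for name, u in SHAPES.items():
        print(name, round(certainty_equivalent(LOTTERY, u), 2),
              round(risk_premium(LOTTERY, u), 2))
```

A concave utility yields a certainty equivalent below the expected value (positive premium, risk averse), a linear one yields a zero premium, and a convex one yields a negative premium (risk seeking).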