Lecture Eleven: Probability Models

Outline
    Bayesian Probability
    Duration Models

Bayesian Probability: Facts
    Incidence of the disease in the population is one in a thousand.
    The probability of testing positive if you have the disease is 99 out of 100.
    The probability of testing positive if you do not have the disease is 2 in 100.

Joint and Marginal Probabilities

              Sick (S)           Healthy (H)
    Test +    $\Pr(+ \cap S)$    $\Pr(+ \cap H)$    $\Pr(+)$
    Test -    $\Pr(- \cap S)$    $\Pr(- \cap H)$    $\Pr(-)$
              $\Pr(S)$           $\Pr(H)$

Filling In Our Facts

              Sick (S)           Healthy (H)
    Test +
    Test -
              $\Pr(S) = 0.001$   $\Pr(H) = 0.999$

Using Conditional Probability
    $\Pr(+ \cap H) = \Pr(+ \mid H)\,\Pr(H) = 0.02 \times 0.999 = 0.01998$
    $\Pr(+ \cap S) = \Pr(+ \mid S)\,\Pr(S) = 0.99 \times 0.001 = 0.00099$

Filling In Our Facts

              Sick (S)                     Healthy (H)
    Test +    $\Pr(+ \cap S) = 0.00099$    $\Pr(+ \cap H) = 0.01998$
    Test -
              $\Pr(S) = 0.001$             $\Pr(H) = 0.999$

By Sum and By Difference
The test-positive margin is the sum of its row; each test-negative cell is the column total minus the test-positive cell.

              Sick (S)                     Healthy (H)
    Test +    $\Pr(+ \cap S) = 0.00099$    $\Pr(+ \cap H) = 0.01998$    $\Pr(+) = 0.02097$
    Test -    $\Pr(- \cap S) = 0.00001$    $\Pr(- \cap H) = 0.97902$    $\Pr(-) = 0.97903$
              $\Pr(S) = 0.001$             $\Pr(H) = 0.999$

False Positive Paradox
Probability of being sick if you test positive: $\Pr(S \mid +)$?
From conditional probability:
    $\Pr(S \mid +) = \Pr(S \cap +) / \Pr(+) = 0.00099 / 0.02097$
    $\Pr(S \mid +) = 0.0472$
Even after a positive test, the probability of being sick is under 5 percent, because healthy people vastly outnumber sick people.

Bayesian Probability By Formula
    $\Pr(S \mid +) = \Pr(S \cap +) / \Pr(+) = \Pr(+ \mid S)\,\Pr(S) / \Pr(+)$
    where $\Pr(+) = \Pr(+ \mid S)\,\Pr(S) + \Pr(+ \mid H)\,\Pr(H)$
Using our facts:
    $\Pr(S \mid +) = (0.99 \times 0.001) / (0.99 \times 0.001 + 0.02 \times 0.999)$
    $\Pr(S \mid +) = 0.00099 / (0.00099 + 0.01998)$
    $\Pr(S \mid +) = 0.00099 / 0.02097 = 0.0472$

Duration Models
    Exploratory (graphical) estimates: Kaplan-Meier
    Functional form estimates: exponential distribution

Duration of Post-War Economic Expansions in Months

    Trough         Peak            Duration
    Oct 1945       Nov 1948          37
    Oct 1949       July 1953         45
    May 1954       August 1957       39
    April 1958     April 1960        24
    Feb 1961       Dec 1969         106
    Nov 1970       Nov 1973          36
    March 1975     January 1980      58
    July 1980      July 1981         12
    Nov 1982       July 1990         92
    March 1991     March 2001       120

Estimated Survivor Function for Ten Post-War Expansions

Kaplan-Meier Estimate of Survivor Function
    Survivor function = (# at risk - # ending) / (# at risk)
The ratio is applied cumulatively: at each duration, it multiplies the survivor estimate from the previous duration.

    Duration    Ending    At Risk    Survivor
        0         0         10         1.0
       12         1         10         0.9
       24         1          9         0.8
       36         1          8         0.7
       37         1          7         0.6
       39         1          6         0.5
       45         1          5         0.4
       58         1          4         0.3
       92         1          3         0.2
      106         1          2         0.1
      120         1          1         0.0

[Figure 2: Estimated Survivor Function for Post-War Expansions. Survivor function plotted against duration in months.]
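The Kaplan-Meier table above can be reproduced with a few lines of Python. This is a minimal sketch, not the lecture's own method of computation: the function name `kaplan_meier` and the row layout are illustrative choices, and it assumes every expansion is observed to its end (no censoring), as is the case for the ten durations listed.

```python
from collections import Counter


def kaplan_meier(durations):
    """Kaplan-Meier survivor estimates for fully observed (uncensored) spells.

    Returns rows of (duration, ending, at_risk, survivor), starting with a
    row for duration 0 at which nothing has ended yet.
    """
    endings = Counter(durations)      # number of spells ending at each duration
    at_risk = len(durations)          # spells still running just before t
    survivor = 1.0
    rows = [(0, 0, at_risk, survivor)]
    for t in sorted(endings):
        ending = endings[t]
        # Multiply by the fraction of at-risk spells that survive past t.
        survivor *= (at_risk - ending) / at_risk
        rows.append((t, ending, at_risk, round(survivor, 10)))
        at_risk -= ending             # spells that ended leave the risk set
    return rows


# Durations (months) of the ten post-war expansions from the table.
expansions = [37, 45, 39, 24, 106, 36, 58, 12, 92, 120]

for duration, ending, at_risk, survivor in kaplan_meier(expansions):
    print(f"{duration:>8} {ending:>6} {at_risk:>7} {survivor:>8.1f}")
```

Because no spell is censored, the running product collapses to the simple fraction of expansions still alive, which is why the survivor column falls in even steps of 0.1.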
Exponential Distribution
Density:
    $f(t) = \lambda \exp(-\lambda t), \quad 0 \le t < \infty$
Cumulative distribution function $F(t)$:
    $F(t) = \int_0^t f(u)\,du = \int_0^t \lambda \exp(-\lambda u)\,du$
    $F(t) = \left[-\exp(-\lambda u)\right]_0^t$
    $F(t) = -\left[\exp(-\lambda t) - \exp(0)\right]$
    $F(t) = 1 - \exp(-\lambda t)$
Survivor function:
    $S(t) = 1 - F(t) = \exp(-\lambda t)$
Taking logarithms:
    $\ln S(t) = -\lambda t$

[Figure: Postwar Expansions. Ln survivor function plotted against duration (months), with fitted line $y = -0.0217x + 0.1799$, $R^2 = 0.9533$.]

So, comparing the fitted line with $\ln S(t) = -\lambda t$, the slope implies $\lambda \approx 0.0217$ per month.

Exponential Distribution (Cont.)
Mean:
    $\int_0^\infty t\,f(t)\,dt = 1/\lambda$
Memoryless feature:
Duration conditional on surviving until $t = \tau$:
    $\mathrm{DURC}(\tau) = \frac{1}{S(\tau)} \int_\tau^\infty t\,f(t)\,dt = \tau + 1/\lambda$
Expected remaining duration is the duration conditional on surviving until time $\tau$, i.e. $\mathrm{DURC}(\tau)$, minus $\tau$:
    $\mathrm{DURC}(\tau) - \tau = 1/\lambda$
which is equal to the overall mean, so the distribution is memoryless.

Exponential Distribution (Cont.)
The hazard rate or function $h(t)$ is the probability of failure conditional on survival until that time, and is the ratio of the density function to the survivor function. It is a constant for the exponential:
    $h(t) = f(t)/S(t) = \lambda \exp(-\lambda t) / \exp(-\lambda t) = \lambda$

Model Building
Reference: Ch. 20

20.2 Polynomial Models
There are models where the independent variables ($x_i$) may appear as functions of a smaller number of predictor variables. Polynomial models are one such example.

Polynomial Models with One Predictor Variable
The general linear model
    $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
becomes, with $x_i = x^i$,
    $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_p x^p + \varepsilon$

Polynomial Models with One Predictor Variable
First order model ($p = 1$):
    $y = \beta_0 + \beta_1 x + \varepsilon$
Second order model ($p = 2$):
    $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
    [Figures: curves for $\beta_2 < 0$ and $\beta_2 > 0$]

Polynomial Models with One Predictor Variable
Third order model ($p = 3$):
    $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \varepsilon$
    [Figures: curves for $\beta_3 < 0$ and $\beta_3 > 0$]

Polynomial Models with Two Predictor Variables
First order model:
    $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
    [Figures: $y$ plotted against $x_1$ and $x_2$]

20.3 Nominal Independent Variables
In many real-life situations one or more independent variables are nominal. Including nominal variables in a regression analysis model is done via indicator variables. An indicator variable ($I$) can assume one out of two values, "zero" or "one":
    I = 1 if a first condition out of two is met
    I = 0 if a second condition out of two is met
Examples:
    I = 1 if the temperature was below 50 degrees; 0 if the temperature was 50 degrees or more
    I = 1 if data were collected before 1980; 0 if data were collected after 1980
    I = 1 if a degree earned is in Finance; 0 if a degree earned is not in Finance

Nominal Independent Variables; Example: Auction Car Price (II)
Example 18.2, revised (Xm18-02a)
Recall: a car dealer wants to predict the auction price of a car. The dealer now believes that odometer reading and car color are variables that affect a car's price. Three color categories are considered:
    White
    Silver
    Other colors
Note: color is a nominal variable.

Nominal Independent Variables; Example: Auction Car Price (II)
Example 18.2, revised (Xm18-02b)
    I1 = 1 if the color is white; 0 if the color is not white
    I2 = 1 if the color is silver; 0 if the color is not silver
The category "Other colors" is defined by I1 = 0, I2 = 0.

How Many Indicator Variables?
Note: to represent the situation of three possible colors we need only two indicator variables.
Conclusion: to represent a nominal variable with $m$ possible categories, we must create $m - 1$ indicator variables.

Nominal Independent Variables; Example: Auction Car Price
Solution: the proposed model is
    $y = \beta_0 + \beta_1(\text{Odometer}) + \beta_2 I_1 + \beta_3 I_2 + \varepsilon$
The data:

    Price    Odometer    I1    I2
    14636    37388        1     0     White car
    14122    44758        1     0     White car
    14016    45833        0     0     Other color
    15590    30862        0     0     Other color
    15568    31705        0     1     Silver color
    14718    34010        0     1     Silver color

Example: Auction Car Price, The Regression Equation
From Excel (Xm18-02b) we get the regression equation
    PRICE = 16701 - .0555(Odometer) + 90.48(I1) + 295.48(I2)

[Figure: Price plotted against Odometer, showing three parallel lines.]
    The equation for a silver color car: Price = 16996.48 - .0555(Odometer)
    The equation for a white color car: Price = 16791.48 - .0555(Odometer)
    The equation for an "other color" car: Price = 16701 - .0555(Odometer)

Substituting the indicator values:
    Silver: Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(1)
    White: Price = 16701 - .0555(Odometer) + 90.48(1) + 295.48(0)
    Price …
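The way each color's equation follows from the single regression equation can be checked by substituting the indicator values in code. The sketch below is illustrative only: the coefficients are the ones quoted from the Excel output above, while the names `COEF`, `indicators`, and `predicted_price` are hypothetical helpers introduced here, not part of the lecture or of any library.

```python
# Coefficients as quoted in the regression equation above.
COEF = {
    "intercept": 16701.0,
    "odometer": -0.0555,
    "white": 90.48,    # coefficient on I1
    "silver": 295.48,  # coefficient on I2
}


def indicators(color):
    """Map a color to (I1, I2); any color other than white or silver is (0, 0)."""
    c = color.strip().lower()
    return (1 if c == "white" else 0, 1 if c == "silver" else 0)


def predicted_price(odometer, color):
    """Evaluate the fitted equation for one car."""
    i1, i2 = indicators(color)
    return (COEF["intercept"]
            + COEF["odometer"] * odometer
            + COEF["white"] * i1
            + COEF["silver"] * i2)


# Setting the odometer to 0 recovers each color's intercept; the slope on
# the odometer reading is the same for all three colors.
for color in ("white", "silver", "other"):
    print(f"{color:>6}: intercept = {predicted_price(0, color):.2f}, "
          f"slope = {COEF['odometer']}")
```

Only two indicators are needed for three colors because "other" is the baseline: its intercept is the regression constant, and the white and silver coefficients measure each color's price difference from that baseline at any given odometer reading.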