Lecture Eleven: Probability Models

Outline
  Bayesian Probability
  Duration Models

Bayesian Probability: Facts
  Incidence of the disease in the population is one in a thousand.
  The probability of testing positive if you have the disease is 99 out of 100.
  The probability of testing positive if you do not have the disease is 2 in 100.

Joint and Marginal Probabilities

              Sick (S)         Healthy (H)
  Test +      Pr(+ ∩ S)        Pr(+ ∩ H)        Pr(+)
  Test -      Pr(- ∩ S)        Pr(- ∩ H)        Pr(-)
              Pr(S)            Pr(H)

Filling In Our Facts

              Sick (S)         Healthy (H)
  Test +
  Test -
              Pr(S) = 0.001    Pr(H) = 0.999

Using Conditional Probability

  $\Pr(+ \cap H) = \Pr(+ \mid H)\,\Pr(H) = 0.02 \times 0.999 = 0.01998$
  $\Pr(+ \cap S) = \Pr(+ \mid S)\,\Pr(S) = 0.99 \times 0.001 = 0.00099$

Filling In Our Facts

              Sick (S)              Healthy (H)
  Test +      Pr(+ ∩ S) = 0.00099   Pr(+ ∩ H) = 0.01998
  Test -
              Pr(S) = 0.001         Pr(H) = 0.999

By Sum and By Difference

  By sum across the Test + row: $\Pr(+) = 0.00099 + 0.01998 = 0.02097$.
  By difference from the column totals: $\Pr(- \cap S) = 0.001 - 0.00099 = 0.00001$ and $\Pr(- \cap H) = 0.999 - 0.01998 = 0.97902$, so $\Pr(-) = 0.97903$.

False Positive Paradox

  Probability of being sick if you test positive: $\Pr(S \mid +)$
  From conditional probability:
  $\Pr(S \mid +) = \Pr(S \cap +) / \Pr(+) = 0.00099 / 0.02097 = 0.0472$
  Even after a positive test, the chance of being sick is under 5 percent, because healthy people vastly outnumber sick people.

Bayesian Probability By Formula

  $\Pr(S \mid +) = \dfrac{\Pr(S \cap +)}{\Pr(+)} = \dfrac{\Pr(+ \mid S)\,\Pr(S)}{\Pr(+)}$
  where $\Pr(+) = \Pr(+ \mid S)\,\Pr(S) + \Pr(+ \mid H)\,\Pr(H)$.
  Using our facts:
  $\Pr(S \mid +) = \dfrac{0.99 \times 0.001}{0.99 \times 0.001 + 0.02 \times 0.999} = \dfrac{0.00099}{0.00099 + 0.01998} = \dfrac{0.00099}{0.02097} = 0.0472$

Duration Models
  Exploratory (graphical) estimates: Kaplan-Meier
  Functional form estimates: exponential distribution

Duration of Post-War Economic Expansions in Months

Estimated Survivor Function for Ten Post-War Expansions

Kaplan-Meier Estimate of Survivor Function

  Survivor function = (# at risk - # ending) / # at risk, with the ratios multiplied together over successive durations.

  Duration (months)   Ending   At Risk   Survivor
          0             0        10        1.0
         12             1        10        0.9
         24             1         9        0.8
         36             1         8        0.7
         37             1         7        0.6
         39             1         6        0.5
         45             1         5        0.4
         58             1         4        0.3
         92             1         3        0.2
        106             1         2        0.1
        120             1         1        0.0

[Figure 2: Estimated Survivor Function for Post-War Expansions. Vertical axis: Survivor Function; horizontal axis: Duration in Months.]

Exponential Distribution

  Density: $f(t) = \lambda \exp(-\lambda t), \quad 0 \le t < \infty$
  Cumulative distribution function $F(t)$:
  $F(t) = \int_0^t f(u)\,du = \int_0^t \lambda \exp(-\lambda u)\,du$
  $F(t) = \left[-\exp(-\lambda u)\right]_0^t$
  $F(t) = -\left[\exp(-\lambda t) - \exp(0)\right]$
  $F(t) = 1 - \exp(-\lambda t)$
  Survivor function: $S(t) = 1 - F(t) = \exp(-\lambda t)$
  Taking logarithms: $\ln S(t) = -\lambda t$

[Figure: Postwar Expansions. Vertical axis: Ln Survivor Function; horizontal axis: Duration (Months). Fitted line: $y = -0.0217x + 0.1799$, $R^2 = 0.9533$.]

  So the slope of the fitted line gives an estimate of $\lambda$ of roughly 0.0217 per month.

Exponential Distribution (Cont.)

  Mean: $\int_0^\infty t\,f(t)\,dt = 1/\lambda$
  Memoryless feature:
  Duration conditional on surviving until $t = \tau$:
  $\mathrm{DURC}(\tau) = \dfrac{1}{S(\tau)} \int_\tau^\infty t\,f(t)\,dt = \tau + 1/\lambda$
  The expected remaining duration is the duration conditional on surviving until time $\tau$, i.e. DURC, minus $\tau$, or $1/\lambda$, which is equal to the overall mean, so the distribution is memoryless.

Exponential Distribution (Cont.)

  The hazard rate or function, $h(t)$, is the probability of failure conditional on survival until that time, and is the ratio of the density function to the survivor function. It is a constant for the exponential:
  $h(t) = f(t)/S(t) = \lambda \exp(-\lambda t) / \exp(-\lambda t) = \lambda$

Model Building
  Reference: Ch 20 (Ch 18, 8th Ed.)

20.2 Polynomial Models

  There are models where the independent variables ($x_i$) may appear as functions of a smaller number of predictor variables. Polynomial models are one such example.

Polynomial Models with One Predictor Variable

  $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
  $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_p x^p + \varepsilon$

  First order model ($p = 1$): $y = \beta_0 + \beta_1 x + \varepsilon$
  Second order model ($p = 2$): $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$, shown for $\beta_2 < 0$ and for $\beta_2 > 0$
  Third order model ($p = 3$): $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \varepsilon$, shown for $\beta_3 < 0$ and for $\beta_3 > 0$

Polynomial Models with Two Predictor Variables

  First order model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$

[Figure: first order model with two predictors. Axes: $y$, $x_1$, $x_2$.]

20.3 Nominal Independent Variables

  In many real-life situations one or more independent variables are nominal.
  Including nominal variables in a regression analysis model is done via indicator variables.
  An indicator variable ($I$) can assume one out of two values, "zero" or "one".

  I = 1 if a first condition out of two is met; 0 if a second condition out of two is met
  I = 1 if the temperature was below 50 degrees; 0 if the temperature was 50 degrees or more
  I = 1 if data were collected before 1980; 0 if data were collected after 1980
  I = 1 if a degree earned is in Finance; 0 if a degree earned is not in Finance

Nominal Independent Variables; Example: Auction Car Price (II)

  Example 18.2, revised (Xm18-02a)
  Recall: A car dealer wants to predict the auction price of a car.
  The dealer believes now that odometer reading and the car color are variables that affect a car's price.
  Three color categories are considered: White, Silver, Other colors.
  Note: Color is a nominal variable.

Nominal Independent Variables; Example: Auction Car Price (II)

  Example 18.2, revised (Xm18-02b)
  $I_1$ = 1 if the color is white; 0 if the color is not white
  $I_2$ = 1 if the color is silver; 0 if the color is not silver
  The category "Other colors" is defined by $I_1 = 0$, $I_2 = 0$.

How Many Indicator Variables?

  Note: To represent the situation of three possible colors we need only two indicator variables.
  Conclusion: To represent a nominal variable with $m$ possible categories, we must create $m - 1$ indicator variables.

Nominal Independent Variables; Example: Auction Car Price

  Solution: the proposed model is
  $y = \beta_0 + \beta_1(\text{Odometer}) + \beta_2 I_1 + \beta_3 I_2 + \varepsilon$

  The data:

  Price    Odometer   I-1   I-2
  14636    37388       1     0     White car
  14122    44758       1     0     White car
  14016    45833       0     0     Other color
  15590    30862       0     0     Other color
  15568    31705       0     1     Silver color
  14718    34010       0     1     Silver color

Example: Auction Car Price; The Regression Equation

  From Excel (Xm18-02b) we get the regression equation
  PRICE = 16701 - .0555(Odometer) + 90.48(I-1) + 295.48(I-2)

  The equation for a silver color car:
  Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(1) = 16996.48 - .0555(Odometer)
  The equation for a white color car:
  Price = 16701 - .0555(Odometer) + 90.48(1) + 295.48(0) = 16791.48 - .0555(Odometer)
  The equation for an "other color" car:
  Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(0) = 16701 - .0555(Odometer)

[Figure: the three parallel price lines plotted against Odometer.]

Example: Auction Car Price; The Regression Equation

  From Excel we get the regression equation
  PRICE = 16701 - .0555(Odometer) + 90.48(I-1) + 295.48(I-2)

  For one additional mile the auction price decreases by 5.55 cents.
  A white car sells on the average for 90.48 dollars more than a car of the "Other color" category.
  A silver color car sells on the average for 295.48 dollars more than a …
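
The way the two indicator variables shift the intercept for each color can be checked numerically. The sketch below is only an illustration: the coefficients are copied from the regression equation quoted in the lecture, while the function names `indicators` and `predicted_price` and the 40,000-mile odometer reading are hypothetical choices made for this example, not part of the original material.

```python
# Coefficients taken from the regression equation quoted in the lecture:
# PRICE = 16701 - .0555(Odometer) + 90.48(I-1) + 295.48(I-2)
INTERCEPT = 16701.0
B_ODOMETER = -0.0555
B_WHITE = 90.48    # coefficient on I-1 (white)
B_SILVER = 295.48  # coefficient on I-2 (silver)


def indicators(color):
    """Return (I1, I2) for a color name (hypothetical helper).

    Any color other than white or silver falls in the baseline
    "Other colors" category, coded as (0, 0).
    """
    c = color.strip().lower()
    return (1 if c == "white" else 0, 1 if c == "silver" else 0)


def predicted_price(odometer, color):
    """Evaluate the fitted equation for one car (hypothetical helper)."""
    i1, i2 = indicators(color)
    return INTERCEPT + B_ODOMETER * odometer + B_WHITE * i1 + B_SILVER * i2


if __name__ == "__main__":
    # Illustrative odometer reading; not a value from the lecture data.
    miles = 40000
    for color in ("white", "silver", "other"):
        print(color, round(predicted_price(miles, color), 2))
```

Because the color terms only add a constant, the three fitted lines are parallel: at any odometer reading the gap between a white car and an "other color" car is the coefficient on I-1, and the gap for a silver car is the coefficient on I-2.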