Dec. 14, 2006    ECON 240A    Final    L. Phillips

Answer all five questions. They are weighted equally.

1. (30) For a regression using ordinary least squares (OLS), $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + e$, we make certain assumptions about the properties of the error term $e$.

a. List five assumptions about $e$.
   i. $E(e) = 0$: the expected value of the error equals zero
   ii. $\mathrm{Cov}(x, e) = 0$: the error and the explanatory variable are independent
   iii. $\mathrm{Cov}(e_j, e_k) = 0,\ j \neq k$: the errors are independent
   iv. $\mathrm{Var}(e_j) = \sigma^2$ for all $j$: the errors are homoskedastic
   v. $e \sim N(0, \sigma^2)$: the error is normally distributed

b. For one-way analysis of variance using regression of a quantitative variable against binary dummy explanatory variables (zero/one), we used one of these assumptions to interpret the meaning of the regression coefficients $b_0$, $b_1$, etc. Which assumption did we use?
   Answer: $E(e) = 0$

c. Which assumption is frequently violated in time series regressions?
   Answer: $\mathrm{Cov}(e_j, e_k) = 0,\ j \neq k$

d. Explain the difference between homoskedasticity and heteroskedasticity.
   Answer: Homoskedastic errors all have the same variance; with heteroskedastic errors, the variance varies across observations.

e. One can obtain estimates of the OLS parameters by minimizing the sum of squared residuals with respect to each regression parameter, without making any assumptions about the error term $e$. So why are these assumptions about the error term important?
   Answer: The properties of the parameter estimates (for example, as maximum likelihood estimators) depend on these assumptions, as do hypothesis tests using Student's t-distribution and the calculation of confidence intervals.

2. (30) The number of days spent recovering from a heart attack was studied for a random sample of 300 patients in the US. The duration of days recovering was used to calculate the Kaplan-Meier estimates of (1) the hazard function, (2) the cumulative hazard function, and (3) the survivor function, as displayed in Table 2.1. These Kaplan-Meier estimates for the hazard rate and the cumulative hazard rate are plotted in Figures 2.1 and 2.2.

Table 2.1: Kaplan-Meier Estimates of Days Recovering from
a Heart Attack, US

US days  ending  at risk  interval hazard rate  cumulative   ratio (at risk - ending)  survivor
                          (ending / at risk)    hazard rate  / at risk                 function
8        1       300      0.0033                0.0033       0.997                     0.997
9        1       299      0.0033                0.0066       0.997                     0.994
12       4       298      0.0134                0.0201       0.987                     0.980
13       1       294      0.0034                0.0235       0.997                     0.977
14       4       293      0.0137                0.0371       0.986                     0.964
15       10      289      0.0346                0.0717       0.965                     0.930
16       3       279      0.0108                0.0825       0.989                     0.920
17       8       276      0.0290                0.1115       0.971                     0.894
18       8       268      0.0299                0.1413       0.970                     0.867
19       13      260      0.0500                0.1913       0.950                     0.824
20       12      247      0.0486                0.2399       0.951                     0.784
21       11      235      0.0468                0.2867       0.953                     0.747
22       14      224      0.0625                0.3492       0.938                     0.700
23       13      210      0.0619                0.4111       0.938                     0.657
24       9       197      0.0457                0.4568       0.954                     0.627
25       16      188      0.0851                0.5419       0.915                     0.574
26       19      172      0.1105                0.6524       0.890                     0.510
27       11      153      0.0719                0.7243       0.928                     0.473
28       15      142      0.1056                0.8299       0.894                     0.423
29       16      127      0.1260                0.9559       0.874                     0.370
30       11      111      0.0991                1.0550       0.901                     0.333
31       11      100      0.1100                1.1650       0.890                     0.297
32       12      89       0.1348                1.2998       0.865                     0.257
33       13      77       0.1688                1.4686       0.831                     0.213
34       15      64       0.2344                1.7030       0.766                     0.163
35       6       49       0.1224                1.8255       0.878                     0.143
36       10      43       0.2326                2.0580       0.767                     0.110
37       11      33       0.3333                2.3914       0.667                     0.073
38       6       22       0.2727                2.6641       0.727                     0.053
39       5       16       0.3125                2.9766       0.688                     0.037
40       2       11       0.1818                3.1584       0.818                     0.030
41       1       9        0.1111                3.2695       0.889                     0.027
42       1       8        0.1250                3.3945       0.875                     0.023
43       4       7        0.5714                3.9659       0.429                     0.010
44       1       3        0.3333                4.2993       0.667                     0.007
47       2       2        1.0000                5.2993       0.000                     0.000

The exponential distribution is often used for duration studies. The density function $f(t)$ for the exponential is $f(t) = \lambda e^{-\lambda t}$, where the reciprocal of $\lambda$ would be the mean recovery time. The cumulative distribution function $F(t)$, or probability the recovery time lasts up to time $t$, is $F(t) = 1 - e^{-\lambda t}$. The survivor function $S(t)$, i.e. the probability that recovery time is longer than $t$, is $S(t) = 1 - F(t) = e^{-\lambda t}$. The hazard rate $h(t)$, or conditional probability of a recovering heart attack patient returning to work after recovering for $t$ days, is
$h(t) = f(t)/S(t)$, which for the exponential is $h(t) = \lambda$. The cumulative hazard function is $H(t) = \int_0^t h(u)\,du$, and for the exponential it is a linear function of recovery time, $H(t) = \lambda t$.

a. From Table 2.1 and Figure 2.1, is the conditional probability of a heart attack patient returning to work, given they have been recovering for $t$ days, constant, decreasing, or increasing?
   Answer: increasing

b. From Table 2.1 and Figure 2.2, does the cumulative hazard function look like it is a linear function of recovery time?
   Answer: No; it looks like an exponential or power function.

c. Does the exponential appear to be the function to use to fit this US recovery time data?
   Answer: No; $h(t)$ is not constant and $H(t)$ is not linear.

d. If you use the estimated slope from the linear fit of the cumulative hazard function in Figure 2.2 to estimate the mean recovery time in the US, what value do you get, rounded to the nearest day?
   Answer: $1/0.1223 \approx 8$ days

e. From Table 2.1, does this estimate in part d of the average number of recovery days make any sense? Is it too high, too low, or just right?
   Answer: No, it does not make sense; it is too low.

3. (30) The number of days spent recovering from a heart attack before returning to work was also collected for a random sample of 300 patients in Canada. The author of the 7th edition of the text (also in the 5th and 6th editions) analyses the data and asks the question, "Can we conclude that recovery is faster in the United States?" He proceeds by using an equal-variances t-test of the differences in the means, using the following data. You can round to the second decimal place.

Table 3 …
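As a check on Question 2, the columns of Table 2.1 and the part d estimate can be reproduced with a short calculation. The following is a minimal Python sketch, not the method used to build the exam's table: the function name `kaplan_meier` and the variable names are hypothetical, the data are copied from the "US days" and "ending" columns, and the sketch assumes there is no censoring (patients leave the risk set only by returning to work), which is consistent with the "at risk" column. The slope 0.1223 is taken from the answer to part d.

```python
# Data copied from Table 2.1 (US sample of 300 patients).
days = [8, 9] + list(range(12, 45)) + [47]
ending = [1, 1, 4, 1, 4, 10, 3, 8, 8, 13, 12, 11, 14, 13, 9, 16, 19, 11,
          15, 16, 11, 11, 12, 13, 15, 6, 10, 11, 6, 5, 2, 1, 1, 4, 1, 2]


def kaplan_meier(ending_counts):
    """Return (at_risk, hazard, cumulative_hazard, survivor) per interval.

    Assumption: no censoring, so the risk set shrinks only by the
    number of patients ending their recovery in each interval.
    """
    at_risk = sum(ending_counts)
    cumulative_hazard = 0.0
    survivor = 1.0
    rows = []
    for count in ending_counts:
        hazard = count / at_risk        # interval hazard rate: ending / at risk
        cumulative_hazard += hazard     # running sum of interval hazard rates
        survivor *= 1.0 - hazard        # running product of (at risk - ending) / at risk
        rows.append((at_risk, hazard, cumulative_hazard, survivor))
        at_risk -= count
    return rows


rows = kaplan_meier(ending)

# Part d: under an exponential model H(t) = lambda * t, so the fitted slope
# estimates lambda and its reciprocal estimates the mean recovery time.
slope = 0.1223  # value quoted in the answer to part d
exponential_mean = 1.0 / slope

# Part e: sample mean of recovery days computed directly from the table.
sample_mean = sum(d * n for d, n in zip(days, ending)) / sum(ending)

if __name__ == "__main__":
    for day, (at_risk, hazard, cum_hazard, survivor) in zip(days, rows):
        print(f"{day:3d} {at_risk:4d} {hazard:.4f} {cum_hazard:.4f} {survivor:.3f}")
    print(f"exponential mean estimate: {exponential_mean:.1f} days")
    print(f"sample mean from table:    {sample_mean:.1f} days")
```

Comparing `exponential_mean` with `sample_mean` illustrates the answer to part e: the first recorded return to work in Table 2.1 is at day 8, so a mean of about 8 days is too low.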