Linear model

Comparison of estimation procedures for linear and non-linear outcome

I simulated a simple dataset from a cross-over trial comparing treatment to placebo (trt = 1 for treatment, 0 for placebo). The period variable indicates the ordering of the treatments. There are two outcomes of interest: alcohol consumption (Y, a continuous measure) and alcohol dependence (AD, a binary measure: 1 yes, 0 no). These hypothetical data are available at the course website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm

Linear model of Y as a function of period and treatment

1. First regress Y on period and treatment, ignoring the correlation in the data (i.e., ordinary least squares).

    reg Y period trt

          Source |       SS       df       MS              Number of obs =      30
    -------------+------------------------------           F(  2,    27) =    1.83
           Model |  52.0714286     2  26.0357143           Prob > F      =  0.1798
        Residual |  384.228571    27  14.2306878           R-squared     =  0.1193
    -------------+------------------------------           Adj R-squared =  0.0541
           Total |       436.3    29  15.0448276           Root MSE      =  3.7724

    ------------------------------------------------------------------------------
               Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          period |   2.571429   1.380542     1.86   0.073    -.2612092    5.404066
             trt |  -.4285714   1.380542    -0.31   0.759    -3.261209    2.404066
           _cons |   15.22857   1.220997    12.47   0.000     12.72329    17.73385
    ------------------------------------------------------------------------------

2. Now fit a marginal model using GEE where we specify an independence working correlation structure. Combined with the robust option, this provides a “robust” estimate of variance; the command below omits that option, so the standard errors it reports are the model-based ones.
    xtgee Y period trt, i(id) corr(ind)

    Iteration 1: tolerance = 1.477e-15

    GEE population-averaged model                   Number of obs      =        30
    Group variable:                         id      Number of groups   =        15
    Link:                             identity      Obs per group: min =         2
    Family:                           Gaussian                     avg =       2.0
    Correlation:                   independent                     max =         2
                                                    Wald chi2(2)       =      4.07
    Scale parameter:                  12.80762      Prob > chi2        =    0.1310

    Pearson chi2(30):                   384.23      Deviance           =    384.23
    Dispersion (Pearson):             12.80762      Dispersion         =  12.80762

    ------------------------------------------------------------------------------
               Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          period |   2.571429   1.309697     1.96   0.050     .0044697    5.138387
             trt |  -.4285714   1.309697    -0.33   0.743     -2.99553    2.138387
           _cons |   15.22857    1.15834    13.15   0.000     12.95827    17.49888
    ------------------------------------------------------------------------------

3. Fit another GEE model but assume an exchangeable correlation structure. In this case, since we only have two observations per person, this is just like estimating the correlation between the two observations and using this information in the model. More generally, when you have more than two observations per person over time, this correlation structure assumes that the “times” carry no meaning, i.e., that any two observations from the same person are equally correlated (exchangeable over time).

    xtgee Y period trt, i(id) corr(exch)

    Iteration 1: tolerance = 2.114e-15

    GEE population-averaged model                   Number of obs      =        30
    Group variable:                         id      Number of groups   =        15
    Link:                             identity      Obs per group: min =         2
    Family:                           Gaussian                     avg =       2.0
    Correlation:                  exchangeable                     max =         2
                                                    Wald chi2(2)       =      6.89
    Scale parameter:                  12.80762      Prob > chi2        =    0.0320

    ------------------------------------------------------------------------------
               Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          period |   2.571429   1.006357     2.56   0.011     .5990044    4.543853
             trt |  -.4285714   1.006357    -0.43   0.670    -2.400996    1.543853
           _cons |   15.22857   1.068604    14.25   0.000     13.13415      17.323
    ------------------------------------------------------------------------------

After running xtgee, you can obtain the estimate of the correlation using the “xtcorr” command. Below we see that the estimated within-subject correlation in Y (the correlation between the two observations on the same subject) is approximately 0.41.

    xtcorr

    Estimated within-id correlation matrix R:

              c1      c2
    r1    1.0000
    r2    0.4096  1.0000

4. Now fit a subject-specific random effects model. Here we use the xtreg command; you could also use xtmixed or gllamm. Sometimes the estimation is difficult and the xtreg and xtmixed commands will not provide you with a solution. In those cases, try gllamm, which uses a different estimation procedure and generally works in cases where xtreg and xtmixed don’t.

    xtreg Y period trt, re i(id)

    Random-effects GLS regression                   Number of obs      =        30
    Group variable (i): id                          Number of groups   =        15

    R-sq:  within  = 0.3146                         Obs per group: min =         2
           between = 0.0000                                        avg =       2.0
           overall = 0.1193                                        max =         2

    Random effects u_i ~ Gaussian                   Wald chi2(2)       =      5.97
    corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0506

    ------------------------------------------------------------------------------
               Y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          period |   2.571429   1.081001     2.38   ...
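
The standard errors printed in steps 1 to 3 are linked by simple arithmetic, and that link can be checked from the output alone. The sketch below is not part of the original Stata session: it uses Python purely as a calculator, the variable names are illustrative, and every number is copied from the printed tables. It relies on one assumption, stated in the comments: each subject is observed once in each period and once under each treatment, so period and trt vary only within subject (the between R-sq of 0.0000 in the xtreg output is consistent with this).

```python
import math

# Values copied from the printed Stata output (steps 1 to 3 and xtcorr).
n_obs = 30                # Number of obs
n_params = 3              # period, trt, _cons
rss = 384.228571          # residual sum of squares from reg
se_ols = 1.380542         # step 1: Std. Err. of period and trt (OLS)
se_gee_ind = 1.309697     # step 2: Std. Err. under corr(ind)
se_gee_exch = 1.006357    # step 3: Std. Err. under corr(exch)
rho_printed = 0.4096      # xtcorr: within-id correlation

# The GEE scale parameter divides the residual sum of squares by n,
# whereas the OLS residual mean square divides it by n - p.
scale_check = rss / n_obs

# So the independence-GEE standard error is the OLS standard error
# rescaled by sqrt((n - p) / n).
se_ind_check = se_ols * math.sqrt((n_obs - n_params) / n_obs)

# Assumption: period and trt vary only within subject. In that case the
# exchangeable working correlation shrinks the variance of their
# coefficients by the factor (1 - rho).
se_exch_check = se_gee_ind * math.sqrt(1 - rho_printed)

# Running the same relation backwards recovers the correlation from the
# two sets of standard errors.
rho_implied = 1 - (se_gee_exch / se_gee_ind) ** 2

print(f"scale parameter check: {scale_check:.5f}")
print(f"corr(ind) SE check:    {se_ind_check:.6f}")
print(f"corr(exch) SE check:   {se_exch_check:.6f}")
print(f"implied correlation:   {rho_implied:.4f}")
```

The computed values agree with the printed ones up to rounding of the inputs. This illustrates why the treatment and period effects look more precise once the within-subject correlation is modeled: in a cross-over design both effects are within-subject contrasts, so a positive correlation between a subject's two measurements reduces their variance.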