MUSC BMTRY 701 - lecture18

School name Medical University of South Carolina

Course BMTRY 701 - Biostatistical Methods II

Pages 48


Slide titles:
Topics to be covered
Review: Purpose of empirical models
Association Studies
Prediction Studies
Review: Designs for observational studies
Review: Example
Very Important Observation
Randomization, Stratification and Matching
Multiple Logistic Regression
The Model
Output from a typical regression package Example: A two variable model for typical output
Example: Two variable model and typical SAS output
Estimation and Interpretation of Parameters
Confounding and Interaction
Effect Modification: Example1
Effect Modification: Example2
Stratification: Example data in SAS
Stratified Logistic regression in SAS: Pooled
Stratified Logistic regression in SAS: stratified
Confounding
Confounding: Important facts
SAS code for Logistic regression on STD and condom use
SAS output for Logistic regression
Determining Confounding
Effect of Omitted Variables-I (hypothetical data)
Effect of Omitted Variables-II
Effect of Omitted Variables-III
Effect of Omitted Variables-IV
Effect of Omitted Variables-V
Effect of Omitted Variables-VI
Strategies for Model Building-I
Strategies for Model Building-II
Strategies for Model Building-IIIa
Strategies for Model Building-IIIb
Strategies for Model Building-IV
Data Example for stepwise, forward and backward methods
SAS code for stepwise, forward and backward methods
Summary of the stepwise method
Model fitting strategies: Example
Summary of the stepwise method

Lecture 18: Multiple Logistic Regression
Mulugeta Gebregziabher, Ph.D.
BMTRY 701/755: Biostatistical Methods II, Spring 2007
Department of Biostatistics, Bioinformatics and Epidemiology
Medical University of South Carolina

Topics to be covered
• Review
  1. Purpose of empirical models: Association vs Prediction
  2. Design of observational studies: cross-sectional, prospective, case-control
  3. Randomization, Stratification and Matching
• Multiple logistic regression
  1. The model
  2. Estimation and Interpretation of Parameters
  3. Confounding and Interaction
  4. Effects of omitted variables
  5. Model Fitting Strategies
  6. Goodness of Fit and Model Diagnostics
• Matching (group and individual)
• Conditional vs Unconditional analysis
• Methods III: Advanced Regression Methods

Review: Purpose of empirical models
Empirical models are models fitted to provide succinct descriptions of relationships observed in data. They can take different forms; here we focus on regression models, which have wide applicability.
• They are data-driven models that provide a range of possible relationships between variables, often specified by mathematical convenience and a preference for simplicity.
• If the model fits well, inferences are possible about the nature of relationships between variables in the ranges where they are observed (NO extrapolation).
• Examples: association studies in epidemiology and prediction studies in clinical or policy-making research.

Association Studies
• Interest centers on which variables (variables of interest and adjustment variables) are in the model and on the size and sign of their coefficients.
• The predicted value for each observation, or the model fit, is not of interest per se.

Example 1. After adjusting for appropriate covariates, is broccoli intake associated with colorectal adenomatous polyps?
$\operatorname{logit}(\Pr(\text{polyps})) = \beta_0 + \beta_1\,\text{energy intake} + \dots + \beta_k\,\text{broccoli intake}$

Example 2. After adjusting for age, is heart disease (HD) associated with hypertension?
$\operatorname{logit}(\Pr(\text{HD})) = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{hypertension}$
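As a reminder of how a model like Example 2 is usually read, the short derivation below spells out the logit link and what the hypertension coefficient means. It assumes, for illustration only, that hypertension is coded 1 for present and 0 for absent; the slide does not state the coding.

```latex
\[
\operatorname{logit}(p) = \log\frac{p}{1-p},
\qquad p = \Pr(\text{HD} = 1 \mid \text{Age}, \text{hypertension})
\]
% Difference in log-odds between hypertension = 1 and hypertension = 0 at the same Age:
\[
\bigl(\beta_0 + \beta_1\,\text{Age} + \beta_2 \cdot 1\bigr)
- \bigl(\beta_0 + \beta_1\,\text{Age} + \beta_2 \cdot 0\bigr) = \beta_2
\]
% Exponentiating gives the age-adjusted odds ratio:
\[
\text{OR}_{\text{hypertension}} = e^{\beta_2}
\]
```

In an association study, this adjusted odds ratio (its size and sign on the log scale) is the quantity of interest, rather than the fitted probabilities themselves.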

Prediction Studies
• Interest centers on accurately estimating or predicting the response for a given combination of predictors.
• The focus is less on which predictor variables make this possible or on what their coefficients are (model fit is important).

Example 1. A multiple logistic regression model for diabetes screening (Tabaei and Herman (2002), Diabetes Care, 25, 1999-2003)
$\operatorname{logit}(\Pr(\text{Diabetes})) = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Plasma glucose} + \beta_3\,\text{Postprandial time} + \beta_4\,\text{Female} + \beta_5\,\text{BMI}$
Estimates: $\hat\beta_0 = -10.038$, $\hat\beta_1 = 0.033$, $\hat\beta_2 = 0.031$, $\hat\beta_3 = 0.250$, $\hat\beta_4 = 0.562$, $\hat\beta_5 = 0.035$
They used a cutoff of 20% to predict previously undiagnosed diabetes, with sensitivity = 65% and specificity = 96%.

Review: Designs for observational studies
We discuss three important designs whose analysis makes heavy use of logistic regression.
Define X to denote an exposure or treatment and D to be an outcome indicator (disease, death, etc.).
Example: For a binary X and D,

CROSS-SECTIONAL DESIGN: randomly select n from a population of N records
            D=1    D=0    Total
X=1         n11    n10    n1.
X=0         n01    n00    n0.
Total       n.1    n.0    n (fixed)

PROSPECTIVE DESIGN: randomly select n1. from N1 with X = 1 and n0. from N0 with X = 0
            D=1    D=0    Total
X=1         n11    n10    n1. (fixed)
X=0         n01    n00    n0. (fixed)
Total       n.1    n.0    n

CASE-CONTROL DESIGN: randomly select n.1 from N1 cases and n.0 from N0 controls
            D=1            D=0            Total
X=1         n11            n10            n1.
X=0         n01            n00            n0.
Total       n.1 (fixed)    n.0 (fixed)    n
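Returning to the diabetes screening model on the Prediction Studies slide, the fitted equation can be turned into a predicted probability and compared with the 20% cutoff. The sketch below is only an illustration: the coefficients are the estimates quoted above, but the units (age in years, plasma glucose in mg/dl, postprandial time in hours, BMI in kg/m^2), the 0/1 coding of Female, and the two example individuals are assumptions that do not come from the slide.

```python
import math

# Coefficient estimates as quoted on the slide (Tabaei and Herman, 2002).
COEF = {
    "intercept": -10.038,
    "age": 0.033,
    "glucose": 0.031,
    "postprandial": 0.250,
    "female": 0.562,
    "bmi": 0.035,
}

CUTOFF = 0.20  # screening cutoff quoted on the slide


def diabetes_probability(age, glucose, postprandial, female, bmi):
    """Predicted probability from the logistic model.

    Units and coding are assumptions for illustration: age in years,
    glucose in mg/dl, postprandial time in hours, female coded 1/0,
    BMI in kg/m^2.
    """
    eta = (
        COEF["intercept"]
        + COEF["age"] * age
        + COEF["glucose"] * glucose
        + COEF["postprandial"] * postprandial
        + COEF["female"] * female
        + COEF["bmi"] * bmi
    )
    # Inverse logit: p = 1 / (1 + exp(-eta))
    return 1.0 / (1.0 + math.exp(-eta))


def screen_positive(p, cutoff=CUTOFF):
    """Flag a person when the predicted probability reaches the cutoff."""
    return p >= cutoff


# Two hypothetical individuals (made-up values, not from the paper).
p_low = diabetes_probability(age=55, glucose=140, postprandial=2, female=1, bmi=30)
p_high = diabetes_probability(age=55, glucose=200, postprandial=2, female=1, bmi=30)

print(f"glucose 140: p = {p_low:.3f}, flagged = {screen_positive(p_low)}")
print(f"glucose 200: p = {p_high:.3f}, flagged = {screen_positive(p_high)}")
```

This reflects the prediction-study emphasis: the output of interest is the predicted probability and the resulting classification, not any single coefficient.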

Review: Example
Consider a hypothetical study of the association between maternal age and birth weight using data from 1000 hospital delivery records.
We can use any of the three designs discussed above.
Let X = I(maternal age <= 20 yrs) and D = I(birth weight <= 2500 g), where I is an indicator function.

CROSS-SECTIONAL DESIGN: randomly select 200 from the 1000 records
            D=1    D=0    Total
X=1          10     40     50
X=0          15    135    150
Total        25    175    200

PROSPECTIVE DESIGN: randomly select 100 pregnant women with age <= 20 and 100 with age > 20
            D=1    D=0    Total
X=1          20     80    100
X=0          10     90    100
Total        30    170    200

CASE-CONTROL DESIGN: randomly select 100 infants with birth weight <= 2500 g and 100 with birth weight > 2500 g
            D=1    D=0    Total
X=1          40     23     63
X=0          60     77    137
Total       100    100    200

Very Important Observation
We can measure the association between X and D using the ratio of proportions
$PR = \dfrac{\Pr(D = 1 \mid X = 1)}{\Pr(D = 1 \mid X = 0)}$
or using the ratio of odds
$OR = \dfrac{\Pr(D = 1 \mid X = 1)/\Pr(D = 0 \mid X = 1)}{\Pr(D = 1 \mid X = 0)/\Pr(D = 0 \mid X = 0)} = \dfrac{n_{11}\, n_{00}}{n_{01}\, n_{10}}$

Measures of Association
Design    Pr(D=1|X=1)    Pr(D=1|X=0)    Pr(X=1|D=1)    Pr(X=1|D=0)    PR
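The quantities named in the header above can be computed directly from the three example tables. The following is a minimal sketch in Python, not part of the original lecture; the function and variable names are illustrative. The proportions are computed mechanically from the cell counts. In the case-control table the numbers of cases and controls are fixed by design, so the row proportions there do not estimate Pr(D=1|X) in the population.

```python
# Cell counts from the maternal age / birth weight example tables.
# Each entry is (n11, n10, n01, n00): rows X=1 then X=0, columns D=1 then D=0.
TABLES = {
    "cross-sectional": (10, 40, 15, 135),
    "prospective": (20, 80, 10, 90),
    "case-control": (40, 23, 60, 77),
}


def measures(n11, n10, n01, n00):
    """Return the conditional proportions, PR and OR for one 2x2 table."""
    p_d_given_x1 = n11 / (n11 + n10)  # Pr(D=1|X=1), row proportion
    p_d_given_x0 = n01 / (n01 + n00)  # Pr(D=1|X=0), row proportion
    p_x_given_d1 = n11 / (n11 + n01)  # Pr(X=1|D=1), column proportion
    p_x_given_d0 = n10 / (n10 + n00)  # Pr(X=1|D=0), column proportion
    return {
        "Pr(D=1|X=1)": p_d_given_x1,
        "Pr(D=1|X=0)": p_d_given_x0,
        "Pr(X=1|D=1)": p_x_given_d1,
        "Pr(X=1|D=0)": p_x_given_d0,
        "PR": p_d_given_x1 / p_d_given_x0,
        "OR": (n11 * n00) / (n01 * n10),  # cross-product ratio
    }


for design, cells in TABLES.items():
    m = measures(*cells)
    summary = ", ".join(f"{name} = {value:.3f}" for name, value in m.items())
    print(f"{design}: {summary}")
```

With these counts the odds ratio is 2.25 for the cross-sectional and prospective tables and about 2.23 for the case-control table, whereas the ratio of proportions is 2.0 for the first two tables and about 1.45 for the third.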
