MUSC BMTRY 701 - lect12


Lecture 12
Model Building
BMTRY 701
Biostatistical Methods II

Outline
  • The real world regression
  • The “final model”
  • Principle of Parsimony
  • General Process of Model Building
  • Exploratory Data Analysis
  • SENIC example
  • Work through the exploration…
  • Next step: Pick an initial model
  • Check model assumptions
  • Does it fit?
  • Last step
  • Other model building issues: Stepwise approaches
  • Is stepwise ever a good idea?
  • Stepwise Approaches
  • Other model building issues: R2
  • Other model building issues: Information Criteria
  • A step further
  • Wednesday: Diagnostics in MLR

The real world regression

Datasets will have a large number of covariates!
There will be a number of covariates to consider for inclusion in the model
The inclusion/exclusion of covariates
  • will not always be obvious
  • will be affected by multicollinearity
  • will depend on the questions of interest
  • will depend on the scientific ‘precedents’ in that area
The model building process is important for determining a “final model”

The “final model”

At the end of the analytic process, there is generally one model from which you make inferences
  • it usually is a multiple regression model
  • it is not logical to make inferences based on more than one model
Recall the ‘principle of parsimony’

Principle of Parsimony

Also known as Occam’s Razor
The principle states that the explanation of any phenomenon should make as few assumptions as possible, eliminating those that make no difference in the observable predictions of the explanatory hypothesis or theory. The principle recommends selecting the hypothesis that introduces the fewest assumptions and postulates the fewest entities.
Translation for regression:
  • the fewest possible covariates that explain the greatest variance is best!
  • the addition of each covariate should be weighed against the increase in complexity of the model.

General Process of Model Building

1. Exploratory data analysis
2. Choose initial model
3. Fit model
4. Check model assumptions
5. Repeat 2-4 as needed
6. Interpret findings

Exploratory Data Analysis

Consider the covariates and the outcome variables
  • look at each covariate and outcome
      - what forms do they take?
      - might transformations need to be made?
  • look at relationships between Y and each X
      - are the relationships linear?
      - what form should a covariate take to enter the model (e.g. categorical? spline? quadratic?)
  • look at the relationships between the X’s
      - is there strong correlation between some covariates?

Exploratory Data Analysis (continued)

Individual variable analysis
  • histograms
  • boxplots
  • dotplots (by categories?)
Two-way associations
  • scatterplots
  • color-coded by third variable?
  • SIMPLE LINEAR REGRESSIONS
For categorical variables
  • tables
  • color code other graphical displays

SENIC example

We need a scientific question/hypothesis!!
Examples:
  • What factors are predictive of length of stay?
  • Is the number of beds strongly related to length of stay?
  • Is there a difference in length of stay by region?
  • How do infection risk and number of cultures relate to length of stay?
  • Is it possible to reduce the length of stay by reducing infection risk and number of cultures?

Work through the exploration…

Next step: Pick an initial model

Use the information that you learned in the exploratory step
Some guidelines
  • covariates not associated with the outcome in SLR models will probably not be associated in MLR models
  • choose a threshold (alpha = 0.10 or 0.20): covariates with an SLR p-value below it are included in the initial MLR
Recall multicollinearity
  • might want to spend some extra time learning about the interrelationships between two variables and the outcome.

Next step: Pick an initial model (continued)

Many approaches to the initial model
My approach: start big, and then pare down
  • initial model includes all of the covariates and potentially their interactions
  • fit model with all of the covariates of interest
  • remove ONE AT A TIME based on insignificant p-values and model coefficients
      - find the most insignificant covariate
      - refit the model without it
      - look at model:
          • what happened to other coefficients?
          • what happened to R2?
  • not hard-and-fast rules!

SENIC

What is an appropriate initial model?
Are there any interactions to consider?
Work through the model…

Check model assumptions

Based on a reasonable model (in terms of ‘significance’ of covariates), check the assumptions
  • Residual plots
  • Other diagnostics
Recall your assumptions:
  • independence of errors
  • homoscedasticity/constant variance
  • normality of errors

Does it fit?

If so… go to next step
If not, deal with misspecifications
  • transform Y?
  • another type of regression?!
  • transform X?
  • consider more exploration (e.g., smoothers to inform about relationships)
  • outlier problems?
  • Then, refit all over again…

Last step

Interpret results
Oddly, this step often leads you back to refitting
Sometimes trying to summarize results causes you to think of additional modeling considerations
  • adding another variable
  • using a different parameterization
  • using a different reference level for a categorical variable

SENIC

What is the final model?
How to present it?

Other model building issues: Stepwise approaches

“Stepwise” approaches are computer driven
  • you give the computer a set of covariates and it finds an ‘optimal’ model
  • “forward” and “backward”
Problems:
  • models are only ‘stepwise’ optimal
  • ignore magnitude of β and simply focus on p-value!
  • you need to set criteria for optimality which are not always obvious
  • gives you no ability to give different variables different priorities
  • can have problematic interpretations: e.g. a main effect is removed, but the interaction is included.
  • stepwise forward and backward can give different models.

Is stepwise ever a good idea?

If you have a very large set of predictors that are somewhat ‘interchangeable’
Example: gene expression microarrays
  • you may have >10000 genes to select from
  • automated procedures can find an optimal set that describes a large amount of variation in the outcome of interest (e.g. cancer vs. no cancer)
  • it would be physically impossible to use manual model-fitting
  • Specialized software for this (standard ‘lm’ type approach will not work).

Stepwise Approaches

I don’t condone it, but in R: step(reg)

Other model building issues: R2

Some people use the increase in R2 as a criterion for inclusion/exclusion of a covariate
Not that common in
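
The “start big, and then pare down” approach from the initial-model slides can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the SENIC data are not available here, so the built-in mtcars data set stands in for them, and the outcome (mpg), the covariate list, and the cut-off alpha = 0.10 are hypothetical choices, not values from the lecture.

```r
# Stand-in data: R's built-in mtcars, NOT the SENIC data used in the lecture.
# The outcome (mpg) and the covariate list are hypothetical choices.
dat   <- mtcars
vars  <- c("wt", "hp", "disp", "drat", "qsec")
alpha <- 0.10   # assumed cut-off for "insignificant"; not a hard-and-fast rule

# Start big: fit the model with all of the covariates of interest.
fit <- lm(reformulate(vars, response = "mpg"), data = dat)

repeat {
  # p-values of the covariates currently in the model (intercept excluded)
  pvals <- summary(fit)$coefficients[vars, "Pr(>|t|)"]
  names(pvals) <- vars
  if (length(vars) == 1 || max(pvals) < alpha) break

  # Find the most insignificant covariate and refit the model without it.
  worst   <- vars[which.max(pvals)]
  vars    <- setdiff(vars, worst)
  new_fit <- lm(reformulate(vars, response = "mpg"), data = dat)

  # Look at the model: what happened to the other coefficients and to R^2?
  cat(sprintf("Dropped %s (p = %.3f); R^2 %.3f -> %.3f\n",
              worst, max(pvals),
              summary(fit)$r.squared, summary(new_fit)$r.squared))
  print(round(cbind(before = coef(fit)[vars], after = coef(new_fit)[vars]), 3))

  fit <- new_fit
}

summary(fit)
```

The loop removes one covariate at a time and prints the coefficient changes so they can be inspected by eye, since a large shift in a remaining coefficient may point to multicollinearity. Before interpreting the result, the usual residual diagnostics (for example plot(fit)) would still be needed to check the model assumptions.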
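
For comparison, the automated route mentioned on the stepwise slide, step(reg), might look like the sketch below. Again mtcars is an assumed stand-in for SENIC and the covariates are hypothetical. Note that R's step() selects on AIC by default rather than on p-values, and the helper fit_stats is introduced here only for illustration.

```r
# Stand-in data and hypothetical covariates (not the SENIC data from the lecture).
dat          <- mtcars
full_formula <- mpg ~ wt + hp + disp + drat + qsec

reg  <- lm(full_formula, data = dat)   # full model, as in step(reg)
null <- lm(mpg ~ 1, data = dat)        # intercept-only starting point

# Backward elimination from the full model (step() uses AIC by default).
backward <- step(reg, direction = "backward", trace = 0)

# Forward selection from the null model, allowed to grow up to the full model.
forward <- step(null, scope = full_formula, direction = "forward", trace = 0)

# Illustrative helper: size, R^2, adjusted R^2 and AIC of a fitted model.
fit_stats <- function(m) {
  s <- summary(m)
  c(n_covariates = length(coef(m)) - 1,
    r2           = s$r.squared,
    adj_r2       = s$adj.r.squared,
    aic          = AIC(m))
}

print(formula(backward))
print(formula(forward))
print(round(rbind(full     = fit_stats(reg),
                  backward = fit_stats(backward),
                  forward  = fit_stats(forward)), 3))
```

The two searches need not agree, which is one of the problems listed above. The table also shows why a raw increase in R2 is a weak inclusion criterion: R2 never decreases when a covariate is added, so the full model always has the highest value, whereas adjusted R2 and AIC penalize the extra complexity.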

