Residual plot
- Structureless noise is good
- If there is a pattern, that’s bad
- A funnel shape is a problem as well
- Tells you how appropriate the linear fit is

Outliers have a disproportionate impact on regression lines. Regression lines are very sensitive to outliers.

Basic cautions:
Correlation is for two-way relationships. Regression is for one way (predicting y from x).
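To see how sensitive a regression line is to outliers, here is a minimal sketch in plain Python. The data points and the helper names `fit_line` and `residuals` are made up for illustration; they are not from the lecture.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Observed y minus predicted y for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data lying roughly on y = 2x (illustration only).
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
slope_clean, intercept_clean = fit_line(xs, ys)

# Add one made-up outlier far from the pattern and refit.
xs_out = xs + [7]
ys_out = ys + [2.0]
slope_out, intercept_out = fit_line(xs_out, ys_out)

print("slope without outlier:", round(slope_clean, 3))
print("slope with outlier:   ", round(slope_out, 3))

# Residuals from a least-squares fit always sum to about zero, so the sum
# says nothing; what matters is whether they show a pattern against x.
res = residuals(xs, ys, slope_clean, intercept_clean)
print("residuals:", [round(r, 2) for r in res])
```

With these made-up numbers, a single outlier pulls the slope from about 1.98 down to 0.70. Plotting `res` against `xs` gives the residual plot: random scatter around zero is what you want, while a curve or a funnel suggests the line is not appropriate.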
Extrapolation:
- Plugging in x values outside the range of the data
  o 24 beers = BAC of 0.4184, aka dead
  o The linear relationship may not be valid (or accurate) that far from the data

Lurking variables
- Variables that you haven’t measured that are influencing your relationships
- Example: Red wine = better levels of overall health
  o Lurking variables: income, other lifestyle factors (going to the gym), etc.

Causation
- BAC: weight, gender, etc.?
- Association is not causation

Examining relationships between categorical variables [before was quantitative variables]
Mainly done through a table
- Row variable
- Column variable

Marginal distributions
- Column/row totals
- Distribution of individual variables [row and column]

Conditional distributions
- Relationships between categorical variables
- Tables, bar graphs, etc. are key for this
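A minimal sketch of marginal and conditional distributions from a two-way table, in plain Python. The counts and the row/column labels are made up for illustration and are not from the lecture.

```python
# Made-up two-way table of counts (illustration only):
# outer keys = row variable, inner keys = column variable.
counts = {
    "row A": {"col X": 30, "col Y": 20},
    "row B": {"col X": 10, "col Y": 40},
}

col_names = list(next(iter(counts.values())))
grand_total = sum(sum(cols.values()) for cols in counts.values())

# Row and column totals (the "margins" of the table).
row_totals = {r: sum(cols.values()) for r, cols in counts.items()}
col_totals = {c: sum(counts[r][c] for r in counts) for c in col_names}

# Marginal distributions: each total as a share of the grand total.
row_marginal = {r: t / grand_total for r, t in row_totals.items()}
col_marginal = {c: t / grand_total for c, t in col_totals.items()}

# Conditional distribution of the column variable within each row:
# each cell as a share of its own row total.
conditional = {
    r: {c: counts[r][c] / row_totals[r] for c in col_names}
    for r in counts
}

print("row marginal:", row_marginal)
print("column marginal:", col_marginal)
for r, dist in conditional.items():
    print("given", r, "->", dist)
```

If the conditional distributions differ from row to row (as they do with these made-up counts), that points to a relationship between the two categorical variables. The marginal distributions alone cannot show this, since they describe each variable separately.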