Slide 1
Slide 2
Slide 3
Slide 4
Visualizing Linear Regression
Visualizing Regression Analysis - 1
Visualizing Regression Analysis - 2
Visualizing Regression Analysis - 3
Visualizing Regression Analysis - 4
Visualizing Regression Analysis - 5
Visualizing Regression Analysis - 6
Visualizing Regression Analysis - 7
Visualizing Regression Analysis - 8
Visualizing Regression Analysis - 9
Visualizing Regression Analysis - 10
Visualizing Regression Analysis - 11
Visualizing Regression Analysis - 12
Visualizing Regression Analysis - 13
Visualizing Regression Analysis - 14
Visualizing Regression Analysis - 15
Visualizing Regression Analysis - 16
Examples of Residual Plots
Slide 23
Slide 24
Slide 25
Slide 26

Slide 1 (01/13/19)
• Linear regression provides additional statistical information about the relationship between two quantitative variables.
• The coefficient of determination, R², indicates the percentage of variance in the dependent variable that is accounted for by variability in the independent variable.
• The regression equation is the formula for the trend or fit line, which enables us to predict the dependent variable for any given value of the independent variable.
• The regression equation has two parts: the intercept and the slope.
• The intercept is the point where the regression line crosses the vertical axis. It generally does not provide useful information.

Slide 2
• The slope is the change in the dependent variable for a one-unit change in the independent variable. The slope tells us the direction and magnitude of change.
• The regression line represents the predicted value of the dependent variable for each value of the independent variable.
• The differences between the predicted values and the actual values of the dependent variable are called residuals. Residuals are the errors that we cannot predict.
• Residuals provide us with an important diagnostic tool for determining whether linear regression is an appropriate statistical technique for analyzing the relationship between two quantitative variables.

Slide 3
• Linear regression requires us to satisfy three assumptions about the distributions of the two quantitative variables:
    • No outliers
    • A linear relationship between the variables
    • Equal variance of the residuals across predicted values
• The evaluation of the conformity of the analysis to these assumptions is generally based upon visual analysis of two plots: the scatterplot of the dependent variable by the independent variable, and the "residual plot", a scatterplot of the residuals on the vertical axis by the predicted values on the horizontal axis.
• Numeric results are also available to evaluate each of these assumptions.

Slide 4
• If we do not satisfy the assumptions, we can:
    • Report the results, noting the limitations produced by violation of the assumptions
    • Report the results, ignoring the violations of assumptions, using the argument of robustness to violations
    • Re-express one or both variables
    • Omit the outliers
    • Dichotomize the independent variable, splitting the values at the mean, median, or some other logical value
• Simple linear regression refers to analysis with one independent variable.
• Multiple regression refers to analysis with more than one independent variable.

Slide 5: Visualizing Linear Regression
SW388R6 Data Analysis and Computers I

Slide 6: Visualizing Regression Analysis - 1
• While we will base our problem solving on numeric statistical results computed by SPSS, we can use a scatterplot to demonstrate regression graphically.
• We will use the variable "highest year of school completed" [educ] as the independent variable and "occupational prestige score" [prestg80] as the dependent variable from the GSS2000R data set to demonstrate the relationship graphically.

Slide 7: Visualizing Regression Analysis - 2
Figure: a scatterplot of prestg80 by educ, produced by SPSS.
The dots in the body of the chart represent the cases in the distribution.
The independent variable is plotted on the x-axis, or the horizontal axis.
The dependent variable is plotted on the y-axis, or the vertical axis.

Slide 8: Visualizing Regression Analysis - 3
I have drawn a green horizontal line through the mean of prestg80 (44.17).
NOTE: the plots were created in SPSS by adding features to the default plot.
The differences between the mean line and the dots (shown as pink lines) are the deviations. The sum of the squared deviations is the measure of total error when the mean is used as the estimated score for each case.

Slide 9: Visualizing Regression Analysis - 4
A regression line and the regression equation are added in red to the scatterplot.
The pink deviations from the mean have been replaced with the orange deviations from the regression line. Deviations between cases and the regression line are called residuals.

Slide 10: Visualizing Regression Analysis - 5
The existence of a relationship between the variables is supported when the sum of the squared orange residuals is significantly less than the sum of the squared pink deviations.
Recall that both deviations and residuals can be referred to as errors. If there is a relationship, we can characterize it as a reduction in error.

Slide 11: Visualizing Regression Analysis - 6
While it is difficult for us to square and sum deviations and residuals, SPSS regression output provides us with the answer.
The sum of the squared pink deviations from the mean is the Total Sum of Squares in the ANOVA table (49104.91).
The sum of the squared orange residuals from the regression line is the Residual Sum of Squares in the ANOVA table (37086.80).

Slide 12: Visualizing Regression Analysis - 7
The difference between the Total Sum of Squares and the Residual Sum of Squares is the Regression Sum of Squares. The Regression Sum of Squares is the amount of error that can be eliminated by using the regression equation to estimate values of prestg80 instead of the mean of prestg80.
The Regression Sum of Squares in the ANOVA table is 12018.11.

Slide 13: Visualizing Regression Analysis - 8
We can compute the proportion of error that was reduced by the regression by dividing the Regression Sum of Squares by the Total Sum of Squares:
12018.11 ÷ 49104.91 = 0.245
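The arithmetic on the sums of squares can be checked with a short script. The sketch below is in Python rather than SPSS (an assumption; the slides use SPSS output only). The two constants are the ANOVA values quoted in the slides. The helper sums_of_squares and the five-point toy data set are hypothetical, added only to show how the same quantities would be computed from raw scores; they are not taken from the GSS2000R data.

```python
# Sums of squares quoted in the slides' ANOVA table (prestg80 by educ).
TOTAL_SS = 49104.91     # sum of squared deviations from the mean of prestg80
RESIDUAL_SS = 37086.80  # sum of squared residuals from the regression line

# Regression SS is the error removed by using the line instead of the mean.
regression_ss = TOTAL_SS - RESIDUAL_SS
# Proportion of error reduced, i.e. the coefficient of determination.
r_squared = regression_ss / TOTAL_SS

print(f"Regression SS = {regression_ss:.2f}")  # 12018.11
print(f"R squared     = {r_squared:.3f}")      # 0.245


def sums_of_squares(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (total_ss, residual_ss, regression_ss).
    Hypothetical helper for illustration; not part of the slides.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * xi for xi in x]
    # Deviations from the mean (the "pink" lines in the slides).
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    # Residuals from the regression line (the "orange" lines).
    residual_ss = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    return total_ss, residual_ss, total_ss - residual_ss


# Toy data, made up for illustration only.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
toy_total, toy_residual, toy_regression = sums_of_squares(toy_x, toy_y)
print(f"Toy R squared = {toy_regression / toy_total:.3f}")
```

The first two printed values reproduce the Regression Sum of Squares and the 0.245 proportion from the slides. The toy example only demonstrates the mechanics of squaring and summing deviations and residuals.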


UT SW 388R - Simple Linear Regression
