9/22/09 Lecture 8

STOR 155 Introductory Statistics
Lecture 8: Least-Squares Regression
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Review
• Scatter plot:
– Association: form, direction, strength
– Just graphical, not numerical
• Correlation:
– Direction, strength, linear
– Properties
– Vertical and horizontal lines: r = 0.
• Correlation cannot tell the exact relationship.

Topics
• Least-Squares Regression
– Regression lines
– Equation and interpretation of the line
– Prediction using the line
• Correlation and Regression
• Coefficient of Determination

Age vs. Mean Height

How do we predict mean height at age 32 months?

Linear Regression
• Correlation measures the direction and strength of the linear relationship between two quantitative variables.
• A regression line
– summarizes the relationship between two variables if the form of the relationship is linear.
– describes how a response variable y changes as an explanatory variable x changes.
– is often used as a mathematical model to predict the value of a response variable y based on a value of an explanatory variable x.

Equation of a Straight Line
• A straight line relating y to x has an equation of the form $y = a + bx$
– x: explanatory variable
– y: response variable
– a: y-intercept
– b: slope of the line

How to fit a line?

Error

Least-Squares Regression Line
• A line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.
• Mathematically, the line is determined by minimizing $\sum_i (y_i - a - b x_i)^2$ over values of the pair (a, b).

Equation of the Least-Squares Regression Line
• The least-squares regression line of y on x is $\hat{y} = a + bx$
• with slope $b = r \frac{s_y}{s_x}$
• and intercept $a = \bar{y} - b\bar{x}$

Interpreting the Regression Line
• The slope b tells us that
– along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in y.
– a change of 1 unit in x corresponds to a change of b units in y.
• The point $(\bar{x}, \bar{y})$ is always on the line. (Why?)
• If both x and y are standardized, the slope will be r and the intercept will be 0.
– The origin (0, 0) is on the line. (Why?)
• r and b have the same sign.

Example: Age vs. Height
[Figure: Height (in centimeters) vs. Age (in months)]
• $\bar{x} = 23.5$, $\bar{y} = 79.85$
• $s_x = 3.606$, $s_y = 2.302$
• $r = 0.9944$
• $b = r \frac{s_y}{s_x}$, $a = \bar{y} - b\bar{x}$
• $\hat{y} = 64.932 + 0.6348x$

Prediction
• $\hat{y} = a + bx$ is the prediction when the explanatory variable takes the value x.
• What is the average height for a child who is 30 months old?
• How about a 30-year-old?
• Do not extrapolate far beyond the range of the observed data when predicting.

Correlation and Regression
• Both describe the linear relationship between two variables.
– b and r have the same sign.
• r does not depend on which variable is x and which is y.
• But a regression line does (causality).

Regression lines depend on (x, y) or (y, x).

Coefficient of Determination $r^2$
• The square of the correlation, $r^2$, is the proportion of variation in the values of y that is explained by the regression model with x.
• $0 \le r^2 \le 1$.
• The larger $r^2$, the stronger the linear relationship.
• The closer $r^2$ is to 1, the more confident we are in our prediction.

Age vs. Height: $r^2 = 0.9888$.

Age vs. Height: $r^2 = 0.849$.

Take Home Message
• Least-Squares Regression
– Regression lines
– Equation and interpretation of the line
– Prediction using the line
• Correlation and Regression
• Coefficient of Determination
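
The numbers in the Age vs. Height example can be checked with a few lines of Python. This is only a minimal sketch: it uses the summary statistics quoted on the slide (means 23.5 and 79.85, standard deviations 3.606 and 2.302, correlation 0.9944) and the slope and intercept formulas b = r·s_y/s_x and a = ȳ − b·x̄. The function names `fit_line` and `predict` are made up for illustration and are not part of the course material.

```python
# Summary statistics quoted on the "Example: Age vs. Height" slide.
x_bar, y_bar = 23.5, 79.85   # mean age (months), mean height (cm)
s_x, s_y = 3.606, 2.302      # standard deviations of age and height
r = 0.9944                   # correlation between age and height


def fit_line(x_bar, y_bar, s_x, s_y, r):
    """Return (a, b): intercept and slope of the least-squares line of y on x."""
    b = r * s_y / s_x        # slope: b = r * s_y / s_x
    a = y_bar - b * x_bar    # intercept: a = y_bar - b * x_bar
    return a, b


def predict(a, b, x):
    """Predicted response y-hat = a + b*x at explanatory value x."""
    return a + b * x


a, b = fit_line(x_bar, y_bar, s_x, s_y, r)
print(f"y-hat = {a:.3f} + {b:.4f} x")

# The line always passes through (x_bar, y_bar), because a = y_bar - b * x_bar.
print(f"prediction at x_bar: {predict(a, b, x_bar):.2f}")

# Prediction at 30 months, the upper end of the plotted age axis.
print(f"prediction at 30 months: {predict(a, b, 30):.2f} cm")

# Extrapolation to 30 years (360 months) is far outside the data.
print(f"prediction at 360 months: {predict(a, b, 360):.2f} cm")

# Coefficient of determination.
print(f"r^2 = {r ** 2:.4f}")
```

Running the sketch reproduces the fitted line from the slide to the quoted precision. The 360-month value comes out near three metres, which is why the slides warn against extrapolating far beyond the observed ages.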