2/8/11 Lecture 8
STOR 155 Introductory Statistics
Lecture 8: Least-Squares Regression
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Review
• Scatter plot:
  – Association: form, direction, strength
  – Just graphical, not numerical
• Correlation:
  – Direction, strength, linear
  – Properties
  – Vertical and horizontal lines: r = 0.
• Correlation cannot tell the exact relationship.

Topics
• Least-Squares Regression
  – Regression lines
  – Equation and interpretation of the line
  – Prediction using the line
• Correlation and Regression
• Coefficient of Determination

Age vs. Mean Height

To predict mean height at age 32 months?

Linear Regression
• Correlation measures the direction and strength of the linear relationship between two quantitative variables.
• A regression line
  – summarizes the relationship between two variables if the form of the relationship is linear.
  – describes how a response variable y changes as an explanatory variable x changes.
  – is often used as a mathematical model to predict the value of a response variable y based on a value of an explanatory variable x.

Equation of a Straight Line
• A straight line relating y to x has an equation of the form: y = a + bx
  – x: explanatory variable
  – y: response variable
  – a: y-intercept
  – b: slope of the line

How to fit a line?

Error

Least-Squares Regression Line
• A line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.
• Mathematically, the line is determined by minimizing the following sum over values of the pair (a, b):
$$\sum_i (y_i - a - b x_i)^2$$

Equation of the Least-Squares Regression Line
• The least-squares regression line of y on x is $\hat{y} = a + bx$
• with slope $b = r \frac{s_y}{s_x}$
• and intercept $a = \bar{y} - b\bar{x}$

Interpreting the Regression Line
• The slope b tells us that
  – along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in y.
  – a change of 1 unit in x corresponds to a change of b units in y.
• The point $(\bar{x}, \bar{y})$ is always on the line. (why?)
• If both x and y are standardized, the slope will be r and the intercept will be 0.
  – the origin (0, 0) is on the line. (why?)
• r and b have the same sign.

Example: Age vs. Height
[Figure: Height (in centimeters) vs. Age (in months)]
• $\bar{x} = 23.5$, $\bar{y} = 79.85$
• $s_x = 3.606$, $s_y = 2.302$
• $r = 0.9944$
• $b = r \frac{s_y}{s_x}$, $a = \bar{y} - b\bar{x}$
• $\hat{y} = 64.932 + 0.6348x$

Prediction
• $\hat{y} = a + bx$ is the prediction of y when the explanatory variable takes the value x.
• What is the average height for a child who is 30 months old?
• How about a 30-year-old?
• Do not extrapolate far beyond the range of the data when predicting.

Correlation and Regression
• Both describe the linear relationship between two variables.
  – b and r have the same sign.
• r does not depend on which variable is x and which is y.
• But a regression line does (causality).

Regression lines depend on (x, y) or (y, x).

Coefficient of Determination $r^2$
• The square of the correlation, $r^2$, is the proportion of variation in the values of y that is explained by the regression model with x.
• $0 \le r^2 \le 1$.
• The larger $r^2$, the stronger the linear relationship.
• The closer $r^2$ is to 1, the more confident we are in our prediction.

Age vs. Height: $r^2 = 0.9888$.

Age vs. Height: $r^2 = 0.849$.

Take Home Message
• Least-Squares Regression
  – Regression lines
  – Equation and interpretation of the line
  – Prediction using the line
• Correlation and Regression
• Coefficient of
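
The Age vs. Height example can be reproduced from its summary statistics alone. The following is a minimal Python sketch, not part of the original slides: the variable names and the helper `predict_height` are hypothetical, and the inputs are the rounded values reported in the example (means 23.5 and 79.85, standard deviations 3.606 and 2.302, correlation 0.9944).

```python
# Summary statistics reported in the Age vs. Height example
# (x = age in months, y = height in centimeters).
x_bar, y_bar = 23.5, 79.85   # sample means
s_x, s_y = 3.606, 2.302      # sample standard deviations
r = 0.9944                   # correlation between age and height

# Slope and intercept of the least-squares line y_hat = a + b*x.
b = r * s_y / s_x            # slope: b = r * s_y / s_x
a = y_bar - b * x_bar        # intercept: a = y_bar - b * x_bar


def predict_height(age_months):
    """Predicted mean height (cm) at a given age (months).

    Hypothetical helper for illustration; only meaningful for ages
    near the observed range of the data.
    """
    return a + b * age_months


# Coefficient of determination: proportion of variation in y
# explained by the regression on x.
r_squared = r ** 2

print(f"slope b     = {b:.4f}")
print(f"intercept a = {a:.3f}")
print(f"r^2         = {r_squared:.4f}")
print(f"predicted height at 30 months = {predict_height(30):.2f} cm")
```

With these inputs the slope and intercept agree with the fitted line in the example to the reported precision. Calling `predict_height(360)` for a 30-year-old returns roughly 293 cm, which illustrates why extrapolating far outside the observed ages is unreliable.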
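
The two "(why?)" questions on the Interpreting the Regression Line slide can be checked directly from the slope and intercept formulas. The short derivation below is one possible way to see it, written as a sketch rather than as the lecture's own argument; it assumes "standardized" means each variable has been rescaled to mean 0 and standard deviation 1.

```latex
% (1) The point (\bar{x}, \bar{y}) lies on the line:
%     substitute x = \bar{x} and use a = \bar{y} - b\bar{x}.
\begin{align*}
\hat{y}\,\big|_{x=\bar{x}}
  &= a + b\bar{x}
   = (\bar{y} - b\bar{x}) + b\bar{x}
   = \bar{y}.
\end{align*}

% (2) Standardized variables (assumption: mean 0, standard deviation 1):
%     \bar{x} = \bar{y} = 0 and s_x = s_y = 1.
\begin{align*}
b &= r\,\frac{s_y}{s_x} = r \cdot \frac{1}{1} = r, \\
a &= \bar{y} - b\bar{x} = 0 - r \cdot 0 = 0,
\end{align*}
% so the line is \hat{y} = r x, which passes through the origin (0, 0).
```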