UW-Madison STAT 371 - Regression - 4


Regression
Bret Larget
Department of Statistics, University of Wisconsin - Madison
Statistics 371, Fall 2004
December 2, 2004

Correlation

The correlation coefficient r is a measure of the strength of the linear relationship between two variables.

\[ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} \]

Notice that the correlation is not affected by linear transformations of the data, such as changing the scale of measurement (provided the scale factor is positive; multiplying one variable by a negative constant flips the sign of r).

Correlation Plots

[Figures: a series of scatterplots of y versus x illustrating different correlations. The surviving panel labels read r = 0.97, r = 0.21, r = 1, and r = 0; any minus signs on these labels were lost in extraction.]

Summary of Correlation

The correlation coefficient r measures the strength of the linear relationship between two quantitative variables, on a scale from -1 to 1.

The correlation coefficient is -1 or 1 only when the data lie perfectly on a line with negative or positive slope, respectively.

If the correlation coefficient is near one, the data are tightly clustered around a line with a positive slope.

Correlation coefficients near 0 indicate weak linear relationships.

However, r does not measure the strength of nonlinear relationships.

If r = 0, rather than X and Y being unrelated, it can be the case that they have a strong nonlinear relationship.

If r is close to 1, it may still be the case that a nonlinear relationship is a better description of the data than a linear relationship.

Simple Linear Regression

Simple linear regression is the statistical procedure for describing the relationship between a quantitative explanatory variable X and a quantitative response variable Y with a straight line.

In simple linear regression, the regression line is the line that minimizes the sum of the squared residuals.

Riley

Riley Larget is my son. Below is a plot of his height versus his age, from birth to 8 years.

[Figure: Height (inches) versus Age (months), birth to 8 years]

The plot indicates that it is not reasonable to model the relationship between age and height as linear over the entire age range, but it is fairly linear from age 2 years to 8 years (24 to 96 months).

[Figure: Height (inches) versus Age (months), ages 2 to 8 years]

Finding a "best" linear fit

Any line we can use to predict Y from X will have the form $Y = b_0 + b_1 X$, where $b_0$ is the intercept and $b_1$ is the slope.

The value $\hat{y} = b_0 + b_1 x$ is the predicted value of Y if the explanatory variable X = x.

In simple linear regression, the predicted values form a line. In more advanced forms of regression, we can fit curves or fit functions of multiple explanatory variables.

For each data point $(x_i, y_i)$, the residual is the difference between the observed value and the predicted value, $y_i - \hat{y}_i$.

Graphically, each residual is the positive or negative vertical distance from the point to the line.

Simple linear regression identifies the line that minimizes the residual sum of squares,

\[ \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \]

Least Squares Regression

We won't derive them, but there are simple formulas for the slope and intercept of the least squares line as a function of the sample means, standard deviations, and the correlation coefficient.

\[ \hat{Y} = b_0 + b_1 X, \qquad b_1 = r \times \frac{s_y}{s_x}, \qquad b_0 = \bar{y} - b_1 \bar{x} \]

A Special Case

Consider the predicted value of an observation $X = \bar{x}$.

\[ \hat{y} = b_0 + b_1 \bar{x} = (\bar{y} - b_1 \bar{x}) + b_1 \bar{x} = \bar{y} \]

So the regression line always goes through the point $(\bar{x}, \bar{y})$.

The General Case

Let $X = \bar{x} + z s_x$, so X is z standard deviations above the mean.

\[ \hat{y} = b_0 + b_1 (\bar{x} + z s_x) = (\bar{y} - b_1 \bar{x}) + b_1 \bar{x} + b_1 z s_x = \bar{y} + \left( r \frac{s_y}{s_x} \right) z s_x = \bar{y} + (rz) s_y \]

Notice that if X is z SDs above the mean, we predict Y to be only rz SDs above the mean.

In the typical situation, $|r| < 1$, so we predict the value of Y to be closer to the mean (in standard units) than X. This is called the regression effect.

Riley: Using R

    > age2 <- age[(age > 23) & (age < 97)]
    > height2 <- height[(age > 23) & (age < 97)]
    > fit2 <- lm(height2 ~ age2)
    > summary(fit2)

    Call:
    lm(formula = height2 ~ age2)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.29911 -0.19291  0.03355  0.21982  0.46334

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 30.249329   0.186745  161.98   <2e-16 ***
    age2         0.250519   0.002869   87.33   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.2429 on 14 degrees of freedom
    Multiple R-Squared: 0.9982,     Adjusted R-squared: 0.998
    F-statistic: 7627 on 1 and 14 DF,  p-value: < 2.2e-16

[Minus signs were lost in extraction. The signs shown for Min, 1Q, 3Q, and Max follow from the ordering of the quantiles; the sign of the median could not be recovered.]

Riley (cont.)

    > n <- length(age2)
    > mx <- mean(age2)
    > sx <- sd(age2)
    > my <- mean(height2)
    > sy <- sd(height2)
    > r <- cor(age2, height2)
    > print(c(mx, sx, my, sy, r, n))
    [1] 61.5625000 21.8661649 45.6718750  5.4829043  0.9990835 16.0000000
    > b1 <- r * sy / sx
    > b0 <- my - b1 * mx
    > print(c(b0, b1))
    [1] 30.2493290  0.2505185

(Riley's predicted height in inches) = 30.25 + 0.25 × (Riley's age in months)

Riley: Plot of Data

[Figure: Height (inches) versus Age (months), ages 2 to 8 years]

A Residual Plot

    > plot(fitted(fit2), residuals(fit2))
    > abline(h = 0)

[Figure: residuals(fit2) versus fitted(fit2), with a horizontal line at 0]

Riley: Interpretation

We can interpret the slope to mean that from age 2 to 8 years, Riley grew an average of about 0.25 inches per month, or about 3 inches per year.

The intercept is the predicted value when X = 0, or Riley's height (length) at birth. This interpretation may not be reasonable if 0 is out of the range of the data.

[Figure: Height (inches) versus Age (months)]

…
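The two forms of the correlation formula in the notes can be checked numerically. The following is a minimal sketch in R, assuming small invented data vectors (x, y, and u are hypothetical and do not come from the slides). It compares both hand-computed forms with the built-in cor(), rescales x with a positive factor to show that r does not change, and computes r for a parabola to show that r = 0 does not rule out a strong nonlinear relationship.

```r
# Hypothetical data, invented for illustration only (not from the slides).
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)
n <- length(x)

# Correlation as the sum of products of standardized values, divided by n - 1.
r_std <- sum(((x - mean(x)) / sd(x)) * ((y - mean(y)) / sd(y))) / (n - 1)

# Equivalent form using sums of squares and cross-products.
r_ss <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Both should agree with R's built-in cor().
r_builtin <- cor(x, y)

# A linear change of units with a positive scale factor leaves r unchanged.
r_rescaled <- cor(2.54 * x + 10, y)

# A perfect nonlinear (quadratic) relationship can still have r = 0.
u <- c(-2, -1, 0, 1, 2)
r_parabola <- cor(u, u^2)

print(c(r_std, r_ss, r_builtin, r_rescaled, r_parabola))
```

The first four printed values should be identical up to rounding, and the last should be zero, since u and u^2 are symmetric about u = 0.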
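The idea that the regression line minimizes the residual sum of squares can also be illustrated directly. This sketch assumes invented data and a hypothetical helper function rss(); neither appears in the slides. It fits the line with lm(), computes the residuals as observed minus predicted values, and compares the residual sum of squares of the fitted line with that of two slightly different lines.

```r
# Hypothetical data, invented for illustration only (not from the slides).
x <- c(1, 2, 3, 4, 5)
y <- c(1.2, 1.9, 3.2, 3.8, 5.1)

# Hypothetical helper: residual sum of squares for the line y = b0 + b1 * x.
rss <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

# Least-squares fit; unname() drops the coefficient labels.
fit <- lm(y ~ x)
b <- unname(coef(fit))
rss_ls <- rss(b[1], b[2])

# Residuals are observed minus predicted values.
# For a line fitted with an intercept they sum to zero.
res <- y - fitted(fit)

# Perturbing the intercept or the slope increases the residual sum of squares.
rss_shift_intercept <- rss(b[1] + 0.1, b[2])
rss_shift_slope <- rss(b[1], b[2] + 0.05)

print(c(rss_ls, rss_shift_intercept, rss_shift_slope))
print(sum(res))
```

Trying other perturbations in the calls to rss() should always give a value at least as large as rss_ls.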
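The slope and intercept formulas, the special case, and the regression effect can be verified in a few lines of R. The sketch below assumes invented age and height values (these are not Riley's measurements, which are not listed in the notes) and an arbitrary choice of z = 2. It computes b1 and b0 from the summary statistics, compares them with the coefficients from lm(), and checks that the prediction at the mean of x equals the mean of y and that a value z SDs above the mean of x is predicted to be rz SDs above the mean of y.

```r
# Hypothetical data, invented for illustration only (not Riley's measurements).
x <- c(24, 36, 48, 60, 72, 84, 96)
y <- c(34.5, 37.2, 40.1, 42.3, 44.9, 47.6, 49.8)

mx <- mean(x)
sx <- sd(x)
my <- mean(y)
sy <- sd(y)
r <- cor(x, y)

# Slope and intercept from the summary-statistic formulas.
b1 <- r * sy / sx
b0 <- my - b1 * mx

# The same coefficients from lm(); unname() drops the coefficient labels.
coef_lm <- unname(coef(lm(y ~ x)))

# Special case: the prediction at the mean of x is the mean of y.
pred_at_mean <- b0 + b1 * mx

# General case: at z SDs above the mean of x (z = 2 is an arbitrary choice),
# the prediction is r * z SDs above the mean of y.
z <- 2
pred_z <- b0 + b1 * (mx + z * sx)
z_pred <- (pred_z - my) / sy

print(c(b0, b1))
print(coef_lm)
print(c(pred_at_mean, my))
print(c(z_pred, r * z))
```

Because |r| < 1 for these data, z_pred is smaller in magnitude than z, which is the regression effect described in the notes.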

