# UCLA STATS 101A - stats 101A notes (92 pages)

## stats 101A notes

- Pages: 92
- School: University of California, Los Angeles
- Course: Stats 101a - Introduction to Design and Analysis of Experiment


### Learning Objectives

After careful study of this chapter, you should be able to do the following:

1. Use simple linear regression for building empirical models of engineering and scientific data.
2. Understand how the method of least squares is used to estimate the parameters in a linear regression model.
3. Analyze residuals to determine whether the regression model is an adequate fit to the data or whether any underlying assumptions are violated.
4. Test statistical hypotheses and construct confidence intervals on the regression model parameters.
5. Use the regression model to predict a future observation and construct an appropriate prediction interval on that observation.
6. Apply the correlation model.
7. Use simple transformations to achieve a linear regression model.

[Slides: Spurious Correlations]

### Empirical Models

Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis is a statistical technique that is very useful for these types of problems. For example, in a chemical process, suppose that the yield of the product is related to the process operating temperature. Regression analysis can be used to build a model to predict yield at a given temperature level.

Based on the scatter diagram, it is probably reasonable to assume that the mean of the random variable $Y$ is related to $x$ by the following straight-line relationship:

$$E(Y \mid x) = \mu_{Y \mid x} = \beta_0 + \beta_1 x$$

where the slope and intercept of the line are called regression coefficients. The simple linear regression model is given by

$$Y = \beta_0 + \beta_1 x + \epsilon$$

where $\epsilon$ is the random error term.

We think of the regression model as an empirical model. Suppose that the mean and variance of $\epsilon$ are $0$ and $\sigma^2$, respectively. Then

$$E(Y \mid x) = E(\beta_0 + \beta_1 x + \epsilon) = \beta_0 + \beta_1 x$$

The variance of $Y$ given $x$ is

$$V(Y \mid x) = V(\beta_0 + \beta_1 x + \epsilon) = \sigma^2$$

The true regression model $\mu_{Y \mid x} = \beta_0 + \beta_1 x$ is a line of mean values, where $\beta_1$ can be interpreted as the change in the mean of $Y$ for a unit change in $x$. The variability of $Y$ at a particular value of $x$ is determined by the error variance $\sigma^2$. This implies that there is a distribution of $Y$ values at each $x$ and that the variance of this distribution is the same at each $x$.

[Figure 11-2: The distribution of $Y$ for a given value of $x$ for the oxygen purity-hydrocarbon data.]

### Simple Linear Regression

The case of simple linear regression considers a single regressor or predictor $x$ and a dependent or response variable $Y$. The observation $Y$ at each level of $x$ is a random variable whose expected value is

$$E(Y \mid x) = \beta_0 + \beta_1 x$$

We assume that each observation $Y$ can be described by the model

$$Y = \beta_0 + \beta_1 x + \epsilon \tag{11-2}$$

Suppose that we have $n$ pairs of observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.

[Figure 11-3: Deviations of the data from the estimated regression model.]

The method of least squares is used to estimate the parameters $\beta_0$ and $\beta_1$ by minimizing the sum of the squares of the vertical deviations in Figure 11-3.

Using Equation 11-2, the $n$ observations in the sample can be expressed as

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad i = 1, 2, \ldots, n$$

The sum of the squares of the deviations of the observations from the true regression line is

$$L = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$

**Definition.** The least squares estimates of the intercept and slope in the simple linear regression model are

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

**Notation.**

$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

[Example 11-1]

### Estimating $\sigma^2$

The error sum of squares is

$$SS_E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

It can be shown that the expected value of the error sum of squares is $E(SS_E) = (n - 2)\sigma^2$.

An unbiased estimator of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{SS_E}{n - 2}$$

where $SS_E$ can be easily computed using

$$SS_E = SS_T - \hat{\beta}_1 S_{xy}, \qquad SS_T = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

### Properties of the Least Squares Estimators

Slope properties:

$$E(\hat{\beta}_1) = \beta_1, \qquad V(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}$$

Intercept properties:

$$E(\hat{\beta}_0) = \beta_0, \qquad V(\hat{\beta}_0) = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]$$

### Hypothesis Tests in Simple Linear Regression

#### Use of t-Tests

Suppose we wish to test

$$H_0: \beta_1 = \beta_{1,0} \qquad H_1: \beta_1 \neq \beta_{1,0} \tag{11-18}$$

An appropriate test statistic would be

$$T_0 = \frac{\hat{\beta}_1 - \beta_{1,0}}{\sqrt{\hat{\sigma}^2 / S_{xx}}}$$

The test statistic could also be written as

$$T_0 = \frac{\hat{\beta}_1 - \beta_{1,0}}{se(\hat{\beta}_1)}$$

We would reject the null hypothesis if $|t_0| > t_{\alpha/2,\, n-2}$.

Suppose we wish to test

$$H_0: \beta_0 = \beta_{0,0} \qquad H_1: \beta_0 \neq \beta_{0,0}$$

An appropriate test statistic would be

$$T_0 = \frac{\hat{\beta}_0 - \beta_{0,0}}{\sqrt{\hat{\sigma}^2 \left[ \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right]}} = \frac{\hat{\beta}_0 - \beta_{0,0}}{se(\hat{\beta}_0)}$$

We would reject the null hypothesis if $|t_0| > t_{\alpha/2,\, n-2}$.

An important special case of the hypotheses of Equation 11-18 is

$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0$$

These hypotheses relate to the significance of regression. Failure to reject $H_0$ is equivalent to concluding that there is no linear relationship between $x$ and $Y$.

[Figure: The hypothesis $H_0: \beta_1 = 0$ is not rejected.]

[Figure: The hypothesis $H_0: \beta_1 = 0$ is rejected.]

[Example]

#### Coefficient of Determination

The difference between the two predictions (the sample mean $\bar{y}$ and the fitted value $\hat{y}$) is the additional information explained by the predictor, Hand Length. This is the explained deviation.

Total Deviation = Unexplained Deviation + Explained Deviation:

$$(y - \bar{y}) = (y - \hat{y}) + (\hat{y} - \bar{y})$$

Total Variation = Unexplained Variation + Explained Variation, which is computed as follows.

Total Sum of Squares = Error Sum of Squares + Regression Sum of Squares:

$$\sum (y - \bar{y})^2 = \sum (y - \hat{y})^2 + \sum (\hat{y} - \bar{y})^2$$

$$SS_{Total} = SS_{Error} + SS_{Regression}$$

Here $SS_{Regression}$ is the variation explained by the $X$ variable, and $SS_{Error}$, the sum of squared residuals, is the variation due to error. $R^2$ is the proportion of variation explained by the $X$ variable:

$$R^2 = \frac{SS_{Regression}}{SS_{Total}} = \frac{SS_{Total} - SS_{Error}}{SS_{Total}} = 1 - \frac{SS_{Error}}{SS_{Total}}$$

As a percentage:

$$R^2 \times 100\% = \left( 1 - \frac{SS_{Error}}{SS_{Total}} \right) \times 100\%$$

$R^2$ is called the coefficient of determination. It is the percentage of variation in the response variable that is explained by variation in the predictor variable.

To determine $R^2$ for the simple linear regression model, simply square the value of the linear correlation coefficient $r$. We can also use $r^2 \times 100\%$.

NOTE: This method does not work for regression equations that have more than one predictor variable.

#### Analysis of Variance Approach to Test Significance of Regression

The analysis of variance identity is

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Symbolically,

$$SS_T = SS_R + SS_E$$

If the null hypothesis $H_0: \beta_1 = 0$ is true, the statistic

$$F_0 = \frac{SS_R / 1}{SS_E / (n - 2)} = \frac{MS_R}{MS_E}$$

follows the $F_{1,\, n-2}$ distribution, and we would reject $H_0$ if $f_0 > f_{\alpha,\, 1,\, n-2}$.
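The quantities in these notes can be computed by hand for a small sample. The following Python sketch is one way to do so, using only the standard library. The function name `fit_simple_linear_regression` and the five data points are hypothetical, chosen for easy arithmetic; they are not taken from the notes. The sketch returns the least squares estimates, the unbiased estimate of the error variance, the t and F statistics for testing that the slope is zero, and the coefficient of determination. It does not look up critical values, which would still have to come from a table or a statistics library.

```python
import math


def fit_simple_linear_regression(x, y):
    """Least squares fit of y = b0 + b1 * x with basic summary statistics.

    Illustrative helper (not from the notes). Assumes the fit is not
    perfect, so the estimated error variance is nonzero.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Corrected sums of squares and cross products.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if s_xx == 0:
        raise ValueError("x values must not all be equal")

    # Least squares estimates of slope and intercept.
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar

    # Sums of squares: total = regression + error.
    fitted = [b0 + b1 * xi for xi in x]
    ss_t = sum((yi - y_bar) ** 2 for yi in y)
    ss_e = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_r = sum((fi - y_bar) ** 2 for fi in fitted)

    # Unbiased estimate of the error variance: SSE / (n - 2).
    sigma2_hat = ss_e / (n - 2)

    # Standard errors of the slope and intercept estimates.
    se_b1 = math.sqrt(sigma2_hat / s_xx)
    se_b0 = math.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / s_xx))

    return {
        "b0": b0,
        "b1": b1,
        "ss_t": ss_t,
        "ss_r": ss_r,
        "ss_e": ss_e,
        "sigma2_hat": sigma2_hat,
        "se_b0": se_b0,
        "se_b1": se_b1,
        "t0": b1 / se_b1,          # t statistic for H0: slope = 0
        "f0": ss_r / sigma2_hat,   # F statistic, 1 and n - 2 degrees of freedom
        "r2": ss_r / ss_t,         # coefficient of determination
    }


# Hypothetical toy data, for illustration only.
x_data = [1, 2, 3, 4, 5]
y_data = [2, 4, 5, 4, 5]
result = fit_simple_linear_regression(x_data, y_data)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With a single predictor, `t0 ** 2` equals `f0` up to rounding, so the t-test and the analysis of variance test of the slope lead to the same decision. Likewise, `r2` equals the square of the sample correlation coefficient between `x_data` and `y_data`.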
