©2006 Raj Jain, CSE567M, Washington University in St. Louis

Other Regression Models

Raj Jain
Washington University in Saint Louis
Saint Louis, MO
[email protected]
These slides are available on-line at:
http://www.cse.wustl.edu/~jain/cse567-06/

Overview
1. Multiple Linear Regression: More than one predictor variable
2. Categorical Predictors: Predictor variables are categories such as CPU type, disk type, and so on
3. Curvilinear Regression: Relationship is nonlinear
4. Transformations: Errors are not normally distributed or the variance is not homogeneous
5. Outliers
6. Common mistakes in regression

Multiple Linear Regression Models
! A multiple linear regression model relates the response y to k predictor variables:
      y = b0 + b1 x1 + b2 x2 + ... + bk xk + e
! Given a sample of n observations with k predictors, the model yields n such equations, one per observation.

Vector Notation
! In vector notation, we have:
      y = Xb + e
! The least-squares parameter estimates are:
      b = (X'X)^{-1} (X'y)
! All elements in the first column of X are 1. See Box 15.1 for regression formulas.

Example 15.1
! Seven programs were monitored to observe their resource demands. In particular, the number of disk I/O's, memory size (in kBytes), and CPU time (in milliseconds) were observed.

Example 15.1 (Cont)
! In this case, y is the 7×1 vector of observed CPU times, and X is a 7×3 matrix whose first column is all 1's and whose second and third columns hold the observed disk I/O's and memory sizes.

Example 15.1 (Cont)
! The regression parameters b = (b0, b1, b2) are obtained from b = (X'X)^{-1}(X'y).
! The regression equation is:
      CPU time = b0 + b1 (number of disk I/O's) + b2 (memory size)

Example 15.1 (Cont)
! From the table of errors e_i = y_i - ŷ_i we see that SSE is:
      SSE = Σ e_i²

Example 15.1 (Cont)
! An alternate method to compute SSE is to use:
      SSE = y'y - b'X'y
! For this data, SSY and SS0 are given by:
      SSY = y'y,    SS0 = n ȳ²
! Therefore, SST and SSR are:
      SST = SSY - SS0,    SSR = SST - SSE
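The formulas above (b = (X'X)^{-1}X'y and the sums of squares SSY, SS0, SSE, SST, SSR) can be sketched in a few lines of numpy. The data below are made up for illustration; they are not the Example 15.1 measurements.

```python
import numpy as np

# Hypothetical measurements for seven programs (illustration only):
x1 = np.array([14.0, 20.0, 27.0, 42.0, 39.0, 50.0, 83.0])      # disk I/O's
x2 = np.array([70.0, 80.0, 144.0, 190.0, 210.0, 235.0, 400.0])  # memory (kBytes)
y  = np.array([2.0, 5.0, 7.0, 9.0, 10.0, 13.0, 20.0])           # CPU time (ms)

n, k = len(y), 2
X = np.column_stack([np.ones(n), x1, x2])   # first column of X is all 1's

b = np.linalg.solve(X.T @ X, X.T @ y)       # b = (X'X)^{-1} X'y
e = y - X @ b                               # errors

SSE = float(e @ e)                          # error sum of squares
SSY = float(y @ y)                          # sum of squares of y
SS0 = n * y.mean() ** 2                     # sum of squares of the mean
SST = SSY - SS0                             # total variation
SSR = SST - SSE                             # variation explained by the regression

print("b =", b)
print("R^2 =", SSR / SST)
```

The normal-equations solve is adequate for a sketch; numerically, `np.linalg.lstsq(X, y)` is the more robust way to get the same b.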
Example 15.1 (Cont)
! The coefficient of determination R² is:
      R² = SSR/SST
! Thus, the regression explains 97% of the variation of y.
! Coefficient of multiple correlation:
      R = √(SSR/SST)
! Standard deviation of errors is:
      s_e = √(SSE/(n-k-1))

Example 15.1 (Cont)
! Standard deviations of the regression parameters are:
      s_{bj} = s_e √(c_jj),  where c_jj is the j-th diagonal element of (X'X)^{-1}
! The 90% t-value at 4 degrees of freedom is 2.132. None of the three parameters is significant at a 90% confidence level.

Example 15.1 (Cont)
! A single future observation for programs with 100 disk I/O's and a memory size of 550 is predicted by:
      ŷ = b0 + 100 b1 + 550 b2
! Standard deviation of the predicted observation is:
      s_ŷ = s_e √(1 + x_p'(X'X)^{-1} x_p),  with x_p' = (1, 100, 550)
! 90% confidence interval using the t value of 2.132 is:
      ŷ ± 2.132 s_ŷ

Example 15.1 (Cont)
! Standard deviation for the mean of a large number of observations is:
      s_ŷ = s_e √(x_p'(X'X)^{-1} x_p)
! 90% confidence interval is:
      ŷ ± 2.132 s_ŷ

Analysis of Variance
! Test the hypothesis that SSR is less than or equal to SSE.
! Degrees of freedom = number of independent values required to compute the sum of squares.
! Assuming:
  " Errors are i.i.d. normal ⇒ y's are also normally distributed
  " x's are nonstochastic ⇒ can be measured without errors
! The various sums of squares then have chi-square distributions with the degrees of freedom given above.

F-Test
! Given SS_i and SS_j with ν_i and ν_j degrees of freedom, the ratio (SS_i/ν_i)/(SS_j/ν_j) has an F distribution with ν_i numerator degrees of freedom and ν_j denominator degrees of freedom.
! The hypothesis that SS_i is less than or equal to SS_j is rejected at the α significance level if the ratio (SS_i/ν_i)/(SS_j/ν_j) is greater than the 1-α quantile of the F-variate.
! This procedure is also known as the F-test.
! The F-test can be used to check: is SSR significantly higher than SSE?
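The check just posed — is SSR significantly higher than SSE? — reduces to computing F = (SSR/k)/(SSE/(n-k-1)) = MSR/MSE. A minimal sketch, again with made-up data rather than the slides' Example 15.1 measurements:

```python
import numpy as np

# Hypothetical data (illustration only, not the Example 15.1 measurements):
x1 = np.array([14.0, 20.0, 27.0, 42.0, 39.0, 50.0, 83.0])
x2 = np.array([70.0, 80.0, 144.0, 190.0, 210.0, 235.0, 400.0])
y  = np.array([2.0, 5.0, 7.0, 9.0, 10.0, 13.0, 20.0])

n, k = len(y), 2
X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

SSE = float(e @ e)
SST = float(y @ y) - n * y.mean() ** 2
SSR = SST - SSE

MSR = SSR / k              # mean square of the regression (nu_R = k)
MSE = SSE / (n - k - 1)    # mean square of errors (nu_e = n-k-1)
F = MSR / MSE              # compare with the 1-alpha quantile of F[k, n-k-1]
print(f"F = {F:.2f}")
```

If this ratio exceeds the tabulated F[k, n-k-1] quantile (e.g. from `scipy.stats.f.ppf`), the regression explains a significant part of the variation.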
  ⇒ Use the F-test ⇒ Compute (SSR/ν_R)/(SSE/ν_e) = MSR/MSE

F-Test (Cont)
! MSE = variance of errors:
      MSE = SSE/(n-k-1),    MSR = SSR/k
! MSR/MSE has an F[k, n-k-1] distribution.
! The F-test tests the null hypothesis that y doesn't depend upon any x_j:
      b1 = b2 = ... = bk = 0
  against the alternate hypothesis that y depends upon at least one x_j, and therefore at least one b_j ≠ 0.
! If the computed ratio is less than the value read from the table, the null hypothesis cannot be rejected at the stated significance level.
! In simple regression models, if the confidence interval of b1 does not include zero ⇒ the parameter is nonzero ⇒ the regression explains a significant part of the response variation ⇒ the F-test is not required.

ANOVA Table for Multiple Linear Regression
! See Table 15.3 on page 252.

Example 15.2
! For the disk-memory-CPU data of Example 15.1, the computed F ratio > the F value from the table ⇒ the regression does explain a significant part of the variation.
! Note: the regression passed the F-test ⇒ the hypothesis of all parameters being zero cannot be accepted. However, none of the regression parameters is significantly different from zero. This contradiction ⇒ problem of multicollinearity.

Problem of Multicollinearity
! Two lines are said to be collinear if they have the same slope and the same intercept.
! These two lines can be represented in just one dimension instead of the two dimensions required for lines that are not collinear.
! Two collinear lines are not independent.
! When two predictor variables are linearly dependent, they are called collinear.
! Collinear predictors ⇒ problem of multicollinearity ⇒ contradictory results from various significance tests.
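The contradiction described above can be reproduced by checking each parameter individually: s_bj = s_e √(c_jj), where c_jj is the j-th diagonal element of (X'X)^{-1}, and a 90% confidence interval b_j ± t·s_bj that includes zero marks the parameter as not significant. A sketch with made-up, deliberately near-collinear predictors (not the slides' data):

```python
import numpy as np

# Hypothetical, nearly collinear predictors (x2 grows roughly with x1):
x1 = np.array([14.0, 20.0, 27.0, 42.0, 39.0, 50.0, 83.0])
x2 = np.array([70.0, 80.0, 144.0, 190.0, 210.0, 235.0, 400.0])
y  = np.array([2.0, 5.0, 7.0, 9.0, 10.0, 13.0, 20.0])

n, k = len(y), 2
X = np.column_stack([np.ones(n), x1, x2])
C = np.linalg.inv(X.T @ X)                 # c_jj lives on this diagonal
b = C @ X.T @ y
e = y - X @ b

s_e = np.sqrt(e @ e / (n - k - 1))         # standard deviation of errors
s_b = s_e * np.sqrt(np.diag(C))            # std dev of each parameter b_j

t = 2.132                                  # 90% t-value at n-k-1 = 4 df (from the slides)
for j in range(k + 1):
    lo, hi = b[j] - t * s_b[j], b[j] + t * s_b[j]
    significant = not (lo <= 0.0 <= hi)    # CI excluding zero => significant
    print(f"b{j} = {b[j]:.4f}, 90% CI = ({lo:.4f}, {hi:.4f}), significant: {significant}")
```

With strongly correlated predictors, the individual intervals inflate and can all include zero even when the overall F-test passes — the multicollinearity symptom the slides describe.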
! High correlation ⇒ eliminate one variable and check if significance improves.

Example 15.3
! For the data of Example 15.2: n = 7, Σx1i = 271, Σx2i = 1324, Σx1i² = 13,855, Σx2i² = 326,686, Σx1i x2i = 67,188.
! The correlation between x1 and x2 is:
      R_{x1,x2} = (Σx1i x2i − n x̄1 x̄2) / √[(Σx1i² − n x̄1²)(Σx2i² − n x̄2²)] = 0.9947
! The correlation is high ⇒ programs with large memory sizes have more I/O's.
! In Example 14.1, the regression of CPU time on the number of disk I/O's alone was found to be significant.

Example 15.3 (Cont)
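The predictor correlation in Example 15.3 can be computed directly from the sums listed on the slide. One caveat: Σx1i² is taken here as 13,855 rather than the printed 1,385, since with Σx1i = 271 and n = 7 the Cauchy–Schwarz bound requires Σx1i² ≥ 271²/7 ≈ 10,492, so the printed value must have lost a digit.

```python
import math

# Sums from Example 15.3 (Sigma x1i^2 corrected to 13,855; see note above):
n = 7
s1, s2 = 271.0, 1324.0          # sum of x1i, sum of x2i
s11, s22 = 13855.0, 326686.0    # sum of x1i^2, sum of x2i^2
s12 = 67188.0                   # sum of x1i * x2i

# R = (S12 - n*mean1*mean2) / sqrt((S11 - n*mean1^2)(S22 - n*mean2^2))
num = s12 - s1 * s2 / n
den = math.sqrt((s11 - s1 ** 2 / n) * (s22 - s2 ** 2 / n))
r = num / den
print(f"R(x1, x2) = {r:.4f}")   # high correlation between disk I/O's and memory size
```

The result, about 0.9947, is the high correlation the slide refers to: the two predictors carry nearly the same information.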