SCC GBS 221 - Chapter 12 Simple Linear Regression


1. Chapter 12: Simple Linear Regression

2. Introduction
- Exam Score vs. Hours Studied scenario
- Regression Analysis
  - used to quantify the relation between 2 (or more) variables so you can predict the value of one variable based on the value of another
  - develops an equation to predict the value of a dependent variable based on the value of one or more independent variables
- Correlation Analysis
  - measures the strength of the linear relation between a pair of variables
  - if you plan to predict Y from X, they ought to be related!

3. Simple vs. Multiple Regression
- Simple Regression Analysis
  - uses a single independent variable to predict the dependent variable
  - estimated Score = 40.0816 + 1.4966(Hours)
  - $r^2 = .7432$
- Multiple Regression Analysis
  - uses multiple independent variables to predict the dependent variable
  - the set of independent variables should be independent of one another, and each should be highly related to the dependent variable
  - estimated Score = 33.914 + 3.472(GPA) - 1.698(Absences) + 1.395(Hours)
  - $r^2 = .7654$

4. Characterizing Relationships
- Direct Relation
  - line of best fit has positive slope
- Inverse Relation
  - line of best fit has negative slope
- Deterministic (Functional) Relation
  - "100% pure" relation between the pair of variables
  - there is no scatter with respect to the line of best fit, so the value of Y can be determined exactly (without error) based on the value of X
- Stochastic (Statistical, Random) Relation
  - a "less than perfect" relation between the pair of variables
  - since variables other than X impact Y, there is scatter with respect to the line of best fit, and there will be error when x is used to predict y
- How would you characterize the apparent relation between Exam Score and Hours Studied?

5. Simple Linear Regression Model
- Population Linear Regression Equation
  \[ y = \beta_0 + \beta_1 x + \varepsilon \]
  - $\varepsilon$ represents the combined effects of other variables and is assumed to have a mean of 0 and a variance of $\sigma^2$
- Sample Linear Regression Equation
  \[ \hat{y} = b_0 + b_1 x \]

6. Least Squares Method: Line Of Best Fit
- Provides the best-fitting line in the sense that it has the minimum amount of squared deviation between each observed value and the corresponding point on the regression line
- Minimizes the sum of squared residuals in order to:
  - prevent (+) and (-) errors from cancelling
  - draw added attention to any large errors
  - prefer several small errors in order to avoid large errors
- The sample regression line won't perfectly fit the sample points… there will be errors in fit. Why?
  \[ \text{error in fit} = \text{residual} = (y - \hat{y}) \]

7. Least Squares Method: Line Of Best Fit
- Exam Score vs. Hrs Studied
  - the sample regression equation is: ______________________
  - compute the predicted values
  - compute the residuals and squared residuals
  \[ \text{slope} = b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} \qquad y\text{-intercept} = b_0 = \bar{y} - b_1 \bar{x} \]
- Properties of the Least Squares regression equation
  - 1) $b_0$ and $b_1$ are unbiased estimators of $\beta_0$ and $\beta_1$
  - 2) the line passes through the point $(\bar{x}, \bar{y})$
  - 3) the sum of the residuals is zero: $\sum (y - \hat{y}) = 0$
  - 4) the sum of the squared residuals is minimized: $\sum (y - \hat{y})^2 = \text{minimum}$

8. Conditional Distribution Of y
- Figure 12.8 on page 511
- Why is y variable at any given value x?
- The distribution of y is assumed Normal with mean $= \hat{y}$
- The regression equation is the line that connects the mean value of y at each value of x

9. Correlation Analysis Concepts
- Measures the strength of the linear relation between two variables
- If you intend to use X to predict Y, how strongly related are they?
- The slope of the sample regression equation was +1.4965, so these variables seem to "move together"
- The mean exam score was 76 and the variation among student scores was s = 11.2504
  - some of the variation in scores can be explained by taking into account hours studied

10. Strength of Relationship
[Figure: nine scatter plots, labeled as follows]
- r = .98, r^2 = .96
- r = .78, r^2 = .61
- r = .34, r^2 = .12
- r = .12, r^2 = .01
- r = -.01, r^2 = .00
- r = -.11, r^2 = .01
- r = -.33, r^2 = .12
- r = -.64, r^2 = .41
- r = -.99, r^2 = .98

11. Correlation Analysis
[Figure: scatter plot; axis ticks run from 35 to 95 and from 0 to 35]
TOTAL VARIATION IN SCORES = EXPLAINABLE BY HOURS STUDIED + UNEXPLAINABLE BY HOURS STUDIED
SST = SSR + SSE
\[ \sum (y - \bar{y})^2 = \sum (\hat{y} - \bar{y})^2 + \sum (y - \hat{y})^2 \]
With $\bar{y} = 76$: (92 - 76) = (88 - 76) + (92 - 88)

12. Correlation Analysis
- Exam Score vs. Hours Studied
  - SST = __________ SSR = __________ SSE = __________
TOTAL VARIATION IN SCORES = EXPLAINABLE BY HOURS STUDIED + UNEXPLAINABLE BY HOURS STUDIED
SST = SSR + SSE
\[ \sum (y - \bar{y})^2 = \sum (\hat{y} - \bar{y})^2 + \sum (y - \hat{y})^2 \]

13. Coefficient Of Determination
- Measures the proportion of variation in variable y that is explained by variable x
- Indicates how well the sample regression line fits the sample data
- $\rho^2$ is estimated by $r^2$
- $0 \le r^2 \le 1$
  \[ r^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{SSR}{SST} = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} \]
- Exam Score vs. Hrs Studied

14. Coefficient Of Correlation
- $\rho$ is estimated by $r$
- $-1 \le r \le +1$
  \[ r = (\text{sign of } b_1)\sqrt{r^2} \]
- Interpretation: There is a (strength) (direct or inverse) correlation between (variable X) and (variable Y)
- Exam Score vs. Hrs Studied

Value of r    Strength of correlation
.9 to 1       very high
.7 to .9      high
.5 to .7      moderate
.3 to .5      weak
.0 to .3      little if any

15. Coefficient Of Correlation
- When working with multiple variables, it is common to obtain the correlation between each pair of variables
  - a triangular correlation matrix
- Can investigate whether or not the potential independent variables are truly independent of one another

            Score    Hours    GPA
Hours       0.862
GPA         0.489    0.566
Absences   -0.343   -0.234    0.028

16. Limitations Of Regression Analysis
- Regression/Correlation cannot prove cause-and-effect relationships
  - Brightman article
- Don't use the regression model to predict beyond the range of observed X-values

17. Mean Square Error & Standard Error of Estimate
- Mean Square Error
  - Measures the amount of scatter around the regression line
  - Serves as an estimate of $\sigma^2$
  \[ \text{M.S.E.} = \frac{SSE}{n-2} = \frac{\sum (y - \hat{y})^2}{n-2} \]
- Standard Error of Estimate
  - Square root of MSE
  - Serves as an estimate of $\sigma$
  - Used for inference regarding the regression line
    - hypothesis tests
    - interval estimates
  \[ s_{est} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum (y - \hat{y})^2}{n-2}} \]
- Exam Score vs. Hrs Studied

18. t-Test for Significance of the Slope
- $b_1$ estimates $\beta_1$
- $H_0$: $\beta_1 = 0$ (there is no relation between the two variables)
- $H_A$: $\beta_1 \ne 0$ (there is a relation between the two variables)
- test statistic = $b_1$, whose sampling distribution follows $t_{n-2}$
- Standard Error of the Slope
  - measures ROSE when $b_1$ is used to estimate $\beta_1$
  \[ s_{b_1} = \sqrt{\frac{\text{M.S.E.}}{\sum (x - \bar{x})^2}} = \frac{s_{est}}{\sqrt{\sum (x - \bar{x})^2}} \]
- Exam Score vs. Hrs Studied

19. Interval Estimation In Regression Analysis
- What score would you predict for students who study 30 hours?
- We’ve
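
The least squares formulas for the slope and intercept (slide 7) can be sketched in a few lines of Python. The `hours` and `scores` lists below are hypothetical values made up for illustration; they are not the course's exam data set, and the helper name `least_squares` is likewise an assumption. The last lines plug 30 hours into the simple regression equation quoted on slide 3 to get a point prediction.

```python
# Hypothetical data for illustration only (NOT the course's exam data set).
hours = [5, 10, 15, 20, 25]
scores = [52, 64, 70, 68, 81]


def least_squares(x, y):
    """Return (b0, b1) for the sample regression line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(hours, scores)
predicted = [b0 + b1 * h for h in hours]
residuals = [s - p for s, p in zip(scores, predicted)]

print(f"y_hat = {b0:.4f} + {b1:.4f} * hours")
# Property 3: the residuals sum to zero (up to floating-point rounding).
print("sum of residuals is ~0:", abs(sum(residuals)) < 1e-9)

# Point prediction at 30 hours using the equation quoted on slide 3.
slide_prediction = 40.0816 + 1.4966 * 30
print(f"predicted score at 30 hours: {slide_prediction:.4f}")
```

Because the intercept is defined as $b_0 = \bar{y} - b_1 \bar{x}$, the fitted line necessarily passes through $(\bar{x}, \bar{y})$, which is property 2 on the same slide.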

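
The variation decomposition SST = SSR + SSE and the coefficients $r^2$ and $r$ (slides 11 to 14) can be checked numerically. This is a minimal sketch on the same kind of hypothetical data (made up here, not the course's data set); the final lines confirm that the square root of the $r^2 = .7432$ quoted on slide 3 agrees with the Score/Hours entry of 0.862 in the correlation matrix on slide 15.

```python
import math

# Hypothetical data for illustration only (NOT the course's exam data set).
hours = [5, 10, 15, 20, 25]
scores = [52, 64, 70, 68, 81]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n

# Least squares slope and intercept.
sxx = sum((x - x_bar) ** 2 for x in hours)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in hours]

# Total, explained, and unexplained variation.
sst = sum((y - y_bar) ** 2 for y in scores)
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((y - yh) ** 2 for y, yh in zip(scores, y_hat))

r_squared = ssr / sst
# r takes the sign of the slope b1.
r = math.copysign(math.sqrt(r_squared), b1)

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"r^2 = {r_squared:.4f}, r = {r:.4f}")

# Consistency check on the figures quoted in the slides.
slide_r = math.sqrt(0.7432)
print(f"sqrt(.7432) = {slide_r:.3f}")
```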

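
The mean square error, standard error of estimate, and standard error of the slope (slides 17 and 18) follow directly from SSE. The sketch below again uses hypothetical data (not the course's data set). The ratio $t = b_1 / s_{b_1}$ is the standard textbook form of the slope test statistic under $H_0: \beta_1 = 0$; it is assumed here, since the slide's own formula for the statistic did not survive extraction.

```python
import math

# Hypothetical data for illustration only (NOT the course's exam data set).
hours = [5, 10, 15, 20, 25]
scores = [52, 64, 70, 68, 81]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n

sxx = sum((x - x_bar) ** 2 for x in hours)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Sum of squared residuals around the fitted line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(hours, scores))

mse = sse / (n - 2)            # estimates sigma^2
s_est = math.sqrt(mse)         # standard error of estimate, estimates sigma
s_b1 = s_est / math.sqrt(sxx)  # standard error of the slope

# Assumed standard form of the test statistic for H0: beta1 = 0.
t_stat = b1 / s_b1
df = n - 2

print(f"MSE = {mse:.4f}, s_est = {s_est:.4f}, s_b1 = {s_b1:.4f}")
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The computed t would then be compared with a critical value from a t table with n - 2 degrees of freedom.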