CMU ISM 95760 - Lecture 5. Regression Analysis: Basic Theory


School name Carnegie Mellon University

Course ISM 95760 - Decision Making Under Uncertainty

Pages 21



Lecture 5. Regression Analysis: Basic Theory
Prof. Edson Severnini (CMU), Applied Econometrics I

Ordinary Least Squares

- In Lecture 4 we introduced a regression equation such as
  $$Y_i = \alpha + \beta X_i + e_i$$
  - $\alpha$ is the intercept
  - $\beta$ is a slope
  - $e_i$ is the error term or residual
- We estimate these parameters with a method called ordinary least squares
- What does that mean?
  - Our estimators $\hat{\alpha}$ and $\hat{\beta}$ are chosen to minimize $\sum_{i=1}^{n} e_i^2$
  - We say our estimators minimize the residual sum of squares
- We'll show exactly how to do that in a few minutes; first we have a few concepts to cover

Conditional Expectation Functions

- Recall from the first week of class that $E[Y_i]$ is the expected value of the variable $Y_i$
- If $Y_i$ is related to a dummy variable $D_i$, we defined the conditional expectation
  - $E[Y_i \mid D_i = 0]$
  - $E[Y_i \mid D_i = 1]$
- $Y_i$ might similarly be related to a variable that can take several different values
  - $E[Y_i \mid X_i = x]$
  - We refer to the expected value of $Y_i$ "given that $X_i$ is some particular value $x$"
- The collection of conditional expectations $E[Y_i \mid X_i = x]$ over all possible values of $X_i$ is said to be the conditional expectation function (CEF)
- The CEF is denoted simply as $E[Y_i \mid X_i]$

Conditional Expectation Function: An Example

[Figure 2.1: The CEF and the regression line. Vertical axis: log weekly earnings (5.8 to 7.2). Horizontal axis: years of schooling (0 to 20). Notes: This figure shows the conditional expectation function (CEF) of log weekly wages given years of education, and the line generated by regressing log weekly wages on years of education (plotted as a broken line).]

From Mastering 'Metrics: The Path from Cause to Effect. © 2015 Princeton University Press. Used by permission. All rights reserved.
Figure from Angrist and Pischke.

Conditional Expectation Functions

- More generally we might have a conditional expectation over many different conditioning variables
  $$E[Y_i \mid X_{1i} = x_1, \ldots, X_{Ki} = x_K]$$
- The corresponding CEF is denoted
  $$E[Y_i \mid X_{1i}, \ldots, X_{Ki}]$$
- For example, we might be interested in earnings conditional on schooling, conditioning also on race, sex, family background, and other factors

The Conditional Expectation Function and Regression

- Angrist and Pischke work with the regression
  $$\ln Y_i = \alpha + \beta P_i + \sum_{j=1}^{150} \gamma_j GROUP_{ji} + \delta_1 SAT_i + \delta_2 \ln PI_i + e_i$$
- If the underlying CEF is linear we can write it out like this
  $$E[\ln Y_i \mid P_i, GROUP_i, SAT_i, \ln PI_i] = \alpha + \beta P_i + \sum_{j=1}^{150} \gamma_j GROUP_{ji} + \delta_1 SAT_i + \delta_2 \ln PI_i$$
- Regression is a way of estimating the parameters in a linear CEF

The Conditional Expectation Function and Regression

- When we have a linear CEF like this one
  $$E[\ln Y_i \mid P_i, GROUP_i, SAT_i, \ln PI_i] = \alpha + \beta P_i + \sum_{j=1}^{150} \gamma_j GROUP_{ji} + \delta_1 SAT_i + \delta_2 \ln PI_i$$
  regression is an estimation strategy that, as Angrist and Pischke say,
  - "matches students by value of $GROUP_i$, $SAT_i$, and $\ln PI_i$"
  - "compares the average earnings of matched students who went to private ($P_i = 1$) and public ($P_i = 0$) schools for each possible combination of the conditioning variables"
  - "produces a single average by averaging all of these cell-specific contrasts"
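The three steps quoted above can be sketched on simulated data. This is a minimal illustration, not the lecture's or the book's own computation: the data are made up, the only conditioning variable is a four-valued `group` (standing in for the GROUP dummies; SAT and parental income are omitted), and the names `private`, `log_earn`, and `beta_true` are hypothetical. The weights used below (cell size times the within-cell variance of the private-school dummy) are a standard property of OLS with a full set of cell dummies; they are not stated in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_groups, beta_true = 5000, 4, 0.10  # all values are illustrative assumptions

# Simulated cells: the share attending a private school differs by cell.
group = rng.integers(0, n_groups, size=n)
p_private = 0.2 + 0.15 * group
private = (rng.random(n) < p_private).astype(float)
log_earn = 1.0 + beta_true * private + 0.3 * group + rng.normal(0.0, 0.5, size=n)

# Steps 1 and 2: match on cell, then compare average earnings within each cell.
contrasts = np.empty(n_groups)
weights = np.empty(n_groups)
for g in range(n_groups):
    in_cell = group == g
    share = private[in_cell].mean()
    contrasts[g] = (log_earn[in_cell & (private == 1)].mean()
                    - log_earn[in_cell & (private == 0)].mean())
    # Weight: cell size times within-cell variance of the private dummy.
    weights[g] = in_cell.sum() * share * (1.0 - share)

# Step 3: collapse the cell-specific contrasts into a single weighted average.
weighted_avg = float(np.sum(weights * contrasts) / np.sum(weights))

# OLS of log earnings on the private dummy plus one dummy per cell.
dummies = (group[:, None] == np.arange(n_groups)).astype(float)
X = np.column_stack([private, dummies])
coef, *_ = np.linalg.lstsq(X, log_earn, rcond=None)
beta_ols = float(coef[0])

print("cell contrasts:", np.round(contrasts, 3))
print("weighted average:", weighted_avg, "OLS coefficient:", beta_ols)
```

In this setup the OLS coefficient on `private` and the weighted average of cell contrasts agree up to floating-point error, and both sit close to the simulated effect because the effect was built to be the same in every cell.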
The Conditional Expectation Function and Regression

- To see this, recall our regression
  $$E[\ln Y_i \mid P_i, GROUP_i, SAT_i, \ln PI_i] = \alpha + \beta P_i + \sum_{j=1}^{150} \gamma_j GROUP_{ji} + \delta_1 SAT_i + \delta_2 \ln PI_i$$
- Notice that
  $$E[\ln Y_i \mid P_i = 1, GROUP_i, SAT_i, \ln PI_i] - E[\ln Y_i \mid P_i = 0, GROUP_i, SAT_i, \ln PI_i] = \beta$$
- We are working with a constant treatment effect assumption, which means that the treatment effect $\beta$ is the same in every cell
- Any weighted average of cell-specific estimates will be an unbiased estimate of $\beta$

The Conditional Expectation Function and Regression

- If the CEF is linear, a regression correctly estimates the parameters in the linear function
- If the CEF is not linear, the regression finds the best linear approximation of that non-linear function

[Figure 2.1: The CEF and the regression line. Vertical axis: log weekly earnings (5.8 to 7.2). Horizontal axis: years of schooling (0 to 20). Notes: This figure shows the conditional expectation function (CEF) of log weekly wages given years of education, and the line generated by regressing log weekly wages on years of education (plotted as a broken line).]

From Mastering 'Metrics: The Path from Cause to Effect. © 2015 Princeton University Press. Used by permission. All rights reserved.
Figure from Angrist and Pischke.
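The "best linear approximation" point can be checked numerically. The sketch below assumes a made-up non-linear CEF (log wages rising with the square root of schooling); the variable names and all numbers are illustrative and do not come from the figure's data. It fits one line to the individual observations and another to the estimated CEF (the cell means), weighting each cell by its number of observations.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000

# Hypothetical data: schooling takes integer values 0..20, and the CEF is non-linear.
schooling = rng.integers(0, 21, size=n)
log_wage = 5.8 + 0.3 * np.sqrt(schooling) + rng.normal(0.0, 0.4, size=n)

# Estimated CEF: the average log wage at each observed value of schooling.
values, counts = np.unique(schooling, return_counts=True)
cef = np.array([log_wage[schooling == v].mean() for v in values])

# Line fitted to the individual observations.
slope_micro, intercept_micro = np.polyfit(schooling, log_wage, 1)

# Line fitted to the CEF itself, weighting each cell by its size.
# np.polyfit applies w to the unsquared residuals, hence the square root.
slope_cef, intercept_cef = np.polyfit(values, cef, 1, w=np.sqrt(counts))

print("fit on individuals:", slope_micro, intercept_micro)
print("fit on cell means: ", slope_cef, intercept_cef)
```

The two fits coincide up to floating-point error: regressing on the individual data yields the same line as fitting the cell means with cell-size weights, which is one way to read the statement that regression approximates the CEF.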
Ordinary Least Squares (OLS) Regression

- Let's get back to the mechanics of OLS
- We're going to keep things simple by working with one regressor
  $$Y_i = \alpha + \beta X_i + e_i$$
  - $\alpha$ is the intercept
  - $\beta$ is a slope
  - $e_i$ is the error term or residual
- OLS means we are forming estimators $\hat{\alpha}$ and $\hat{\beta}$ that minimize the residual sum of squares
  $$RSS = \sum_{i=1}^{n} e_i^2$$

Ordinary Least Squares

- Let's try the case where there is just an intercept
  $$Y_i = \alpha + e_i$$
- Now ordinary least
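The text stops before the minimization is worked through, so what follows is only a minimal sketch built on the standard textbook closed-form solution for one regressor (slope equal to the sample covariance of X and Y over the sample variance of X, intercept equal to the mean of Y minus the slope times the mean of X). It is not the lecture's own derivation. The function names `rss` and `ols_simple` and the simulated data are assumptions made for illustration.

```python
import numpy as np

def rss(alpha, beta, x, y):
    """Residual sum of squares for the line alpha + beta * x."""
    resid = y - alpha - beta * x
    return float(np.sum(resid ** 2))

def ols_simple(x, y):
    """Closed-form OLS estimates for Y_i = alpha + beta * X_i + e_i."""
    x_bar, y_bar = x.mean(), y.mean()
    beta_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    alpha_hat = y_bar - beta_hat * x_bar
    return float(alpha_hat), float(beta_hat)

# Simulated data; the coefficients are arbitrary illustrative values.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 20.0, size=200)
y = 5.8 + 0.07 * x + rng.normal(0.0, 0.3, size=200)

alpha_hat, beta_hat = ols_simple(x, y)
best = rss(alpha_hat, beta_hat, x, y)

# Nudging either estimate away from the OLS values should raise the RSS.
perturbed = [rss(alpha_hat + da, beta_hat + db, x, y)
             for da in (-0.05, 0.05) for db in (-0.01, 0.01)]

# Intercept-only case: with no regressor, the candidate minimizer is the sample mean.
alpha_only = float(y.mean())

print("alpha_hat, beta_hat:", alpha_hat, beta_hat)
print("RSS at OLS:", best, "min perturbed RSS:", min(perturbed))
print("intercept-only estimate:", alpha_only)
```

Comparing the RSS at the estimates with the RSS at nearby values is a quick sanity check that the formulas do pick out a minimum; it does not replace the first-order-condition argument.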
