
CORNELL ECON 3120 - Regression

Type: Lecture Note
Pages: 2

Econ 3120 1st Edition
Lecture 9

Outline of Last Lecture
I. Confidence Intervals and t-Tests

Outline of Current Lecture
II. Regression

Over the next several lectures we’ll start by examining the relationship between two variables. Examples:

- the relationship between education and wages
- firm output and production inputs
- state cigarette taxes and cigarette consumption

In these relationships, we define the following:

- dependent variable: the variable to be explained (e.g., wages)
- independent variable: the variable that gives us information on the dependent variable (e.g., education), also called an explanatory variable

We typically denote a dependent variable as $y$ and an independent variable as $x$. To describe the relationship between the variables, we define some function:

$$y = f(x)$$

For our purposes, we’ll assume that this relationship is linear:

$$y = \beta_0 + \beta_1 x + u$$

where $\beta_0$ and $\beta_1$ are parameters which describe this relationship, and $u$ is an error term. Clearly, $x$ is not the only factor that affects $y$. The variable $u$ represents all of the other factors in determining $y$. For example, in the schooling-earnings relationship, all of the other factors that influence wages (e.g., experience, ability, location, etc.) go into $u$.

2 Assumptions

In order to analyze the relationship between $x$ and $y$ and to derive estimates for $\beta_0$ and $\beta_1$, we need to make a few assumptions. We first assume that the unconditional expectation of $u$ is 0:

$$E(u) = 0$$

We can always make this assumption, because we can simply shift the intercept $\beta_0$ so that it holds.

A key assumption in regression modeling is that $u$ is mean independent of $x$, that is, the expected value of $u$ does not depend on $x$:

$$E(u \mid x) = E(u) = 0$$

We call this the zero conditional mean assumption. Under this second assumption, we can write the conditional expectation function (CEF) of $y$ as:

$$E(y \mid x) = E(\beta_0 + \beta_1 x + u \mid x) = \beta_0 + \beta_1 x + E(u \mid x) = \beta_0 + \beta_1 x$$

From this equation, we have the primary interpretation of the parameter $\beta_1$: it represents the slope of the CEF with respect to $x$.
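The CEF result above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the parameter values, the normal error distribution, and the integer schooling levels are hypothetical choices, not part of the lecture. It draws $u$ independently of $x$ with mean zero, so the zero conditional mean assumption holds by construction, and then compares the average of $y$ within each value of $x$ to $\beta_0 + \beta_1 x$.

```python
import random
from collections import defaultdict

random.seed(3120)

# Hypothetical population parameters, chosen only for illustration.
BETA0, BETA1 = 2.0, 1.5

n = 90_000
schooling_levels = range(8, 17)  # x takes the integer values 8, 9, ..., 16

# Draw x, then draw u independently of x with mean zero,
# so that E(u|x) = 0 holds by construction.
x = [random.choice(schooling_levels) for _ in range(n)]
u = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [BETA0 + BETA1 * xi + ui for xi, ui in zip(x, u)]

# Estimate E(y|x) by averaging y within each value of x.
groups = defaultdict(list)
for xi, yi in zip(x, y):
    groups[xi].append(yi)

cef_gaps = {}
for level in sorted(groups):
    sample_mean = sum(groups[level]) / len(groups[level])
    cef_value = BETA0 + BETA1 * level
    cef_gaps[level] = sample_mean - cef_value
    print(f"x = {level:2d}  mean of y = {sample_mean:7.3f}  "
          f"beta0 + beta1*x = {cef_value:7.3f}")

# Largest distance between a group average and the linear CEF.
max_gap = max(abs(gap) for gap in cef_gaps.values())
print("largest gap:", max_gap)
```

With roughly 10,000 draws per schooling level, each group average should sit close to the line $\beta_0 + \beta_1 x$, and moving from one level to the next should raise the group average by about $\beta_1$.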
In other words, $\beta_1$ represents the change in the expected value of $y$ (conditional on $x$) for a one-unit change in $x$.

3 Estimating $\beta_0$ and $\beta_1$ using ordinary least squares (OLS) regression

3.1 Preliminaries

Before we derive OLS estimators, there are two quick things to note. First, note that the sample covariance of $x$ and $y$ is defined as

$$\widehat{\mathrm{Cov}}(x, y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

Similar to the sample variance, this can be shown to be an unbiased (and efficient) estimator for the covariance of $x$ and $y$.

Second, we need to know that

$$\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i (y_i - \bar{y}) = \sum y_i (x_i - \bar{x})$$

and

$$\sum (x_i - \bar{x})(x_i - \bar{x}) = \sum x_i (x_i - \bar{x})$$

The proof is straightforward. Because $\sum (y_i - \bar{y}) = \sum y_i - n\bar{y} = 0$,

$$\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i (y_i - \bar{y}) - \bar{x} \sum (y_i - \bar{y}) = \sum x_i (y_i - \bar{y})$$

Swapping the roles of $x$ and $y$ gives the second equality, and setting $y_i = x_i$ gives the last identity.

3.2 Derivation of OLS Estimates

We start with our basic relationship:

$$y = \beta_0 + \beta_1 x + u \tag{1}$$

Suppose we have a random sample of observations on $x$ and $y$ denoted as $\{(x_i, y_i) : i = 1, \dots, n\}$. Since all of the data have the same functional relationship, we can write (1) observation-wise as

$$y_i = \beta_0 + \beta_1 x_i + u_i$$

Recall the assumption above:

$$E(u) = 0 \tag{2}$$

We can use the zero conditional mean assumption above to write

$$\mathrm{Cov}(x, u) = E(xu) = 0 \tag{3}$$

The first equality holds because $\mathrm{Cov}(x, u) = E(xu) - E(x)E(u)$ and $E(u) = 0$; the second holds because $E(xu) = E[x \, E(u \mid x)] = 0$.

We can then substitute $u = y - \beta_0 - \beta_1 x$ in equations (2) and (3) to yield:

$$E(y - \beta_0 - \beta_1 x) = 0 \tag{4}$$

$$E[x(y - \beta_0 - \beta_1 x)] = 0 \tag{5}$$

Define our estimates of $\beta_0$ and $\beta_1$ as $\hat{\beta}_0$ and $\hat{\beta}_1$.
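The two population conditions, $E(y - \beta_0 - \beta_1 x) = 0$ and $E[x(y - \beta_0 - \beta_1 x)] = 0$, can be illustrated by simulation. The following is a minimal sketch under assumptions that are not in the lecture: the parameter values, the normal distributions for $x$ and $u$, and the helper name `sample_moments` are all hypothetical. It averages both expressions over a large simulated sample, once at the true parameters and once at a deliberately wrong slope.

```python
import random

random.seed(3120)

# Hypothetical population parameters, chosen only for illustration.
BETA0, BETA1 = 2.0, 1.5

n = 100_000
# u is drawn independently of x with mean zero, so E(u|x) = 0 by construction.
x = [random.gauss(12.0, 2.0) for _ in range(n)]
u = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [BETA0 + BETA1 * xi + ui for xi, ui in zip(x, u)]


def sample_moments(b0, b1):
    """Average (y - b0 - b1*x) and x*(y - b0 - b1*x) over the sample."""
    errors = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    first = sum(errors) / n
    second = sum(xi * ei for xi, ei in zip(x, errors)) / n
    return first, second


# At the true parameters both averages should be close to zero.
m4_true, m5_true = sample_moments(BETA0, BETA1)

# At a wrong slope the same averages move far away from zero.
m4_wrong, m5_wrong = sample_moments(BETA0, BETA1 + 0.3)

print("true parameters: ", m4_true, m5_true)
print("wrong slope:     ", m4_wrong, m5_wrong)
```

Only parameter values near the true ones bring both averages near zero at the same time, which is what makes these two conditions usable for pinning down two unknown parameters.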
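Deriving these estimates involves using the sample analogs of equations (4) and (5):

$$\frac{1}{n} \sum (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \tag{6}$$

$$\frac{1}{n} \sum x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \tag{7}$$

Equation (6) can be written as $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}$, so

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \tag{8}$$

Equation (7) can be written as:

$$\sum x_i y_i - \sum x_i \hat{\beta}_0 - \sum \hat{\beta}_1 x_i^2 = 0 \tag{9}$$

Substituting $\hat{\beta}_0$ from (8) into (9), we have

$$\sum x_i y_i - \sum x_i (\bar{y} - \hat{\beta}_1 \bar{x}) - \sum \hat{\beta}_1 x_i^2 = 0$$

$$\sum x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum x_i (x_i - \bar{x})$$

$$\hat{\beta}_1 = \frac{\sum x_i (y_i - \bar{y})}{\sum x_i (x_i - \bar{x})} = \frac{\frac{1}{n-1} \sum (x_i - \bar{x})(y_i - \bar{y})}{\frac{1}{n-1} \sum (x_i - \bar{x})^2} = \frac{\widehat{\mathrm{Cov}}(x, y)}{\widehat{\mathrm{Var}}(x)}$$

and

$$\hat{\beta}_0 = \bar{y} - \frac{\widehat{\mathrm{Cov}}(x, y)}{\widehat{\mathrm{Var}}(x)} \bar{x}$$

3.3 Predicted (fitted) values

Based on the OLS estimates, the predicted values of $y$ are given by

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$

In other words, we can predict $y$ for each value of $x$ in our sample.

3.4 Residuals

The OLS residual is defined as

$$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$$

3.5 Why it’s called Ordinary Least Squares

It turns out that the estimators we derived above for $\beta_0$ and $\beta_1$ are the same as those found by minimizing the sum of squared residuals: $\sum \hat{u}_i^2 = \sum (y_i - \hat{\beta}_0 - \dots$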
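The formulas in Sections 3.2 to 3.5 can be traced through on a small data set. The sketch below is an illustration, not the lecture's own code: the schooling and wage numbers are made up, and the function names `ols_fit` and `ssr` are hypothetical. It computes the slope as sample covariance over sample variance, forms fitted values and residuals, checks that the residuals satisfy the two sample conditions (6) and (7), and compares the sum of squared residuals, taken here to be $\sum (y_i - b_0 - b_1 x_i)^2$ as implied by the residual definition in Section 3.4, against a few nearby parameter values.

```python
def ols_fit(x, y):
    """Return (b0, b1) with b1 = sample Cov(x, y) / sample Var(x)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    b1 = cov_xy / var_x
    b0 = y_bar - b1 * x_bar
    return b0, b1


def ssr(x, y, b0, b1):
    """Sum of squared residuals for a candidate intercept and slope."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Made-up data for illustration: years of schooling and hourly wages.
x = [8, 10, 12, 12, 14, 16, 16, 18]
y = [9.5, 11.0, 14.5, 13.0, 17.5, 19.0, 22.0, 24.5]

b0, b1 = ols_fit(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Sample analogs (6) and (7): both should be zero up to rounding error.
moment_6 = sum(residuals) / len(x)
moment_7 = sum(xi * ui for xi, ui in zip(x, residuals)) / len(x)

# Compare the SSR at the OLS estimates with the SSR at perturbed values.
ssr_ols = ssr(x, y, b0, b1)
ssr_nearby = [
    ssr(x, y, b0 + d0, b1 + d1)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.1, 0.0, 0.1)
    if (d0, d1) != (0.0, 0.0)
]

print("b0, b1:", b0, b1)
print("moment (6), moment (7):", moment_6, moment_7)
print("SSR at OLS:", ssr_ols, " smallest nearby SSR:", min(ssr_nearby))
```

The $1/(n-1)$ factors cancel in the ratio, so using $1/n$ in both the covariance and the variance would give the same slope. The perturbation check is a spot check on a handful of points rather than a proof that the estimates minimize the sum of squared residuals.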
