## Regression

- Lecture number: 9
- Pages: 2
- Type: Lecture Note
- School: Cornell University
- Course: Econ 3120 - Applied Econometrics
- Edition: 1

Lecture 9

**Outline of Last Lecture**

I. Confidence Intervals and T tests

**Outline of Current Lecture**

II. Regression

Over the next several lectures, we'll start by examining the relationship between two variables. Examples:

- the relationship between education and wages
- firm output and production inputs
- state cigarette taxes and cigarette consumption

In these relationships, we define the following:

- dependent variable: the variable to be explained (e.g., wages)
- independent variable: the variable that gives us information on the dependent variable (e.g., education), also called an explanatory variable

We typically denote a dependent variable as $y$ and an independent variable as $x$. To describe the relationship between the variables, we define some function:

$$y = f(x)$$

For our purposes, we'll assume that this relationship is linear:

$$y = \beta_0 + \beta_1 x + u$$

where $\beta_0$ and $\beta_1$ are parameters that describe this relationship, and $u$ is an error term. Clearly, $x$ is not the only factor that affects $y$. The variable $u$ represents all of the other factors that determine $y$. For example, in the schooling-earnings relationship, all of the other factors that influence wages (e.g., experience, ability, location, etc.) go into $u$.

### 2 Assumptions

In order to analyze the relationship between $x$ and $y$ and to derive estimates for $\beta_0$ and $\beta_1$, we need to make a few assumptions. We first assume that the unconditional expectation of $u$ is 0:

$$E(u) = 0$$

We can always make this assumption as long as the model includes an intercept, because we can simply redefine $\beta_0$ to absorb any nonzero mean of $u$.

A key assumption in regression modeling is that $u$ is mean-independent of $x$, that is, the average value of $u$ does not depend on $x$:

$$E(u \mid x) = E(u) = 0$$

We call this the zero conditional mean assumption. Under this second assumption, we can write the conditional expectation function (CEF) of $y$ as:

$$E(y \mid x) = E(\beta_0 + \beta_1 x + u \mid x) = \beta_0 + \beta_1 x + E(u \mid x) = \beta_0 + \beta_1 x$$

From this equation, we have the primary interpretation of the parameter $\beta_1$: it represents the slope of the CEF with respect to $x$.
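The CEF result can be illustrated with a small simulation. This is only a sketch under stated assumptions: the parameter values, the sample size, the distribution of the error term, and the helper name `conditional_means` are all hypothetical choices made for illustration and are not part of the lecture. Because $u$ is drawn independently of $x$ with mean zero, the zero conditional mean assumption holds by construction, so the sample mean of $y$ at each value of $x$ should lie close to $\beta_0 + \beta_1 x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1 = 1.0, 0.5
n = 200_000

# x takes a few discrete values (8..16) so conditional means are easy to compute.
x = rng.integers(8, 17, size=n).astype(float)
# u is drawn independently of x with mean zero, so E(u|x) = 0 by construction.
u = rng.normal(0.0, 1.0, size=n)
y = beta0 + beta1 * x + u


def conditional_means(x, y):
    """Return the distinct x values and the sample mean of y at each one."""
    values = np.unique(x)
    means = np.array([y[x == v].mean() for v in values])
    return values, means


values, means = conditional_means(x, y)
cef = beta0 + beta1 * values          # theoretical CEF: beta0 + beta1 * x
max_gap = np.max(np.abs(means - cef)) # largest deviation from the CEF

print("largest gap between sample conditional means and the CEF:", max_gap)
```

If $u$ were instead generated so that its mean varied with $x$, the conditional means would no longer trace out the line $\beta_0 + \beta_1 x$, which is why the zero conditional mean assumption matters for interpreting $\beta_1$.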
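In other words, $\beta_1$ represents the change in the expected value of $y$ (conditional on $x$) for a one-unit change in $x$.

### 3 Estimating $\beta_0$ and $\beta_1$ using ordinary least squares (OLS) regression

#### 3.1 Preliminaries

Before we derive the OLS estimators, there are two quick things to note.

First, the sample covariance of $x$ and $y$ is defined as

$$\widehat{\mathrm{Cov}}(x, y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

Similar to the sample variance, this can be shown to be an unbiased (and efficient) estimator for the covariance of $x$ and $y$.

Second, we need to know that

$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} y_i (x_i - \bar{x})$$

and

$$\sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x}) = \sum_{i=1}^{n} x_i (x_i - \bar{x})$$

The proof is straightforward: since $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$,

$$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i (y_i - \bar{y}) - \bar{x} \sum_{i=1}^{n} (y_i - \bar{y}) = \sum_{i=1}^{n} x_i (y_i - \bar{y})$$

The remaining equalities follow by the same argument, using $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ (and replacing $y$ with $x$ for the second identity).

#### 3.2 Derivation of OLS Estimates

We start with our basic relationship:

$$y = \beta_0 + \beta_1 x + u \qquad (1)$$

Suppose we have a random sample of observations on $x$ and $y$ denoted as $\{(x_i, y_i) : i = 1, \dots, n\}$. Since all of the data have the same functional relationship, we can write (1) observation-wise as

$$y_i = \beta_0 + \beta_1 x_i + u_i$$

Recall the assumption above: E(u) = ...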
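The two preliminaries from Section 3.1 can also be checked numerically. The sketch below computes the sample covariance with the $1/(n-1)$ normalization and evaluates both sides of each summation identity. The data values and the function name `sample_cov` are made up for illustration; only the formulas come from the notes.

```python
import numpy as np


def sample_cov(x, y):
    """Sample covariance with the 1/(n-1) normalization used in the notes."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    return np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)


# Hypothetical toy data, made up for illustration.
x = np.array([10.0, 12.0, 12.0, 14.0, 16.0])
y = np.array([8.5, 11.0, 10.0, 14.5, 18.0])

dx = x - x.mean()  # deviations of x from its sample mean
dy = y - y.mean()  # deviations of y from its sample mean

# First identity: sum (xi - xbar)(yi - ybar) = sum xi (yi - ybar) = sum yi (xi - xbar)
cross = np.sum(dx * dy)
cross_via_x = np.sum(x * dy)
cross_via_y = np.sum(y * dx)

# Second identity: sum (xi - xbar)^2 = sum xi (xi - xbar)
ssx = np.sum(dx * dx)
ssx_via_x = np.sum(x * dx)

print("identity 1 holds:", np.isclose(cross, cross_via_x) and np.isclose(cross, cross_via_y))
print("identity 2 holds:", np.isclose(ssx, ssx_via_x))
print("sample covariance:", sample_cov(x, y))
```

Both identities hold because deviations from a sample mean sum to zero, so the term multiplied by $\bar{x}$ (or $\bar{y}$) drops out. As a cross-check, `np.cov(x, y)[0, 1]` uses the same $n-1$ normalization by default and should agree with `sample_cov(x, y)`.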
