Exam 3: Chapters 8, 9, 10

Chapter 8

Are our models typically additive or multiplicative?
o Typically additive: Y = α + β1X1 + β2X2
o Some independent variables (X1, X2) cause changes in some dependent variable (Y)

What are the different parts of a bivariate model (alpha, beta, etc.)?
o Y = α + βX
o Y = dependent variable
o X = independent variable
o α = constant (also the y-intercept)
o β = coefficient (slope): a one-unit change in X results in a β change in the expected value of Y
o In words: Y = constant + (coefficient × X)

What does the constant represent? What does the coefficient represent? How do we interpret a coefficient?
o Regression tells us how much X and Y are related. A correlation coefficient tells us that they are related, but not by how much; regression does.

What does OLS stand for? What are the OLS assumptions for our dependent variable? What is the linearity assumption? What is the independent-observation assumption, and how does it influence standard errors?
o OLS: the method of minimizing the sum of the squared errors is called ordinary least squares (OLS)
o We may use OLS if:
  - our dependent variable is continuous and unbounded
  - our dependent variable is normally distributed
o Linearity assumption: a straight line adequately represents the relationship in the population. Fitting a linear model to a nonlinear relationship results in biased estimates.
o Independent-observation assumption: the values of the dependent variable are independent of each other. Time-series data, panel data, and clustered data often do not satisfy this condition. The estimates remain unbiased, but the standard errors are typically biased downwards (they seem lower than they really are), which means we are more likely to mistakenly reject the null hypothesis.

Do we typically care more about our alpha or beta estimates?
o For our purposes we are less concerned about α
o α is not directly related to our hypothesis test; it is useful for calculating predictions of the dependent variable, but that is not our goal
o Our goal is to
determine whether the independent variable affects the dependent variable
o And that is related to the uncertainty around β

How do we read a regression table? How do we test for statistical significance of a coefficient? How do we calculate degrees of freedom?
o A regression table reports each coefficient along with its standard error
o Statistical significance: t = coefficient / standard error of the coefficient
o Degrees of freedom: d.f. = n − (number of parameters), where the parameters are 1 constant plus the number of coefficients. For bivariate regression, d.f. = n − 2

What is the null hypothesis?
o Non-directional hypothesis: the null hypothesis is β = 0 (there is no slope, i.e., no relationship)
o Directional hypothesis: the null hypothesis is either that β = 0 or that the relationship runs in the other direction

How do we calculate predicted values?
o The predicted value is the expected value of the dependent variable given a specified value of the independent variable: fill the α, β, and X values into Ŷ = α + βX
o We can use predicted values to help understand the size of effects: plug in the estimated α and β, then pick a value or two of X (like the highest and lowest points of X)
o Example (George H. W. Bush approval):
  - Ŷ = α (36.58) + β (0.255) × X (80) = 56.98 (predicted value of the DV at the highest approval, 80)
  - Ŷ = 36.58 + 0.255 × 35 = 45.5 (at the lowest approval, 35)
o So one of the largest popularity swings changes partisanship by about 11.5 according to our model
o Remember, there is uncertainty around these predictions

What are outliers? How do they relate to leverage and influence?
o Outlier: a case that has an unusual Y value given its X value
o Leverage: a case that has an unusual X value; leverage is not always bad
o Influence: a case that is both an outlier and has leverage is said to influence the regression line. It affects both the constant and the slope: it has both an odd Y and an odd X value

What are residuals?
o The difference between the actual value and the predicted value is called the residual: ûi = Yi − Ŷi
o Find residuals by plugging all data points into the equation Ŷ = α + βX. Find the
predicted values for all points by doing the above. After all data points are plugged in, subtract the predicted values from the actual Y values: Yi − Ŷi = residuals
o The smaller the residuals, the better the fit

What are R-squared and Root Mean Squared Error (root MSE)?
o R² is the proportion of the variance in the dependent variable that our model explains; it is another goodness-of-fit test
  - It ranges from 0 to 1: the closer our R² is to 1, the more of the variation our model explains and the better our model is at predicting the dependent variable
o R² = model sum of squares / total sum of squares, i.e., the deviations from the mean predicted by our model over the total deviations from the mean
o Root MSE: the Root Mean Squared Error is a measure of the typical deviation from the regression line; it measures the residuals and how well the model predicted the dependent variable
  - Root MSE = sqrt( Σûi² / (n − k) ), where the ûi are the residuals (Yi − Ŷi, actual minus predicted)
  - k equals the number of parameters; for bivariate regression, k = 2 (this is different from the book)
o Example: on average, the model we estimated is off by 1,018.02 in predicting per capita income in a state
o Is that good?
  - Root MSE is in the metric of the dependent variable
  - The dependent variable ranges from about 24,000 to 30,000; is a difference of 1,018.02 small given that range?
  - The standard deviation of the dependent variable is 2,692.04

Chapter 9

How does multiple regression help us deal with spuriousness?
o Can we infer that our independent variable causes our dependent variable? We might have multiple hypotheses
o To infer causation we must rule out alternative explanations
o Spuriousness: is there another factor that you are not considering?

What are the different parts of a multiple regression model (X, Z, Y)?
o Y = α + β1X + β2Z, where Y = dependent variable, X = independent variable, and Z = a control for spuriousness
o Both X and Z are independent variables
o Instead of a line through the points, it is drawing a plane through the points
in multiple dimensions
o Each β in the equation tells us the partial effect of that independent variable
o Perform separate t-tests to test for statistical significance of each independent variable

How do we interpret a coefficient in a multiple regression model? How is this different from a bivariate regression?
o The key to note is that the β in the bivariate regression model will be different from β1 in the multiple regression model
o It will be very
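The bivariate OLS mechanics reviewed above (slope, intercept, t-statistic, degrees of freedom) can be sketched in Python. The data below are made up for illustration; only the formulas (minimizing squared errors, t = coefficient / standard error, d.f. = n − 2) come from the notes.

```python
import numpy as np

# Hypothetical data for illustration (not from the chapter):
# X could be approval ratings, Y some outcome measured on a similar scale.
x = np.array([35.0, 42.0, 50.0, 58.0, 66.0, 74.0, 80.0])
y = np.array([45.0, 48.0, 49.0, 52.0, 53.0, 55.0, 57.0])
n = len(x)

# OLS slope and intercept: the line that minimizes the sum of squared errors.
beta = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha = y.mean() - beta * x.mean()

# Residuals and the standard error of the slope.
residuals = y - (alpha + beta * x)
df = n - 2                                # n minus 2 parameters (constant + 1 coefficient)
sigma2 = np.sum(residuals ** 2) / df      # residual variance
se_beta = np.sqrt(sigma2 / np.sum((x - x.mean()) ** 2))

# Test statistic against the null hypothesis beta = 0.
t_stat = beta / se_beta
print(alpha, beta, se_beta, t_stat, df)
```

A large |t| (roughly above 2 for typical sample sizes) lets us reject the null hypothesis that β = 0.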
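The predicted-value calculation from the George H. W. Bush example (α = 36.58, β = 0.255, with approval ranging from 35 to 80) works out like this:

```python
# Estimates taken from the notes' approval example.
alpha, beta = 36.58, 0.255

def predicted(x):
    """Expected value of Y given X: Y-hat = alpha + beta * X."""
    return alpha + beta * x

high = predicted(80)   # highest approval: 36.58 + 0.255 * 80 = 56.98
low = predicted(35)    # lowest approval:  36.58 + 0.255 * 35 = 45.505
swing = high - low     # about 11.5: the effect of one of the largest popularity swings
print(high, low, swing)
```

As the notes say, these predictions carry uncertainty; the point estimates alone do not convey it.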
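The fit statistics above (residuals, R² as model sum of squares over total sum of squares, and root MSE with k = 2 for a bivariate regression) can be computed by hand. The five data points below are invented for illustration:

```python
import numpy as np

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.5, 7.5, 11.5, 13.5, 17.0])

# OLS fit (so the fit statistics below are meaningful).
beta = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha = y.mean() - beta * x.mean()

y_hat = alpha + beta * x          # predicted values for all points
residuals = y - y_hat             # u_i = Y_i - Yhat_i

# R^2 = model sum of squares / total sum of squares.
tss = np.sum((y - y.mean()) ** 2)        # total deviations from the mean
mss = np.sum((y_hat - y.mean()) ** 2)    # deviations predicted by the model
r_squared = mss / tss

# Root MSE = sqrt( sum of squared residuals / (n - k) ),
# with k = 2 parameters for bivariate regression.
n, k = len(y), 2
root_mse = np.sqrt(np.sum(residuals ** 2) / (n - k))
print(r_squared, root_mse)
```

Note that root MSE is in the units of the dependent variable, which is why the notes compare the 1,018.02 figure to the range and standard deviation of per capita income.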
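The Chapter 9 point about spuriousness can be seen numerically: when a control Z drives both X and Y, the bivariate slope of Y on X differs from the partial effect β1 in the multiple regression. The simulated data and true coefficients below are assumptions for the demonstration:

```python
import numpy as np

# Simulated data: Y depends on X and on a control Z, and X is correlated with Z.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)            # X partly driven by Z
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)

# Fit Y = a + b1*X + b2*Z by least squares: a plane through the points.
design = np.column_stack([np.ones(n), x, z])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
a, b1, b2 = coefs

# Bivariate slope of Y on X alone, for comparison: it absorbs part of
# Z's effect, so it overstates the partial effect b1 here.
b_bivariate = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
print(b1, b_bivariate)
```

Controlling for Z moves the estimated effect of X toward its true partial effect (2.0 in this simulation), which is exactly why the bivariate β and the multiple-regression β1 differ.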