POS 3713 Exam 4 Study Guide

Includes book notes and PowerPoint notes. It includes all of the information featured on the …

r measures the consistency and direction of association, not the slope.

The basic idea of two-variable regression is that we are fitting the best line through a scatter plot of data.
  o This line is defined by its slope and y-intercept and serves as the statistical model of reality.

The formula for a line is Y = mX + b.
  b = y-intercept
    o b is the predicted value of Y when X = 0.
  m = slope
    o For a one-unit increase in X, m is the corresponding rise in Y.
  Together, b and m are described as the line's parameters.

Population Regression Model

Population regression model: a theoretical formulation of the proposed linear relationship between at least one independent variable and a dependent variable. Its formula is

    Y_i = \alpha + \beta X_i + u_i

  o \alpha = y-intercept parameter
  o \beta = slope parameter
  o u_i = the stochastic (random) component of our dependent variable. We have this term because we do not expect all of our data points to line up perfectly on a straight line.

[Figure: two panels, both captioned "A world with ui" in this preview]

Sample Regression Model

    Y_i = \hat{\alpha} + \hat{\beta} X_i + \hat{u}_i

The values that define the line are the estimated systematic components of Y.
  o Y and X do not get hats because they are values for cases in the population that ended up in the sample. They are measured rather than estimated. We use them to estimate \alpha, \beta, and u_i.
  o In two-variable regression we use information from this model to make inferences about the unseen population regression model.
  o To distinguish between these two, we place hats over terms in the sample regression model that are estimates of terms from the unseen population regression model.
  o Because they have hats, we can describe \hat{\alpha} and \hat{\beta} as parameter estimates.

For the i-th X value, X_i, we use \hat{\alpha} and \hat{\beta} to calculate the predicted value of Y, which we call \hat{Y}_i:

    \hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i

  o This can also be written in terms of expectations:

    E(Y | X_i) = \hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i

  o This means that the expected value of Y given X_i (or \hat{Y}_i) is equal to our formula for the two-variable regression line.
  o We can now talk about each Y_i as having an estimated systematic component, \hat{Y}_i, and an estimated stochastic component (residual), \hat{u}_i. We can thus write our model as

    Y_i = \hat{Y}_i + \hat{u}_i

  o And we can rewrite this in terms of \hat{u}_i to get a better understanding of the estimated stochastic component:

    \hat{u}_i = Y_i - \hat{Y}_i

  o From this formula we can see that the estimated stochastic component is equal to the difference between the actual value of the DV and the predicted value of the DV from our two-variable regression model.

Which Line Fits Best

In the above figure, the data has a general pattern from lower left to upper right; therefore we know that our slope will be positive. So how do we decide which line best fits the data that we have?

Because we are interested in explaining our dependent variable, we want our residual values, which are the vertical distances between each Y_i and the corresponding \hat{Y}_i, to be as small as possible.
  o But because these vertical distances come in both positive and negative values, we cannot just add them up for each line and have a good summary of the fit between each line and our data.

Methods for assessing the fit of each line in which the positive and negative residuals do not cancel each other out:
  o 1. Add together the absolute values of the residuals for each line.
  o 2. Add together the squared values of the residuals for each line.
  o With either choice, we want to choose the line that has the smallest total value.

[Table: Measures of total residuals for three different lines]

Although the absolute value calculation is just as valid as the squared residual calculation, statisticians prefer the squared residual calculation.

Ordinary Least Squares

Ordinary Least Squares (OLS): the most popular method for computing sample regression models. The OLS slope estimate is

    \hat{\beta} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}

  o Numerator of the \hat{\beta} equation: the same sum of cross-products used to compute the covariance between X and Y.
  o Denominator of the \hat{\beta} equation: the sum of squared deviations of the X_i values from the mean value of X.
  o Thus, for a given covariance between X and Y, the more (less) spread out X is, the less (more) steep the estimated slope of the regression line.

One of the mathematical properties of OLS regression is that the line produced by the parameter estimates goes through the sample mean values of X and Y.

What does this data mean?

    \hat{\alpha} = 51.51
    \hat{\beta} = 0.62

Our sample regression line formula is \hat{Y} = 51.51 + 0.62X.
  Y = incumbent party's share of the major-party vote
  X = real per capita growth in GDP

Therefore:
  o If our measure of growth equals zero, we would expect the incumbent party to obtain 51.51 percent of the vote.
  o If growth is not equal to zero, we multiply the value of growth by 0.62 and add the result to 51.51 (or subtract it, if growth is negative) to obtain our best guess of the value of vote.
  o Moving to the right (left) along our line means that we are increasing (decreasing) the value of growth. For each right (left) movement, we see a corresponding rise (decline) in the expected value of incumbent vote.

Our estimated slope parameter answers the question of how much change in Y we expect to see from a one-unit increase in X.
  o A one-unit increase in our IV (growth) is expected to lead to a 0.62 increase in our dependent variable (incumbent vote).

We can tell that there are points that lie above and below our line; therefore we know that our model does not perfectly fit the real world.

Measuring our uncertainty about the OLS Regression Line

With an OLS model, we have several different ways in which to measure uncertainty.
  o We discuss these measures in terms of the overall fit between X and Y first and then discuss the uncertainty about individual parameters.

Our uncertainty about individual parameters is used in the testing of our hypotheses.

[Figure: annotated regression output. Labels: set of summary statistics about the entire model; measures of the variation in our model; statistics on the model's parameter estimates; dependent variable; independent variable; cons = constant]

How can we tell how well a line fits the data?
  1. The size of the residuals
    a. The root mean squared error
    b. The smaller the better
  2. The proportion of variation in Y explained by X
    a. The R^2 statistic
    b. The larger the better

[Figure: two panels, "X Explains a Little" and "X Explains a Lot", each showing Y and X]

Goodness of Fit

Measures of the overall fit between a regression model and the dependent variable are called goodness-of-fit measures.
  o Root Mean Squared Error
  o R-Squared Statistic

Root Mean Squared Error
  o One of the most intuitive of all the measures.
  o Provides a measure of the average accuracy of the model in the metric of the dependent variable.
  o The squaring and then taking the square root of the quantities in this formula are done to adjust for the fact that some of …
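The OLS slope and intercept, the residuals, and the two goodness-of-fit measures discussed in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names ols_fit and fit_summary and the five data points are made up for this example and are not from the course materials, the intercept is computed from the property that the OLS line passes through the sample means, and the root mean squared error here divides the sum of squared residuals by n (some software divides by n - 2 instead).

```python
import math


def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) for the two-variable OLS line.

    Hypothetical helper for illustration; assumes x has some spread
    (otherwise the denominator below is zero).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of cross-products of deviations (as in the covariance).
    cross_products = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: sum of squared deviations of X from its mean.
    x_sum_squares = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = cross_products / x_sum_squares
    # The OLS line passes through (x_bar, y_bar), which pins down the intercept.
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def fit_summary(x, y):
    """Fit the line, then compute residuals and goodness-of-fit measures."""
    alpha_hat, beta_hat = ols_fit(x, y)
    y_hat = [alpha_hat + beta_hat * xi for xi in x]
    # Residual = actual value of the DV minus predicted value of the DV.
    residuals = [yi - yh for yi, yh in zip(y, y_hat)]
    n = len(y)
    y_bar = sum(y) / n
    sum_sq_resid = sum(u ** 2 for u in residuals)
    total_sum_sq = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "alpha_hat": alpha_hat,
        "beta_hat": beta_hat,
        "residuals": residuals,
        "sum_abs_residuals": sum(abs(u) for u in residuals),
        "sum_sq_residuals": sum_sq_resid,
        # Assumption: divide by n; some packages use n - 2 here.
        "root_mse": math.sqrt(sum_sq_resid / n),
        "r_squared": 1 - sum_sq_resid / total_sum_sq,
    }


# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
summary = fit_summary(x, y)
for name, value in summary.items():
    print(name, value)
```

Because the raw residuals of an OLS line sum to (approximately) zero, the summary reports the sum of absolute residuals and the sum of squared residuals instead, matching the two fit criteria described in the notes.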

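As a small worked example of reading the sample regression line from these notes, the sketch below plugs growth values into the line. It assumes the extracted figures "51 51" and "0 62" stand for an intercept of 51.51 and a slope of 0.62; the function name predicted_vote and the growth values tried are hypothetical and chosen only for illustration.

```python
# Parameter estimates as read from the notes (assumed to be 51.51 and 0.62).
ALPHA_HAT = 51.51  # expected incumbent vote share when growth is zero
BETA_HAT = 0.62    # expected change in vote share per one-unit rise in growth


def predicted_vote(growth):
    """Predicted incumbent-party vote share for a given growth value."""
    return ALPHA_HAT + BETA_HAT * growth


# Illustrative growth values: negative, zero, and positive.
for growth in (-2.0, 0.0, 1.0, 3.0):
    print(f"growth = {growth:+.1f} -> predicted vote = {predicted_vote(growth):.2f}")
```

Moving from one growth value to a value one unit higher always changes the prediction by the slope, which is the interpretation of the estimated slope parameter given in the notes.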


FSU POS 3713 - Exam 4
