
Final Exam Study Guide
Analysis of Economic Data
Robert Kaufman

Types of Economic Data
- Time Series Data: data on the same company for different periods of time
- Cross-Sectional Data: data on different companies for the same year
- Panel Data: combines time series and cross-sectional data

Graphical Methods
- Time Series Plot
  - Used for time series data to show the value of a variable over time
  - Time is measured on the horizontal axis
  - The value of the variable is measured on the vertical axis
- Histogram
  - A frequency distribution is a table containing:
    - the categories within which the data fall
    - the corresponding frequencies with which data fall within each category
  - Each category has the same width
  - To determine the width: $w = (\text{largest} - \text{smallest}) / (\text{number of desired categories})$
  - Intervals never overlap
  - A histogram is a plot of the data in the frequency distribution
  - The interval endpoints are shown on the horizontal axis
  - The vertical axis measures the frequency, the relative frequency, or the percentage
- Scatter Plots (X-Y Plots)
  - Used for paired observations taken from two numerical variables
  - One variable is measured on the vertical axis and the other on the horizontal axis

Descriptive Statistics
- Mean
  - Most common measure of central tendency
  - The sum of the values of all observations divided by the number of observations
  - Drawback: affected by extreme values (outliers)
- Median
  - In an ordered list, the median is the middle number
  - Not affected by outliers
- Mode
  - Value that occurs most often
  - Not affected by outliers
  - Typically used when we work with qualitative or categorical data
  - Drawback: might not exist or might not be unique

Measures of Dispersion
- Tell us how spread out our data are
- Variance
  - Approximately the average of the squared deviations from the mean
- Standard Deviation
  - Most common measure of dispersion
  - The sample standard deviation is the square root of the sample variance
  - Shows variation about the mean
  - Has the same units as the original data

Measures of the Shape of the Distribution
- Skewness
  - Measure of the symmetry of the data distribution, i.e., whether the distribution looks the same to the left and to the right of the center point
  - Symmetric distribution: mean = median; the left tail looks the same as the right tail; skewness = 0
  - Left-skewed distribution: mean < median; the left tail is long relative to the right; skewness < 0
  - Right-skewed distribution: mean > median; the right tail is long relative to the left; skewness > 0

Covariance
- $\mathrm{Cov}(x, y)$ is a measure of the linear relationship between X and Y
- If $\mathrm{Cov}(x, y) > 0$: increasing linear relationship between X and Y (when X increases, Y increases, and vice versa)
- If $\mathrm{Cov}(x, y) < 0$: decreasing linear relationship between X and Y (when X increases, Y decreases, and vice versa)
- Two problems when using the covariance:
  - The covariance can be affected by the units of measure of the two variables
  - The covariance does not provide a measure of the strength of the relationship between two variables

Correlation
- The correlation coefficient overcomes these two shortcomings of the covariance by standardizing the linear relationship between two variables
- Range of r: $-1 \le r_{xy} \le 1$
- Sign of r:
  - $r_{xy} > 0$ indicates an increasing relationship between X and Y
  - $r_{xy} < 0$ indicates a decreasing relationship between X and Y
  - $r_{xy} = 0$ indicates that X and Y are linearly unrelated
- Strength of relationship:
  - The closer $r_{xy}$ is to 1, the stronger the positive relation between X and Y
  - The closer $r_{xy}$ is to $-1$, the stronger the negative relation between X and Y
- Correlation is a measure of the strength of the linear relationship between two variables, but it does not imply causality

The Linear Regression Model
- $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
  - $y$ is the dependent variable
  - $x$ is the independent variable, or regressor
  - $\varepsilon$ is the error term

Population Linear Regression Model
- The population regression line is $E(y_i \mid x_i) = \beta_0 + \beta_1 x_i$
- $\beta_1$ is the change in y associated with a unit change in x
- $\beta_0$ is the value of the population regression line when $x = 0$
- The error term contains all the other factors besides x that determine the value of the dependent variable for a specific observation $i$
- Population measures are unobservable, while sample measures are observable
- The population regression line provides the expected value of the random variable y when X takes on a specific value $x_i$

Ordinary Least Squares (OLS)
- The OLS estimator chooses the regression coefficients so that the estimated regression line is as close as possible to the observed data
- Closeness is measured by the sum of the squared residuals
- The error is the difference between a particular data point and the population regression line
- The residual is the difference between a particular data point and the estimated regression line
- The OLS estimator does not prove causality; to give an OLS estimate a causal interpretation, we need economic theory behind the estimation

Goodness of Fit
- Features of $R^2$:
  - It is the fraction of the sample variance of $Y_i$ that can be explained (or predicted) by $X_i$
  - It ranges between zero and one
  - An $R^2$ near 1 indicates that $X_i$ is good at predicting $Y_i$
  - An $R^2$ near 0 indicates that $X_i$ is not good at predicting $Y_i$

Factors That Affect the Accuracy of the OLS Estimate
- Sample size N: the larger the sample size, the more accurate the estimate of $\beta_1$
- Dispersion of the error term: the larger the variance of the error term, the less accurate the estimate of $\beta_1$
- Dispersion of the independent variable X: the larger the variance of X, the more accurate the estimate of $\beta_1$

Confidence Intervals for $\beta_1$
- The confidence level is $(1 - \alpha) \times 100\%$; for example, if $\alpha = 0.05$, we get a 95% confidence interval
- A 99% confidence interval is wider than a 95% confidence interval
- If zero is contained in the interval, we cannot reject $\beta_1 = 0$, i.e., that X does not have any relationship with Y
- The smaller the $\alpha$, the more confidence we have that the true value is contained in the interval

Hypothesis Testing: $\beta_1 = 0$
- Elements of a statistical test:
  - Null hypothesis $H_0$ (want to reject)
  - Alternative hypothesis $H_1$ (want to accept)
  - Test statistic
  - Rejection rule
- Test statistic and decision rule:
  - Test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$
  - Test statistic: $t_c$
  - Reject $H_0$ if $|t_c| > t_\alpha$, where $t_\alpha$ is the critical value
  - The critical value $t_\alpha$ is chosen so that the probability of rejecting $H_0$ when $H_0$ is true is equal to $\alpha$
- Decision process:
  - The null $H_0$ is held true unless we find enough evidence against it
  - The critical value is chosen to limit the probability of mistakenly rejecting a true $H_0$
- Critical value and level of significance:
  - The level of significance is the probability of a Type I error
  - Denote the level of significance with the Greek letter $\alpha$
- Connection with confidence interval:
  - If we build a 95% confidence
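
The OLS estimates, $R^2$, confidence interval, and test of $H_0: \beta_1 = 0$ described in this guide can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `ols_fit`, `confidence_interval`, and `reject_null`, the sample data, and the homoskedasticity-only standard error formula are not from the guide, and the critical value 3.182 is the two-sided 5% t-table value for 3 degrees of freedom, supplied by hand.

```python
import math
from typing import NamedTuple, Sequence, Tuple


class OLSResult(NamedTuple):
    intercept: float  # estimate of beta_0
    slope: float      # estimate of beta_1
    r_squared: float  # fraction of the sample variance of y explained by x
    se_slope: float   # standard error of the slope estimate
    t_stat: float     # test statistic t_c for H0: beta_1 = 0


def ols_fit(x: Sequence[float], y: Sequence[float]) -> OLSResult:
    """Fit y = b0 + b1*x by minimizing the sum of squared residuals."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
    if s_xx == 0 or tss == 0:
        raise ValueError("x and y must both vary")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Residual: data point minus the estimated regression line.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ssr = sum(e ** 2 for e in residuals)  # sum of squared residuals
    r_squared = 1.0 - ssr / tss
    # Assumption: homoskedastic errors, so Var(b1) = s^2 / S_xx, s^2 = SSR / (n - 2).
    se_slope = math.sqrt(ssr / (n - 2) / s_xx)
    # A perfect fit has zero standard error; report an infinite t statistic.
    t_stat = slope / se_slope if se_slope > 0 else math.copysign(math.inf, slope)
    return OLSResult(intercept, slope, r_squared, se_slope, t_stat)


def confidence_interval(res: OLSResult, t_crit: float) -> Tuple[float, float]:
    """Interval slope +/- t_crit * se; t_crit comes from a t table."""
    half_width = t_crit * res.se_slope
    return res.slope - half_width, res.slope + half_width


def reject_null(res: OLSResult, t_crit: float) -> bool:
    """Two-sided rule: reject H0: beta_1 = 0 if |t_c| exceeds the critical value."""
    return abs(res.t_stat) > t_crit


# Hypothetical data: five paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT_95 = 3.182  # assumed table value: two-sided 5% level, n - 2 = 3 df

result = ols_fit(x, y)
low, high = confidence_interval(result, T_CRIT_95)
print(f"slope={result.slope:.3f} intercept={result.intercept:.3f}")
print(f"R^2={result.r_squared:.4f} t_c={result.t_stat:.2f}")
print(f"95% CI for slope: ({low:.3f}, {high:.3f})")
print("reject H0:", reject_null(result, T_CRIT_95))
```

With the same critical value, the two checks agree: `reject_null` returns `True` exactly when the interval from `confidence_interval` excludes zero. The standard error also shrinks as the sample size or the spread of `x` grows and increases with the residual variance, matching the accuracy factors listed in the guide.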
