Final Exam Study Guide: Analysis of Economic Data
Robert Kaufman

Types of Economic Data:
- Time Series Data: Data on the same company for different periods of time
- Cross Sectional Data: Data on different companies for the same year
- Panel Data: Time Series + Cross Sectional (different companies, each observed over several periods)

Graphical Methods:
- Time Series Plot:
  - Used for time series data to show the value of a variable over time
  - Time is measured on the horizontal axis
  - The value of the variable is measured on the vertical axis
- Frequency Distribution (the table behind a histogram) contains:
  - Categories within which the data fall
  - The corresponding frequencies with which data fall within each category
  - Each category has the same width
  - To determine width: w = (largest # - smallest #) / (# of desired categories)
  - Intervals never overlap
- Histogram:
  - A plot of the data in the frequency distribution
  - The interval endpoints are shown on the horizontal axis
  - The vertical axis measures the frequency, the relative frequency, or the percentage
- Scatter Plots (X-Y Plots):
  - Used for paired observations taken from two numerical variables
  - One variable is measured on the vertical axis and the other variable is measured on the horizontal axis

Descriptive Statistics:
- Mean:
  - Most common measure of central tendency
  - It is the sum of the values of all observations divided by the number of observations
  - Drawback: Affected by extreme values (outliers)
- Median:
  - In an ordered list, the median is the "middle" number (with an even number of observations, the average of the two middle numbers)
  - Not affected by outliers
- Mode:
  - Value that occurs most often
  - Not affected by outliers
  - It is typically used when we work with qualitative or categorical data
  - Drawback: Might not exist or might not be unique
- Measures of Dispersion:
  - Tell us how spread out our data are
- Variance:
  - It is (approximately) the average of squared deviations from the mean; "approximately" because the sample variance divides by n - 1 rather than n
- Standard Deviation:
  - Most common measure of dispersion
  - Sample standard deviation is the square root of the sample variance
  - Shows variation about the mean
  - Has the same unit as the original data

Measures of the Shape of the Distribution:
- Skewness:
  - Measure of the symmetry in the data distribution, i.e. whether the distribution of data looks the same to the left and to the right of the center point
- Symmetric Distribution:
  - Mean = Median
  - The left tail looks the same as the right tail
  - Skewness = 0
- Left Skewed Distribution:
  - Mean < Median
  - The left tail is long relative to the right
  - Skewness < 0
- Right Skewed Distribution:
  - Mean > Median
  - The right tail is long relative to the left
  - Skewness > 0

Covariance:
- Cov(x,y) is a measure of the linear relationship b/w X and Y
- If Cov(x,y) > 0: increasing linear relationship b/w X and Y
  - When X increases, Y tends to increase, and vice versa
- If Cov(x,y) < 0: decreasing linear relationship b/w X and Y
  - When X increases, Y tends to decrease, and vice versa
- Two problems when using the covariance are:
  - The covariance is affected by the units of measure of the two variables
  - The covariance does not provide a measure of the strength of the relationship b/w two variables

Correlation:
- The correlation coefficient overcomes these two shortcomings of the covariance by standardizing the linear relationship b/w two variables
- Range of r: -1 ≤ r_xy ≤ +1
- Sign of r:
  - r_xy > 0 indicates an increasing relationship b/w X and Y
  - r_xy < 0 indicates a decreasing relationship b/w X and Y
  - r_xy = 0 indicates that there is no linear relationship b/w X and Y
- Strength of Relationship:
  - The closer r_xy is to 1, the stronger the positive relation b/w X and Y
  - The closer r_xy is to -1, the stronger the negative relation b/w X and Y
- Correlation is a measure of the strength of the linear relationship b/w two variables (but it doesn't imply causality)

The Linear Regression Model:
- y_i = β0 + β1 x_i + ε_i
- y is the dependent variable
- x is the independent variable or regressor
- ε is the error term

Population Linear Regression Model:
- The population regression line is y_i = β0 + β1 x_i
- β1 is the change in y associated with a unit change in x
- β0 is the value of the population regression line when x = 0
- The error term contains all the other factors besides x that determine the value of the dependent variable for a specific observation i
- Population measures are unobservable, while the sample measures are observable
- The population regression line provides the expected value of the random variable y when X takes on a specific value x_i

Ordinary Least Squares (OLS):
- OLS Estimator: chooses regression coefficients so that the estimated regression line is as close as possible to the observed data
- Closeness: measured by the sum of the squared residuals
- The error is the difference between a particular data point and the population regression line
- The residual is the difference between a particular data point and the estimated regression line
- The OLS estimator does not prove causality. In order to relate causality with the OLS estimator we need economic theory behind the estimation.

Goodness of Fit:
- Features of R²:
  - Is the fraction of the sample variance of Y_i that can be explained (or predicted) by X_i
  - Ranges between zero and one
  - An R² ≈ 1 indicates that X_i is good at predicting Y_i
  - An R² ≈ 0 indicates that X_i is not good at predicting Y_i

Factors that affect the accuracy of the OLS estimate of β:
- Sample Size (N):
  - The larger the sample size, the more accurate the estimate of β
- Dispersion of the error term:
  - The larger the variance of the error term, the less accurate the estimate of β
- Dispersion of the independent variable X:
  - The larger the variance of X, the more accurate the estimate of β

Confidence Intervals for β:
- The confidence level is (1 - α) x 100%
- For example, if α = .05, then we get a 95% confidence interval
- A 99% confidence interval is wider than a 95% confidence interval
- If zero is contained in the interval, then we cannot reject β1 = 0 at that confidence level
  - That is, the data give no evidence that X has a relationship with Y
- The smaller the α, the more confidence we have about the true value being contained in the interval

Hypothesis Testing β = 0:
- Elements of a Statistical Test:
  - Null Hypothesis H0 => Want to Reject!
  - Alternative Hypothesis H1 => Want to Accept!
  - Test Statistic
  - Rejection Rule

Test Statistic and Decision Rule:
- Test H0: β1 = 0 vs. H1: β1 ≠ 0
- Test Statistic (t_c)
- Reject H0 if: |t_c| > t_α
  - Where t_α is the critical value
- The critical value, t_α, is chosen so that the probability of rejecting H0 when H0 is true is equal to α

Decision Process:
- The
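
Worked sketch (not part of the original notes): the OLS, R², confidence interval and t-test ideas above can be tied together in a few lines of Python. The function names (`ols_fit`, `conf_interval`, `reject_null`) and the data are made up for illustration. The standard error of the slope uses the usual textbook formula for a one-regressor model with constant error variance, which the guide does not state, and the critical value 3.182 is read from a standard t-table (two-sided, α = .05, N - 2 = 3 degrees of freedom) rather than computed.

```python
import math


def ols_fit(x, y):
    """Simple OLS of y on a single regressor x (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # estimated slope
    b0 = y_bar - b1 * x_bar     # estimated intercept
    # Residual = data point minus the estimated regression line
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    ssr = sum(e ** 2 for e in residuals)        # what OLS minimizes
    sst = sum((yi - y_bar) ** 2 for yi in y)    # total variation in y
    r2 = 1 - ssr / sst                          # fraction of variation explained
    # Assumed textbook formula: se(b1) = sqrt( [SSR / (n - 2)] / Sxx )
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)
    return {"b0": b0, "b1": b1, "r2": r2, "se_b1": se_b1, "t_c": b1 / se_b1}


def conf_interval(b1, se_b1, t_crit):
    """(1 - alpha) x 100% confidence interval for the slope."""
    return (b1 - t_crit * se_b1, b1 + t_crit * se_b1)


def reject_null(t_c, t_crit):
    """Decision rule: reject H0: beta1 = 0 if |t_c| > critical value."""
    return abs(t_c) > t_crit


# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182  # t-table value: alpha = .05 two-sided, 3 degrees of freedom

fit = ols_fit(x, y)
low, high = conf_interval(fit["b1"], fit["se_b1"], T_CRIT)
print("slope:", round(fit["b1"], 3), "intercept:", round(fit["b0"], 3))
print("R^2:", round(fit["r2"], 4), "t_c:", round(fit["t_c"], 2))
print("95% CI for slope:", (round(low, 3), round(high, 3)))
print("reject H0 (beta1 = 0)?", reject_null(fit["t_c"], T_CRIT))
```

Note how the pieces line up with the factors listed earlier: a larger N or a larger spread in x increases `sxx` or the degrees of freedom and shrinks `se_b1`, while noisier errors increase `ssr` and widen the interval.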