UCSB ECON 240 - Exploratory Data Analysis - D1787044

Nov. 3, 2010   LAB #6   ECON 240A-1   L. Phillips
Exploratory Data Analysis, Scatterplots, Regression and ANOVA

I. This first example uses the Anscombe data set: four data files of eleven observations each on a dependent and an explanatory variable. Open the data file in EViews and select the four variables x1, x2, x3, x4, along with y1, y2, y3, and y4. Go to the view menu and choose open selected, one window, one group. You should see the eight variables by observation in the spreadsheet view or table.

A common practice is to rush and run a regression. This can often be fatal to understanding the relationship between the variables. For example, go to the quick menu and select estimate equation. In the equation specification box type y1 c x1, and hit the OK button. Note that the estimated intercept is 3.0, the estimated slope is 0.5, and the coefficient of determination is 0.666. For diagnostics, in the equation window, go to the view menu and select actual, fitted, residual: graph. Now repeat this procedure for each of the other three data sets. Are you enlightened yet?

As an alternative exploratory procedure, return to the workfile window. In the main EViews menu, select quick: graph, type x1 y1 in the window, and hit the OK button. For graph type, choose scatter diagram, then hit the option button, select the regression line box, and hit OK. Repeat this procedure for the remaining three data sets. Sometimes a picture is worth a thousand words. This is one of the points of using visual techniques in exploratory data analysis before wheeling up the heavy artillery.

I. The Returns Generating Process

The second exercise uses a data file from Chapter 17 of the text, Xr17-47 (problem 17.47, p. 608, 7th edition; not in the 8th edition). The monthly data begin in January 1993 and end in December 1996. The authors do not use net returns, i.e.
net of the risk-free rate.

a. Show that this affects the interpretation of the intercept but not of the slope or the coefficient of determination.

In EViews, go to the File menu and select new, then workfile. In the box, click monthly, enter 93.01 and 96.12 for the dates, and click OK. In the workfile window, select procs, import, read text-Lotus-Excel. Select the Xr17-47 file in the Lab Six folder from the Econ240a folder in the classes folder, and hit the open button. The data begin in cell A2. Type in 2 for the number of series, and hit the OK button.

In the workfile window, select GE and s_p_index01. In the view menu, select open selected, one window, one group. You should be in the spreadsheet view. In the view menu, go to multiple graphs: line and you will see plots of each series against time. In the view menu, choose descriptive stats: common sample (since each series has 48 observations).

Close the group window and select GE. Go to the view menu, open selected, one window. In the view menu, choose descriptive statistics: histogram-stats. The coefficient of skewness, zero for the normal distribution, is not significantly different from zero, and the coefficient of kurtosis, three for a normal distribution, is not significantly different from three, as reflected by the Jarque-Bera statistic with probability 0.545. Thus the 48 monthly returns for the GE stock are not significantly different from normal. Select the stock index, Standard and Poor's Composite, and repeat this procedure. It also looks normal.

Go to the quick menu, select graph, type s_p_index01 GE in the window, and hit the OK button. For graph type, choose scatter diagram, then hit the option button, select the regression line box, and hit OK. Go to the quick menu, select estimate equation, and type in ge c s_p_index01.

b. Is the slope significantly different from one? What does this finding mean?

c.
How much of the variation in the monthly returns to GE stock is attributable to the market?

Go to the view menu and select actual, fitted, residuals: graph. Does the equation look OK? Go to the view menu, residual tests: histogram-normality test.

d. Are the residuals normal?

II. House Price and Multiple Regression

The third exercise is from the text and is a preview of coming attractions in Econ 240B. This is the data file XM18-02, example 18.2, p. 646 (see pp. 692-696 in the 8th ed., file Xm17-02). There are 100 observations on homes with price, number of bedrooms, house size in square feet, and lot size in square feet. This data set was imported into EViews. Select bedrooms, lot_size01, house_size01 and price. In the view menu, select open selected, one window, one group. You should be in the spreadsheet view. In the view menu, go to multiple graphs: scatter: matrix of all pairs. In the last row, you will see the scatter plots of price against the other three variables. It looks like price is positively associated with all three variables.

The text regresses price against an intercept, number of bedrooms, house size and lot size. However, from the scatter plots it is apparent that house size and lot size are highly correlated. Try a scatter plot of just these two variables by selecting them, going to the quick menu, graph, and selecting scatter for type with a regression line as an option. You can also select these two variables, go to the view menu, and select open selected, one window, one group. You should be in the spreadsheet view. In the view menu, select correlations. The correlation coefficient is 0.994. These two explanatory variables are highly correlated and are not providing separate variation explaining house price.
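
To see what a correlation of 0.994 between two regressors implies, here is a minimal Python sketch (not part of the EViews lab). It computes a Pearson correlation by hand and the factor 1/(1 - r^2), which scales the sampling variance of each slope estimate in a regression with exactly two explanatory variables. The house and lot sizes below are hypothetical illustration values, not the XM18-02 data, and the function names are made up for this sketch. The two-regressor formula is only a rough guide when a third regressor such as bedrooms is present.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_inflation(r):
    """Factor 1 / (1 - r^2): slope-variance inflation with two regressors."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical sizes in square feet (NOT the XM18-02 observations).
house_size = [1500, 1800, 2100, 2400, 2700, 3000]
lot_size = [6100, 7150, 8450, 9550, 10900, 11950]

r = pearson_r(house_size, lot_size)
vif_example = variance_inflation(r)

# The correlation actually reported in the lab's correlations view.
vif_lab = variance_inflation(0.994)

print(f"hypothetical data: r = {r:.4f}, inflation factor = {vif_example:.1f}")
print(f"lab value r = 0.994: inflation factor = {vif_lab:.1f}, "
      f"standard errors scaled by {math.sqrt(vif_lab):.1f}")
```

With r = 0.994 the factor is about 84, so each slope's standard error is roughly nine times what it would be if the two variables were uncorrelated.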
This is called multicollinearity between the explanatory variables. It causes large standard errors for the slope coefficients on these variables, and hence low t-statistics, even though the coefficient of determination is high. One remedy is to regress price against a constant, bedrooms and house size.

a. Interpret the estimated regression coefficient on house size.

b. Interpret the estimated coefficient on bedrooms.

IV. Exercises

#1. An alternative to the regression of price on a constant, number of bedrooms and house size would be to estimate a separate intercept for two bedroom houses, three bedroom houses, etc., similar to the approach in Lab Five where we estimated separate intercepts for
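
Returning to the Anscombe exercise at the start of the lab: the regression results can be checked outside EViews with a short Python sketch. The numbers below are the standard published values of Anscombe's quartet; the lab's EViews file is assumed, not confirmed, to hold the same observations. The helper name ols is made up for this sketch, which fits each line by ordinary least squares using only the standard library.

```python
def ols(x, y):
    """Return (intercept, slope, r_squared) for a simple least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Published Anscombe quartet; x1, x2 and x3 share the same values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

anscombe = {
    "y1 on x1": (x123, y1),
    "y2 on x2": (x123, y2),
    "y3 on x3": (x123, y3),
    "y4 on x4": (x4, y4),
}

results = {name: ols(x, y) for name, (x, y) in anscombe.items()}

for name, (a, b, r2) in results.items():
    print(f"{name}: intercept = {a:.2f}, slope = {b:.3f}, R^2 = {r2:.3f}")
```

All four fits give nearly the same intercept, slope and coefficient of determination, which is why the scatter diagrams are needed to tell the data sets apart.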
