UI STAT 4520 - Bayesian Regression Models - D1891781


Industry-Level Returns and Fama-French Factors in Bayesian Regression Models

Chongyi Shen, Fan Yang, Qinbin Fan

Abstract

The prediction of index returns has been studied extensively using various models. In this paper, we use Bayesian methods to compare two classical models for predicting index returns. We use Markov chain Monte Carlo (MCMC) via WinBUGS to estimate the model coefficients, and we compare the full and reduced models under the Bayesian treatment by their DIC values. Our findings are consistent with the Fama-French three-factor hypothesis.

1. Introduction

1.1 Motivation

CAPM uses a single factor, Market Equity Returns, to compare a portfolio with the market as a whole. More generally, factors can be added to the regression model to improve the fit. The best-known approach of this kind is the Fama-French three-factor model, developed by Gene Fama and Ken French.

Interestingly, Fama and French still see high returns as a reward for taking on high risk. In particular, if returns increase with book/price, then stocks with a high book/price ratio must be riskier than average, which is exactly the opposite of what a traditional business analyst would say. Fama and French are not specific about why book/price measures risk, although they and others have suggested possible reasons. For example, a high book/price ratio could mean a stock is "distressed", temporarily selling low because future earnings look doubtful. Or it could mean a stock is capital intensive, making it generally more vulnerable to low earnings during slow economic times. Both explanations sound plausible, but they seem to describe completely different situations.
It may be that the success of this model at explaining past performance is not due to the significance of any one of the three factors taken separately, but to their being different enough that together they do an effective job of "spanning the dimensions" of the market. In this paper, we examine whether the Fama-French three-factor model performs better than the usual single-factor model.

1.2 Data

Industry-level returns and three-factor data are taken from Kenneth R. French’s data bank. We use average monthly values for Auto industry returns and the three factors. The original data span July 1926 to the present; we use the subset from January 2003 to December 2008, which gives 72 observations for each series.

The response variable is:

Y: Auto Industry Returns

The predictors are:

X1: Market Equity Returns
X2: Small Minus Big (SMB), the return on small-market-capitalization stocks minus the return on big ones. It measures the additional return investors have historically received by investing in stocks of companies with relatively small market capitalization, often referred to as the “size premium”.
X3: High Minus Low (HML), the return on stocks with a high book-to-price ratio minus the return on those with a low one. It is constructed to measure the “value premium” provided to investors for investing in companies with high book-to-market values, that is, the value placed on the company by accountants relative to the value placed on it by the public markets, commonly expressed as B/M.

The complete dataset is presented in section 7.2.

2. Methods

2.1 Single-Factor and Three-Factor Models

Two theory-based models are compared here.

Traditional Single-Factor Model: Y_i = a + b_1 X_{1i} + e_i
which uses only Market Equity Returns as the predictor.

Three-Factor Model: Y_i = a + b_1 X_{1i} + b_2 X_{2i} + b_3 X_{3i} + e_i
According to Fama and French’s theory, this model gives a better fit to the market returns data.
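As a rough illustration of how the two model specifications differ, the sketch below fits both by ordinary least squares on synthetic data. The data, the coefficient values and the helper name `fit_ols` are assumptions made for this example only; they are not the French data or the paper's estimates, and least squares stands in here for the Bayesian fit described later.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 72  # same number of monthly observations as in the paper

# Synthetic stand-ins for X1, X2, X3 (NOT the actual factor data).
X = rng.normal(size=(n, 3))
true_coef = np.array([0.5, 1.8, 0.3, 0.4])  # a, b1, b2, b3: arbitrary illustrative values
y = true_coef[0] + X @ true_coef[1:] + rng.normal(scale=2.0, size=n)

def fit_ols(y, X):
    """Least-squares fit of y on an intercept plus the columns of X.

    Returns the coefficient vector and the residual sum of squares.
    """
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef, float(resid @ resid)

# Single-factor model: Y_i = a + b1*X1_i + e_i
coef_reduced, rss_reduced = fit_ols(y, X[:, :1])
# Three-factor model: Y_i = a + b1*X1_i + b2*X2_i + b3*X3_i + e_i
coef_full, rss_full = fit_ols(y, X)

print("single-factor coefficients:", coef_reduced)
print("three-factor coefficients: ", coef_full)
print("residual sums of squares:  ", rss_reduced, rss_full)
```

Because the models are nested, the three-factor fit can never have a larger residual sum of squares than the single-factor fit, which is why a criterion that penalizes extra parameters is needed to compare them.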
For simplicity, we refer to the first model as the ‘Reduced’ model and the second as the ‘Full’ model in the rest of this text.

2.2 Likelihood Function, Priors

We use a normal likelihood and independent normal priors with very small precision for each coefficient. In the WinBUGS parameterization, the second argument of the normal distribution is a precision, so a precision of 0.000001 corresponds to a prior standard deviation of 1000.

Likelihood function: y[i] ~ dnorm(mu[i], tau)

Vague priors:
a, b_j ~ N(0, 0.000001) for j = 1, 2, 3
tau ~ Gamma(0.001, 0.001)

2.3 Initial Values

We choose initial values by estimating the coefficients a and b_j from the first half of the data (January 2003 to December 2005) with frequentist methods. Because the monthly index returns form a time series, we also consider a cyclical component in the model. Using the AUTOREG procedure in SAS, we obtain fitted models that include an AR(1) term for forecasting and monitoring.

[Figure: Time Series Plot of Auto Industry Returns. x-axis: Index of Variable "Month"; y-axis: Auto Industry Returns.]

Referring to the output in section 7.3, the coefficient b_1 (Market Equity Returns) in the traditional single-factor model is 1.9105, and the coefficients b_1 (Market Equity Returns), b_2 (Small Minus Big) and b_3 (High Minus Low) in the three-factor model are 1.778, 0.2727 and 0.3628, with an insignificant AR(1) term in each model.

2.4 Regression Statistics

We use MCMC simulation and compare the Deviance Information Criterion (DIC) statistics of the two models.

3. Diagnostics

We simulated three chains: one initialized from the SAS autoregression output, and the other two from values spread out around that output.
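To make the setup of sections 2.2 and 3 concrete, here is a minimal sketch of a sampler for the single-factor model with the priors stated above, run as three chains from dispersed starting points. It is a hand-written conjugate Gibbs sampler in NumPy standing in for WinBUGS, not the authors' code. The data are synthetic, and the function name `gibbs_linear_regression`, the initial values and the seeds are all assumptions chosen for illustration.

```python
import numpy as np

def gibbs_linear_regression(y, X, init_coef, n_iter=2000, seed=0):
    """Gibbs sampler for y_i ~ N(mu_i, 1/tau) with mu = design @ coef.

    Priors as in section 2.2: each coefficient ~ N(0, precision 1e-6),
    tau ~ Gamma(shape 0.001, rate 0.001).
    Returns an (n_iter, p + 1) array of draws: coefficients, then tau.
    """
    rng = np.random.default_rng(seed)
    design = np.column_stack([np.ones(len(y)), X])
    n, p = design.shape
    prior_prec = 1e-6 * np.eye(p)
    xtx, xty = design.T @ design, design.T @ y
    coef = np.asarray(init_coef, dtype=float)
    draws = np.empty((n_iter, p + 1))
    for t in range(n_iter):
        # tau | coef, y ~ Gamma(0.001 + n/2, rate = 0.001 + SSR/2);
        # NumPy's gamma takes a scale, which is 1 / rate.
        resid = y - design @ coef
        tau = rng.gamma(0.001 + n / 2, 1.0 / (0.001 + resid @ resid / 2))
        # coef | tau, y ~ N(mean, cov), cov = (tau * X'X + prior precision)^-1
        cov = np.linalg.inv(tau * xtx + prior_prec)
        mean = cov @ (tau * xty)  # prior mean is zero, so it adds no term
        coef = mean + np.linalg.cholesky(cov) @ rng.standard_normal(p)
        draws[t] = np.append(coef, tau)
    return draws

# Synthetic single-factor data (NOT the paper's dataset).
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(72, 1))
y = -1.0 + 1.9 * X[:, 0] + data_rng.normal(scale=3.0, size=72)

# Three chains: one plausible start and two deliberately spread-out starts
# (hypothetical values, not the SAS estimates).
inits = [np.array([0.0, 1.9]), np.array([-10.0, -5.0]), np.array([10.0, 8.0])]
chains = [gibbs_linear_regression(y, X, init, seed=k) for k, init in enumerate(inits)]

burn_in = 300
post_mean = np.mean([c[burn_in:, :2].mean(axis=0) for c in chains], axis=0)
print("posterior means of (a, b1):", post_mean)
```

With priors this vague, the posterior means of the coefficients should land very close to the least-squares estimates for the same data, which gives a quick sanity check on any sampler of this kind.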
3.1 Single-Factor Model

[Figure: BGR (Brooks-Gelman-Rubin) diagnostic plots for alpha, beta1, deviance and sigma, chains 1:3. x-axis: start-iteration.]

From the BGR plots, the red line for beta1 stays stably close to 1 from about the 300th iteration, so a burn-in period of 300 seems reasonable and conservative. After setting up ‘DIC’, we ran another 2000 iterations. The DIC output is below:

Dbar = post.mean of -2logL; Dhat = -2LogL at post.mean of stochastic nodes

        Dbar     Dhat     pD     DIC
y       210.744  207.659  3.086  213.830
total   210.744  207.659  3.086  213.830

pD is an estimate of the number of ‘free parameters’ in the model, and it is very close to 3. This is because the reduced model has three parameters to estimate (alpha, beta1, tau). DIC is 213.830 for this reduced model.

The final model output is:

node   mean    sd      MC err  2.5%  median  97.5%  start  sample
alpha  -3.308  0.7788
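The quantities in the DIC output of section 3.1 are tied together by the standard definitions pD = Dbar - Dhat and DIC = Dbar + pD. The short check below recomputes pD and DIC from the reported Dbar and Dhat; the function name `dic_from_deviance` is an assumption for this example, and small differences in the last digit come from rounding in the reported values.

```python
def dic_from_deviance(d_bar, d_hat):
    """Return (pD, DIC) from the posterior mean deviance and the
    deviance evaluated at the posterior means of the parameters."""
    p_d = d_bar - d_hat   # effective number of parameters
    dic = d_bar + p_d     # equivalently d_hat + 2 * p_d
    return p_d, dic

# Values reported for the single-factor (reduced) model.
p_d, dic = dic_from_deviance(210.744, 207.659)
print(f"pD = {p_d:.3f}, DIC = {dic:.3f}")
```

The same function can be applied to the Dbar and Dhat of the three-factor model; the model with the smaller DIC is the one the criterion prefers.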
