Item Parameter Estimation: Does WinBUGS Do Better Than BILOG-MG?
Bayesian Statistics, Fall 2009
Chunyan Liu & James Gambrell

Introduction
3 Parameter IRT Model
Assigns each item a logistic function with a variable lower asymptote.

Purpose
Compare BILOG-MG and WinBUGS estimation of item parameters under the 3 parameter logistic (3PL) IRT model
Investigate the effect of sample size on the estimation of item parameters

BILOG-MG (Mislevy & Bock 1985)
Proprietary software
Uses unknown estimation shortcuts
Sometimes gives poor results
"Black Box" program
Very fast estimation
Provides only point estimates and standard errors for model parameters
Estimation method
◦ Marginal Maximum Likelihood
◦ Expectation-Maximization algorithm (Bock and Aitkin, 1981)

WinBUGS
More open (related to the open-source OpenBUGS)
More widely studied
Might give more robust results
Much more flexible
Provides full posterior densities for model parameters
More output to evaluate convergence
Very slow estimation!

Literature Review
Most researchers have used custom-built MCMC samplers based on the Metropolis-Hastings-within-Gibbs algorithm
◦ as recommended by Cowles, 1996
Patz and Junker (1999a & b)
◦ Wrote an MCMC sampler in S-PLUS
◦ Found that their sampler produced estimates identical to BILOG for the 2PL model, but had some trouble with 3PL models
◦ Found MCMC was superior at handling missing data

Literature Review
Jones and Nediak (2000)
◦ Developed a "commercial grade" sampler in C++
◦ Improved the Patz and Junker algorithm
◦ Compared MCMC results to BILOG using both real and simulated data
◦ Found that item parameters varied substantially, but the ICCs described were close according to the Hellinger deviance criterion
◦ MCMC and BILOG were similar for real data
◦ MCMC was superior for simulated data
◦ Noted that MCMC provides much more diagnostic output to assess convergence problems

Literature Review
Proctor, Teo, Hou, and Hsieh (2005 project for this class!)
◦ Compared BILOG to WinBUGS
◦ Fit a 2PL model
◦ Only simulated a single replication
◦ Did not use deviance or RMSE to assess error

Data
Test: 36-item multiple choice
Item parameters (a, b and c) come from Chapter 6 of Equating, Scaling and Linking (Kolen and Brennan)
◦ Treated as true item parameters (See Appendix)
Item responses simulated using the 3PL model
$$p(\theta) = c + \frac{1 - c}{1 + \exp\left(-1.7\,a\,(\theta - b)\right)}$$
a – slope
b – difficulty
c – guessing
θ – examinee ability

Methods
1. N (N = 200, 500, 1000, 2000) θ values were generated from the N(0,1) distribution.
2. Item responses for the N examinees were simulated from the θ values generated in step 1 and the true item parameters, using the 3PL model.
3. Item parameters (a, b, c for the 36 items) were estimated using BILOG-MG based on the N item responses.
4. Item parameters (a, b, c for the 36 items) were estimated using WinBUGS based on the N item responses, using the same priors as specified by BILOG-MG.
5. Steps 2 through 4 were repeated 100 times. For each item, we have 100 estimated parameter sets from each program.

Priors
a[i] ~ dlnorm(0, 4)
b[i] ~ dnorm(0, 0.25)
c[i] ~ dbeta(5,17)
Same priors used in BILOG and WinBUGS

Criterion: Root Mean Square Error (RMSE)
For each item, we computed the RMSE for a, b, and c using the same formula
$$RMSE(\hat{x}) = \sqrt{Bias(\hat{x})^2 + sd(\hat{x})^2}$$
where
$$Bias(\hat{x}) = E(\hat{x}) - x$$
and
$$E(\hat{x}) = \frac{1}{100}\sum_{i=1}^{100} \hat{x}_i$$
Here $\hat{x}$ could be $\hat{a}$, $\hat{b}$, or $\hat{c}$, and x could be the true parameter a, b, or c.

Results
1. Deciding the number of Burn-in Iterations: History Plots
[Figure: history plots for a[28], b[28], and c[28], chains 1:3, iterations 1 to 6000]

Results-cont.
1. Deciding the number of Burn-in Iterations: Autocorrelation and BGR plots
[Figure: autocorrelation plots for a[28], b[28], and c[28], chains 1:3, lags 0 to 40]
[Figure: BGR plots for a[28], b[28], and c[28], chains 1:3, start-iteration 1051 to 3000]

Results-cont.
1. Deciding the number of Burn-in Iterations: Statistics
node   mean       sd        MC error   2.5%      median     97.5%    start  sample
a[1]   0.899      0.1011    0.004938   0.7117    0.8949     1.107    2501   3500
a[2]   1.339      0.1159    0.004132   1.125     1.333      1.58     2501   3500
a[3]   0.7308     0.111     0.005769   0.551     0.717      0.9893   2501   3500
a[4]   2.012      0.2712    0.009897   1.531     1.996      2.59     2501   3500
a[5]   1.766      0.2202    0.009585   1.394     1.745      2.243    2501   3500
b[1]   -1.706     0.2944    0.01793    -2.253    -1.717     -1.1     2501   3500
b[2]   -0.4277    0.1167    0.005916   -0.6571   -0.428     -0.1857  2501   3500
b[3]   -0.7499    0.3967    0.01586    -1.409    -0.7994    0.1348   2501   3500
b[4]   0.4324     0.09295   0.004443   0.2363    0.4384     0.6008   2501   3500
b[5]   -0.05619   0.122     0.006737   -0.3127   -0.05246   0.1657   2501   3500
c[1]   0.2458     0.088     0.004718   0.09253   0.2415     0.4362   2501   3500
c[2]   0.1403     0.04745   0.002158   0.05368   0.139      0.2361   2501   3500
c[3]   0.2538     0.09285   0.005864   0.09991   0.243      0.4557   2501   3500
c[4]   0.2669     0.035     0.001491   0.1911    0.2693     0.3282   2501   3500
c[5]   0.2588     0.05029   0.002589   0.1526    0.261      0.35     2501   3500

Results-cont.
1. Running conditions for WinBUGS
Adaptive phase: 1000 iterations
Burn-in: 1500 iterations
For computing the Statistics: 3500 iterations
Using 1 chain
Using the bugs( ) function to run WinBUGS through R
◦ Need BRugs and R2WinBUGS packages

Results-cont.
2. Effect of Sample Size
[Figure: RMSE by item (x-axis: Item, 0 to 35; y-axis: RMSE) in six panels: BILOG-MG a, BILOG-MG b, BILOG-MG c, WinBUGS a, WinBUGS b, WinBUGS c. Legend entries recovered: N=200, N=500]

BILOG-MG vs.
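Returning to the Methods slide: steps 1 and 2 (drawing θ from N(0,1) and simulating responses under the 3PL model) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code. The names `p_3pl`, `simulate_responses`, and `toy_items`, the three item parameter triples, and the seed are all assumptions made for the example; the study itself used the 36 items from Kolen and Brennan.

```python
import math
import random

D = 1.7  # scaling constant that appears in the 3PL formula on the Data slide


def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def simulate_responses(items, n_examinees, rng):
    """Methods steps 1-2: draw theta ~ N(0, 1), then simulate 0/1 responses."""
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n_examinees)]
    responses = [
        [1 if rng.random() < p_3pl(theta, a, b, c) else 0 for (a, b, c) in items]
        for theta in thetas
    ]
    return thetas, responses


# Hypothetical (a, b, c) triples for illustration only, not the study's items.
toy_items = [(0.9, -1.7, 0.25), (1.3, -0.4, 0.14), (2.0, 0.4, 0.27)]

rng = random.Random(2009)  # arbitrary seed, chosen only for reproducibility
thetas, responses = simulate_responses(toy_items, 200, rng)
```

At θ = b the logistic term equals one half, so the probability is c + (1 − c)/2; that gives a quick sanity check on any implementation of the formula.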
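The RMSE criterion defined earlier (bias squared plus sd squared, under a square root) can be checked numerically with a short Python sketch. The function name `rmse_decomposition` and the five replicated estimates are hypothetical. The slides do not say which sd was used; this sketch assumes the population standard deviation (divisor n), because with that choice the decomposition matches the directly computed root mean squared error exactly.

```python
import math
import statistics


def rmse_decomposition(estimates, true_value):
    """Return (bias, sd, rmse) with rmse = sqrt(bias^2 + sd^2)."""
    bias = statistics.fmean(estimates) - true_value
    sd = statistics.pstdev(estimates)  # assumption: population sd (divisor n)
    rmse = math.sqrt(bias ** 2 + sd ** 2)
    return bias, sd, rmse


# Hypothetical replicated estimates of one item's a parameter (true a = 1.0).
a_hat = [0.95, 1.10, 0.88, 1.12, 1.05]
bias, sd, rmse = rmse_decomposition(a_hat, true_value=1.0)

# Direct computation of the root mean squared error, for comparison.
direct = math.sqrt(statistics.fmean([(x - 1.0) ** 2 for x in a_hat]))
```

In the study the same calculation would run over the 100 replicated estimates of each of the 36 items, separately for a, b, and c and for each program.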