Lectures 10 and 11. Bayesian and Quasi-Bayesian Methods. Fall, 2007.

Cite as: Victor Chernozhukov, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Outline:
1. Informal Review of Main Ideas
2. Monte-Carlo Examples
3. Empirical Examples
4. Formal Theory

References:

Theory and Practice:
Van der Vaart, A Lecture Note on Bayesian Estimation.
Chernozhukov and Hong, An MCMC Approach to Classical Estimation, JoE, 2003.
Liu, Tian, and Wei (2007), JASA (forthcoming).

Computing:
Chib, Handbook of Econometrics, Vol. 5.
Geweke, Handbook of Econometrics, Vol. 5.

Part 1. Informal Introduction

An Example (Chernozhukov & Hong, 2003)

Consider the GMM estimator for the Instrumental Quantile Regression Model:

E[(τ − 1(Y ≤ D′θ)) Z] = 0.

Maximize the criterion

Ln(θ) = −(n/2) gn(θ)′ W(θ) gn(θ)

with

gn(θ) = (1/n) Σ_{i=1}^n (τ − 1(Y_i ≤ D_i′θ)) Z_i

and

W(θ) = [τ(1 − τ) (1/n) Σ_{i=1}^n Z_i Z_i′]^{-1}.

Computing the extremum is problematic, and smoothing does not seem to help much.

Some other examples: nonlinear IV and GMM problems with many local optima; Powell's censored median regression.

Overview of Results:

1. Interpret pn(θ) ∝ exp(Ln(θ)) as a posterior density, summarizing the beliefs about the parameter. This encompasses the Bayesian learning approach, where Ln(θ) is a proper log-likelihood. Otherwise, treat Ln(θ) as a "replacement" or "quasi" log-likelihood, and the posterior as a quasi-posterior.

2.
A primary example of an estimator is the posterior mean

θ̂ = ∫_Θ θ pn(θ) dθ,

which is defined by integration, not by optimization. This estimator is asymptotically equivalent to the extremum estimator θ*:

√n (θ̂ − θ*) = op(1),

and therefore it is as efficient as θ* in large samples. For the likelihood framework this was formally shown by Bickel and Yahav (1969) and many others. For GMM and other non-likelihood frameworks, it was formally shown by Chernozhukov and Hong (2003, JoE) and by Liu, Tian, and Wei (2007, JASA).

3. When a generalized information equality holds, namely when the negative Hessian of the objective function equals the variance of the score,

−∇²_θ Q(θ₀) = var[√n ∇_θ Q̂(θ₀)],  i.e.  J(θ₀) = Ω(θ₀),

where J(θ₀) := −∇²_θ Q(θ₀) and Ω(θ₀) := var[√n ∇_θ Q̂(θ₀)], we can use the posterior quantiles of the beliefs pn(θ) for inference. This is true for regular likelihood problems and for optimally weighted GMM.

4. Numerical integration can be done using Markov Chain Monte Carlo (MCMC), which creates a dependent sample S = (θ(1), ..., θ(k)), a Markov chain whose marginal distribution is C · exp(Ln(θ)). This is done using the Metropolis-Hastings or Gibbs algorithms, or a combination of the two. The posterior mean of S gives θ̂; quantiles of the chain S can also be used to form confidence regions.

[Figure omitted. Image courtesy of MIT OpenCourseWare.]
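Points 2 and 4 above suggest a simple recipe: approximate the posterior mean by the average of the chain, and confidence intervals by componentwise chain quantiles. A minimal sketch (the function name `chain_summaries` and the demo chain are illustrative, not from the notes):

```python
import numpy as np

def chain_summaries(S, alpha=0.05):
    """Quasi-Bayesian summaries from an MCMC sample S (k x dim array
    of draws from the quasi-posterior). The posterior mean approximates
    the extremum estimator; componentwise quantiles give interval
    estimates (valid for inference under the information equality)."""
    S = np.atleast_2d(S)
    theta_hat = S.mean(axis=0)                  # posterior mean of the chain
    lo = np.quantile(S, alpha / 2, axis=0)      # lower posterior quantile
    hi = np.quantile(S, 1 - alpha / 2, axis=0)  # upper posterior quantile
    return theta_hat, lo, hi

# Demo with a stand-in chain (i.i.d. draws in place of real MCMC output):
rng = np.random.default_rng(0)
S = rng.normal(loc=[1.0, -2.0], scale=0.1, size=(10_000, 2))
theta_hat, lo, hi = chain_summaries(S)
```

Per point 3, the quantile-based intervals are justified only when the generalized information equality holds (e.g., regular likelihood or optimally weighted GMM).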
Formal Definitions

Sample criterion function: Ln(θ).

Motivation of extremum estimators: learning by analogy. Q̂ = n^{-1} Ln → Q, so the extremum estimator → θ₀, the extremum of Q.

Ln(θ) is generally not a log-likelihood function, but

pn(θ) = exp[Ln(θ)] π(θ) / ∫_Θ exp[Ln(θ′)] π(θ′) dθ′   (1)

or simply

pn(θ) ∝ exp[Ln(θ)] π(θ)   (2)

is a proper density for θ. Treat it as a form of posterior beliefs. Here, π(θ) is a weight or prior density that is strictly positive and continuous over Θ.

Recall that proper posteriors arise from a formal Bayesian learning model:

pn(θ) = f(θ | data) = f(data | θ) π(θ) / f(data) ∝ f(data | θ) π(θ).

An example of an estimator based on the posterior is the posterior mean:

θ̂ = ∫_Θ θ pn(θ) dθ.

Definition 1. The class of quasi-Bayesian estimators (QBE) minimizes the expected loss under the belief pn:

θ̂ = arg min_{d∈Θ} E_{pn}[ρ(d − θ)] = arg min_{d∈Θ} ∫_Θ ρ(d − θ) pn(θ) dθ,   (3)

where ρ(u) is a penalty or bernoullian loss function:

i. ρ(u) = ||u||², a squared loss;
ii. ρ(u) = Σ_{j=1}^k |u_j|, an absolute deviation loss;
iii. ρ(u) = Σ_{j=1}^k (τ_j − 1(u_j ≤ 0)) u_j, a check (quantile) loss function.

Loss (i) gives the posterior mean as the optimal decision.
Loss (ii) gives the posterior (componentwise) median as the optimal decision.
Loss (iii) gives the posterior (componentwise) quantiles as the optimal decision.

Computation

Definition 2 ((Random Walk) Metropolis-Hastings). Given a quasi-posterior density pn(θ), known up to a constant, generate (θ(0), ..., θ(B)) by:

1. Choose a starting value θ(0).
2. For j = 0, 1, ..., B − 1, generate ξ(j) = θ(j) + η(j), η(j) ∼ N(0, σ²I), and set θ(j+1) = ξ(j) with probability ρ(θ(j), ξ(j)), and θ(j+1) = θ(j) otherwise.
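Definition 2 can be implemented in a few lines. A sketch assuming the standard random-walk Metropolis acceptance probability ρ(x, ξ) = min{1, exp(L(ξ)) / exp(L(x))}, where L is the log quasi-posterior Ln + log π known up to a constant (the function name `random_walk_mh` is illustrative):

```python
import numpy as np

def random_walk_mh(log_post, theta0, sigma, B, rng=None):
    """Random-walk Metropolis-Hastings.
    log_post: log of the quasi-posterior, known up to a constant.
    Proposal: xi(j) = theta(j) + eta(j), eta(j) ~ N(0, sigma^2 I).
    Accept xi(j) with probability min{1, p(xi)/p(theta)}."""
    rng = rng or np.random.default_rng()
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    lp = log_post(theta)
    chain = np.empty((B + 1, theta.size))
    chain[0] = theta
    for j in range(B):
        xi = theta + sigma * rng.normal(size=theta.size)  # propose a move
        lp_xi = log_post(xi)
        # accept/reject on the log scale to avoid overflow
        if np.log(rng.uniform()) < lp_xi - lp:
            theta, lp = xi, lp_xi
        chain[j + 1] = theta
    return chain
```

For example, `random_walk_mh(lambda t: -0.5 * t @ t, [0.0], 2.4, 20_000)` draws an (autocorrelated) sample from the standard normal; early draws are typically discarded as burn-in before computing chain summaries.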
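The quasi-likelihood Ln(θ) from the instrumental quantile regression example in Part 1 can be coded directly and used as the log quasi-posterior (under a flat prior) for MCMC. A minimal sketch (the function name `iqr_criterion` and the simulated data are illustrative assumptions):

```python
import numpy as np

def iqr_criterion(theta, Y, D, Z, tau):
    """GMM quasi-log-likelihood for instrumental quantile regression:
    L_n(theta) = -(n/2) g_n' W g_n, with
    g_n(theta) = (1/n) sum_i (tau - 1{Y_i <= D_i'theta}) Z_i and
    W = [tau(1-tau) (1/n) sum_i Z_i Z_i']^{-1}."""
    n = len(Y)
    resid = tau - (Y <= D @ theta)       # tau minus the indicator, per obs.
    g = Z.T @ resid / n                  # sample moment vector g_n(theta)
    W = np.linalg.inv(tau * (1 - tau) * (Z.T @ Z) / n)  # weight matrix
    return -0.5 * n * g @ W @ g

# Simulated check (exogenous case, D = Z, median tau = 0.5):
rng = np.random.default_rng(0)
n = 500
Z = np.column_stack([np.ones(n), rng.normal(size=n)])
D = Z
Y = D @ np.array([1.0, 2.0]) + rng.normal(size=n)
L_true = iqr_criterion(np.array([1.0, 2.0]), Y, D, Z, 0.5)
```

Because the indicator makes Ln(θ) a discontinuous step function, gradient-based maximization is unreliable; sampling from exp(Ln(θ)) and taking the chain mean sidesteps the optimization entirely.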