14.385 Nonlinear Econometrics

Lecture 4.

Theory: Asymptotic Distribution of GMM/Nonlinear IV

Application: Revisit probits and logits. Multinomial choice.

Topics to be covered in TA Session:
Testing (Parallels OLS)
Variance Estimation (Parallels OLS)

Cite as: Victor Chernozhukov, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Asymptotic Normality of GMM and Nonlinear IV.

Recall Idea: Estimate parameters by setting sample moments to be close to their population counterparts.

Definitions:
$\beta$: $p \times 1$ parameter vector, with true value $\beta_0$.
$g_i(\beta) = g(z_i, \beta)$: $m \times 1$ vector of functions of the $i$th data observation $z_i$ and the parameter.

Model (or moment restriction):
$$E[g_i(\beta_0)] = 0.$$

Definitions:
$\hat g(\beta) := E_n[g_i(\beta)]$: sample averages.
$\hat A$: $m \times m$ positive definite matrix.

Lead Examples:
IV: $g_i(\beta) = (Y_i - X_i'\beta) Z_i$, $\hat A = \widehat{\mathrm{Var}}[g_i(\beta_0)]^{-1}$
NIV: $g_i(\beta) = f(Y_i, X_i, \beta) Z_i$, $\hat A = \widehat{\mathrm{Var}}[g_i(\beta_0)]^{-1}$
MLE: $g_i(\beta) = \nabla \ln f(Z_i, \beta)$, $\hat A = I$
M: $g_i(\beta) = \nabla m(Z_i, \beta)$, $\hat A = I$.

GMM ESTIMATOR:
$$\hat\beta = \arg\min_{\beta} \hat g(\beta)' \hat A \hat g(\beta).$$

This is a special case of an extremum estimator, so the arguments of the previous type can be applied to get the following result.

ASYMPTOTIC NORMALITY OF GMM: If the data are i.i.d. or stationary strongly mixing with rate greater than two, $\hat\beta \to_p \beta_0$, and (i) $\beta_0$ is in the interior of the parameter set over which minimization occurs; (ii) $g_i(\beta)$ is continuously differentiable on a neighborhood $N$ of $\beta_0$; (iii) $E[\sup_{\beta \in N} \|\nabla g_i(\beta)\|]$ is finite; (iv) $\hat A \to_p A$ positive definite and $G'AG$ is nonsingular, for $G = E[\nabla g_i(\beta_0)]$; (v) for i.i.d.
data, $\Omega = E[g_i(\beta_0) g_i(\beta_0)']$ is finite, and for mixing data $\Omega = \lim_n \mathrm{Var}[\sqrt{n}\,\hat g(\beta_0)]$ exists and is finite, then
$$\sqrt{n}(\hat\beta - \beta_0) \to_d N(0, V), \qquad V = (G'AG)^{-1} G'A\Omega AG (G'AG)^{-1}.$$

Proof: For $\hat G = \nabla \hat g(\hat\beta)$, we have the FOC,
$$0 = \hat G' \hat A \hat g(\hat\beta).$$
We can expand it as
$$0 = \hat G' \hat A \{\hat g(\beta_0) + \nabla \hat g(\bar\beta)[\hat\beta - \beta_0]\},$$
where $\nabla \hat g(\bar\beta)$ denotes the matrix each of whose rows is evaluated at a (row-dependent) point $\bar\beta$ located on the line joining $\beta_0$ and $\hat\beta$, and solve for
$$\sqrt{n}(\hat\beta - \beta_0) = -[\hat G' \hat A \nabla \hat g(\bar\beta)]^{-1} \hat G' \hat A \sqrt{n}\,\hat g(\beta_0).$$
By the ULLN and the CMT, we have that
$$-[\hat G' \hat A \nabla \hat g(\bar\beta)]^{-1} \hat G' \hat A \to_p -(G'AG)^{-1} G'A.$$
Application of the CLT gives us
$$\sqrt{n}\,\hat g(\beta_0) \to_d N(0, \Omega).$$
Applying Slutsky, we get
$$\sqrt{n}(\hat\beta - \beta_0) \to_d -(G'AG)^{-1} G'A \cdot N(0, \Omega).$$

Notes:

(1) $\hat A$ affects $V$ only through $\mathrm{plim}(\hat A)$.

(2) If $m = p$, then
$$V = G^{-1} \Omega (G^{-1})',$$
and $A$ drops out. Thus, the choice of the matrix $A$ has no effect on the asymptotic variance in this case. We have $m = p$ for MLE, M-estimators, and “exactly identified” GMM. We have $m > p$ for “overidentified” GMM.

(3) The optimal choice of $A$ is given by $A \propto \Omega^{-1}$, in which case
$$V = (G'\Omega^{-1}G)^{-1}.$$

(4) MLE. Assume i.i.d. data. For MLE, under correct specification:
$$g(z, \theta) = \nabla \ln f(z, \theta), \quad G = E\nabla^2 \ln f(z, \theta_0), \quad \Omega = \mathrm{Var}[\nabla \ln f(z, \theta_0)], \quad \text{and } -G = \Omega,$$
so that
$$V = -G^{-1} = \Omega^{-1}.$$
The fact that $-G = \Omega$ is known as the information matrix equality, which holds under some regularity conditions.

(5) M-estimators, including MLE under incorrect specification. We have
$$g(z, \theta) = \nabla m(z, \theta), \quad G = E\nabla^2 m(z, \theta_0), \quad \Omega = \mathrm{Var}[\nabla m(z, \theta_0)],$$
so that
$$V = G^{-1} \Omega G^{-1}.$$
This is known as Huber's sandwich formula or, more simply, the robust variance-covariance matrix.

(6) Linear IV.
We illustrate the plausibility of the conditions with linear IV: the interior parameter condition (i) holds by assumption; continuous differentiability (ii) holds by linearity of
$$g_i(\beta) = Z_i(Y_i - X_i'\beta)$$
in $\beta$; the dominance condition (iii) holds as long as second moments exist, by
$$\nabla g_i(\beta) = -Z_i X_i', \qquad G = -E[Z_i X_i'];$$
(iv) holds as long as $A$ is nonsingular and $G = -E[Z_i X_i']$ has full column rank; and (v) holds as long as
$$\Omega = E[(Y_i - X_i'\beta_0)^2 Z_i Z_i']$$
is finite. Then, with $A \propto \Omega^{-1}$,
$$V = (G'\Omega^{-1}G)^{-1}.$$

(7) As stated in the theorem, i.i.d. sampling can be replaced by strict stationarity and strong mixing with rate larger than 2, that is, provided the mixing coefficients $\alpha(j)$ go to zero at rate $j^{-\alpha}$ as $j \to \infty$, for $\alpha > 2$, in which case the limit variance takes the form
$$\Omega = \lim_{n \to \infty} \mathrm{Var}[\sqrt{n}\,\hat g(\beta_0)].$$

(8) However, in many cases we can use the CLT for martingale difference sequences, which is much more relevant in dynamic economic applications. For instance, in Hansen and Singleton, $g_i(\beta_0)$ being a martingale difference sequence is implied by economic assumptions. In this case, the limit variance is the same as in the i.i.d. case:
$$\Omega = \mathrm{Var}[g_i(\beta_0)] = E[g_i(\beta_0) g_i(\beta_0)'].$$

Revisit Maximum Likelihood Estimation for Binary Choice:
$$\ln f_i(\beta) = y_i \log F(x_i'\beta) + (1 - y_i) \log(1 - F(x_i'\beta)),$$
$$\nabla \ln f_i(\beta) = \left( \frac{y_i - F(x_i'\beta)}{F(x_i'\beta)\,(1 - F(x_i'\beta))} \right) f(x_i'\beta)\, x_i.$$

CLT for the score, writing $F_i = F(x_i'\beta_0)$ and $f_i = f(x_i'\beta_0)$, where $f$ is the derivative of $F$:
$$\sqrt{n}\, E_n \nabla \ln f_i(\beta_0) \to_d N\left(0,\; E\left[\frac{f_i^2}{F_i(1 - F_i)} x_i x_i'\right]\right),$$
since $\mathrm{Var}(y_i - F_i \mid x_i) = F_i(1 - F_i)$. Therefore:
$$\sqrt{n}(\hat\beta - \beta_0) \to_d N\left(0,\; \left[E \frac{f_i^2}{F_i(1 - F_i)} x_i x_i'\right]^{-1}\right).$$
This variance formula is valid only under correct specification.
Under incorrect specification, treat the estimator as an M-estimator and use Huber's robust sandwich formula.

Probit and Logit Examples:

For probit, with $\phi_i = \phi(x_i'\beta_0)$ and $\Phi_i = \Phi(x_i'\beta_0)$:
$$\sqrt{n}(\hat\beta - \beta_0) \to_d N\left(0,\; \left[E \frac{\phi_i^2}{\Phi_i(1 - \Phi_i)} x_i x_i'\right]^{-1}\right).$$

For logit:
$$\sqrt{n}(\hat\beta - \beta_0) \to_d N\left(0,\; \left[E\, \Lambda(x_i'\beta_0)(1 - \Lambda(x_i'\beta_0))\, x_i x_i'\right]^{-1}\right).$$

You can also do NLS by minimizing
$$\tilde\beta = \arg\min_{\beta} E_n (y_i - F(x_i'\beta))^2.$$
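The logit variance formula can be checked numerically. The following is a minimal sketch, not part of the original notes: it simulates a correctly specified logit, computes the MLE by Newton's method, and compares the information-based variance $(-G)^{-1}$ with the sandwich $G^{-1}\Omega G^{-1}$. The function names `logit_mle` and `logit_variances`, the simulated design, and the sample size are all illustrative assumptions.

```python
import numpy as np

def logistic(t):
    """Logistic CDF Lambda(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))

def logit_mle(X, y, tol=1e-10, max_iter=100):
    """Newton-Raphson on the average logit log-likelihood (hypothetical helper)."""
    n = len(y)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = logistic(X @ beta)
        score = X.T @ (y - p) / n                         # E_n of the score (y_i - Lambda_i) x_i
        hess = -(X * (p * (1 - p))[:, None]).T @ X / n    # E_n of the Hessian
        step = np.linalg.solve(hess, score)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def logit_variances(X, y, beta):
    """Estimates of V for sqrt(n)(beta_hat - beta_0): (information-based, sandwich)."""
    n = len(y)
    p = logistic(X @ beta)
    G = -(X * (p * (1 - p))[:, None]).T @ X / n   # sample analogue of G = E[Hessian]
    s = X * (y - p)[:, None]                      # per-observation scores
    Omega = s.T @ s / n                           # sample analogue of Omega = Var[score]
    G_inv = np.linalg.inv(G)
    V_info = np.linalg.inv(-G)                    # valid under correct specification
    V_sandwich = G_inv @ Omega @ G_inv            # Huber's robust formula
    return V_info, V_sandwich

# Simulated, correctly specified logit (the design is an assumption for illustration)
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta0 = np.array([0.5, -1.0])
y = (rng.uniform(size=n) < logistic(X @ beta0)).astype(float)

beta_hat = logit_mle(X, y)
V_info, V_sandwich = logit_variances(X, y, beta_hat)
std_err = np.sqrt(np.diag(V_info) / n)            # standard errors for beta_hat
```

Under correct specification the two variance estimates should be close, reflecting the information matrix equality; with a misspecified link only the sandwich version would remain appropriate.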
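Returning to the linear IV example in note (6), the optimally weighted estimator and its variance $V = (G'\Omega^{-1}G)^{-1}$ can be sketched in a few lines. This is an illustrative two-step implementation under assumptions that are not in the notes: the function name `linear_iv_gmm`, the first-step weight $(E_n[Z_i Z_i'])^{-1}$, and the simulated design with one endogenous regressor and heteroskedastic errors are all hypothetical choices.

```python
import numpy as np

def linear_iv_gmm(y, X, Z):
    """Two-step GMM for g_i(beta) = Z_i (y_i - X_i' beta) (hypothetical helper)."""
    n = len(y)
    Szx = Z.T @ X / n      # E_n[Z_i X_i'], the sample analogue of -G
    Szy = Z.T @ y / n      # E_n[Z_i y_i]

    def solve(A):
        # Closed-form minimizer of g_hat(beta)' A g_hat(beta),
        # where g_hat(beta) = Szy - Szx @ beta
        return np.linalg.solve(Szx.T @ A @ Szx, Szx.T @ A @ Szy)

    # Step 1: a preliminary consistent estimate with A = (E_n[Z Z'])^{-1}
    beta1 = solve(np.linalg.inv(Z.T @ Z / n))
    # Step 2: estimate Omega = E[(y - X'beta)^2 Z Z'] and reweight with A = Omega^{-1}
    resid = y - X @ beta1
    Omega = (Z * (resid ** 2)[:, None]).T @ Z / n
    A = np.linalg.inv(Omega)
    beta2 = solve(A)
    V = np.linalg.inv(Szx.T @ A @ Szx)   # (G' Omega^{-1} G)^{-1}; the sign of G cancels
    return beta2, V

# Simulated overidentified design: m = 3 instruments, p = 2 parameters (an assumption)
rng = np.random.default_rng(1)
n = 50000
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
v = rng.normal(size=n)
x = Z[:, 1] + 0.5 * Z[:, 2] + v                                   # endogenous regressor
u = 0.8 * v + rng.normal(size=n) * (1 + 0.5 * np.abs(Z[:, 1]))    # E[u Z] = 0, heteroskedastic
X = np.column_stack([np.ones(n), x])
beta0 = np.array([1.0, 2.0])
y = X @ beta0 + u

beta_hat, V = linear_iv_gmm(y, X, Z)
std_err = np.sqrt(np.diag(V) / n)    # standard errors for beta_hat
```

Because $\hat A$ affects $V$ only through its probability limit, any consistent first-step estimate could be used to build $\hat\Omega$; the particular first-step weight here is a convenience, not a requirement.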