14.385 Fall, 2007 Nonlinear Econometrics

Lecture 2.
Theory: Consistency for Extremum Estimators
Modeling: Probit, Logit, and Other Links.

Cite as: Victor Chernozhukov, course materials for 14.385 Nonlinear Econometric Analysis, Fall 2007. MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].

Example: Binary Choice Models. The latent outcome is defined by the equation

$$y_i^* = x_i'\beta - \varepsilon_i, \quad \varepsilon_i \sim F(\cdot).$$

We observe $y_i = 1(y_i^* \geq 0)$. The cdf $F$ is completely known. Then

$$P(y_i = 1 \mid x_i) = P(\varepsilon_i \leq x_i'\beta \mid x_i) = F(x_i'\beta).$$

We can then estimate $\beta$ using the log-likelihood function

$$\hat{Q}(\beta) = E_n[\, y_i \ln F(x_i'\beta) + (1 - y_i) \ln(1 - F(x_i'\beta)) \,].$$

The resulting MLE are CAN (consistent and asymptotically normal) and efficient, under regularity conditions.

The story is that a consumer may have two choices: the utility from one choice is $y_i^* = x_i'\beta - \varepsilon_i$ and the utility from the other is normalized to be 0. We need to estimate the parameters of the latent utility based on the observed choice frequencies.

Estimands: The key parameters to estimate are $P[y_i = 1 \mid x_i]$ and the partial effects of the kind

$$\frac{\partial P[y_i = 1 \mid x_i]}{\partial x_{ij}} = f(x_i'\beta)\,\beta_j,$$

where $f = F'$. These parameters are functionals of the parameter $\beta$ and the link $F$.

Choices of F:
• Logit: $F(t) = \Lambda(t) = \exp(t) / (1 + \exp(t))$.
• Probit: $F(t) = \Phi(t)$, the standard normal cdf.
• Cauchy: $F(t) = C(t) = \frac{1}{2} + \frac{1}{\pi} \arctan(t)$, the Cauchy cdf.
• Gosset: $F(t) = T(t, v)$, the cdf of a t-variable with $v$ degrees of freedom.

Choice of $F(\cdot)$ can be important, especially in the tails. The predictions of small and large probabilities by different models may differ substantially. For example, the Probit and Cauchit links, $\Phi(t)$ and $C(t)$, have drastically different tail behavior and give different predictions for the same value of the index $t$. See Figure 1 for a theoretical example and Figure 2 for an empirical example. In the housing example, $y_i$ records whether a person owns a house or not, and $x_i$ consists of an intercept and the person's income.

[Figure 1: P-P Plots for Various Links F. Horizontal axis: Normal Probability. Vertical axis: Probability. Curves: Normal, Cauchy, Logit.]

[Figure 2: Predicted Probabilities of Owning a House. Horizontal axis: Normal Prediction. Vertical axis: Prediction. Curves: Normal, Cauchy, Logit, Linear.]

Choice of $F(\cdot)$ can be less important when using flexible functional forms. Indeed, for any $F$ we can approximate

$$P[y_i = 1 \mid x] \approx F[P(x)'\beta],$$

where $P(x)$ is a collection of approximating functions, for example, splines, powers, or other series, as we know from basic approximation theory. This point is illustrated in the following Figure, which deals with the earlier housing example, but uses a flexible functional form with $P(x)$ generated as a cubic spline with ten degrees of freedom.
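The approximation P[y = 1 | x] ≈ F[P(x)'β] can be checked numerically in a stylized case. The following is a minimal sketch, not the computation behind the lecture's figures (those were produced in R): it takes the probit probability Φ(t) as the target and asks how closely a logit link can reproduce it when the index is built from powers of t. The function names and the coefficient values are assumptions introduced for illustration only; the coefficients are fixed approximation constants, not estimates from the housing data.

```python
import math


def probit_cdf(t):
    """Standard normal cdf Phi(t), computed from the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def logit_cdf(t):
    """Logistic cdf Lambda(t) = exp(t) / (1 + exp(t)), in an overflow-safe form."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def logit_with_powers(t, beta):
    """Lambda(P(t)'beta) with P(t) = (t, t^2, t^3, ...), i.e. powers of t."""
    index = sum(b * t ** (k + 1) for k, b in enumerate(beta))
    return logit_cdf(index)


def max_abs_gap(beta, grid):
    """Largest |Lambda(P(t)'beta) - Phi(t)| over the grid of index values."""
    return max(abs(logit_with_powers(t, beta) - probit_cdf(t)) for t in grid)


# Illustrative coefficients (assumed constants, not estimated from any data).
BETA_LINEAR = (1.702,)               # index = 1.702 * t
BETA_CUBIC = (1.5976, 0.0, 0.07056)  # index = 1.5976 * t + 0.07056 * t^3

# Grid of index values from -4 to 4 in steps of 0.01.
GRID = [i / 100.0 for i in range(-400, 401)]

if __name__ == "__main__":
    print("max gap, linear index:", max_abs_gap(BETA_LINEAR, GRID))
    print("max gap, cubic index: ", max_abs_gap(BETA_CUBIC, GRID))
```

Under these assumed coefficients, the logit link with a linear index stays within about 0.01 of the probit probabilities, and adding the cubic term shrinks the gap by more than an order of magnitude. With a rich enough P(x), the fitted probabilities depend much less on which F is used.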
Flexible functional forms are valuable for this reason: they make the choice of $F(\cdot)$ matter less. But flexibility of course has its own price: additional parameters lead to increased estimation variance.

[Figure: Flexibly Predicted Probabilities of Owning a House. Horizontal axis: Normal Prediction. Vertical axis: Prediction. Curves: Normal, Cauchy, Logit, Linear.]

Discussion: Choice of the right model is a hard and very important problem in statistical analysis. Using flexible links, e.g. the t-link vs. the probit link, comes at a cost of additional parameters. Using flexible expansions inside the links also requires additional parameters. Flexibility reduces the approximation error (bias), but typically increases estimation variance. Thus an optimal choice has to balance these terms. A useful device for choosing the best performing models is cross-validation.

Reading: A very nice reference is R. Koenker and J. Yoon (2006), who provide a systematic treatment of the links, beyond logits and probits, with an application to propensity score matching. The estimates plotted in the figures were produced using the glm function in R. The Cauchy, Gosset, and other links for glm were implemented by Koenker and Yoon (2006).

References:
Koenker, R. and J. Yoon (2006), "Parametric Links for Binary Response Models." (http://www.econ.uiuc.edu/~roger/research/links/links.html)

1. Extremum Consistency

Extremum estimator

$$\hat{\theta} = \arg\min_{\theta \in \Theta} Q(\theta).$$

As we have seen in