UNC-Chapel Hill BIOS 760 - CHAPTER 5- MAXIMUM LIKELIHOOD ESTIMATION


CHAPTER 5: MAXIMUM LIKELIHOOD ESTIMATION

Introduction to Efficient Estimation

• Goal
The MLE is an asymptotically efficient estimator under some regularity conditions.

• Basic setting
Suppose $X_1, \ldots, X_n$ are i.i.d. from $P_{\theta_0}$ in the model $\mathcal{P}$.
(A0) $P_\theta = P_{\theta^*}$ implies $\theta = \theta^*$ (identifiability).
(A1) $P_\theta$ has a density function $p_\theta$ with respect to a dominating $\sigma$-finite measure $\mu$.
(A2) The set $\{x : p_\theta(x) > 0\}$ does not depend on $\theta$.

• MLE definition
$$L_n(\theta) = \prod_{i=1}^n p_\theta(X_i), \qquad l_n(\theta) = \sum_{i=1}^n \log p_\theta(X_i).$$
$L_n(\theta)$ and $l_n(\theta)$ are called the likelihood function and the log-likelihood function of $\theta$, respectively. An estimator $\hat\theta_n$ of $\theta_0$ is the maximum likelihood estimator (MLE) of $\theta_0$ if it maximizes the likelihood function $L_n(\theta)$.

Ad Hoc Arguments

$$\sqrt{n}(\hat\theta_n - \theta_0) \to_d N(0, I(\theta_0)^{-1})$$
– Consistency: $\hat\theta_n \to \theta_0$ (no asymptotic bias).
– Efficiency: the asymptotic variance attains the efficiency bound $I(\theta_0)^{-1}$.

• Consistency
Definition 5.1 Let $P$ be a probability measure and let $Q$ be another measure on $(\Omega, \mathcal{A})$, with densities $p$ and $q$ with respect to a $\sigma$-finite measure $\mu$ ($\mu = P + Q$ always works), where $P(\Omega) = 1$ and $Q(\Omega) \le 1$. Then the Kullback–Leibler information $K(P, Q)$ is
$$K(P, Q) = E_P\left[\log \frac{p(X)}{q(X)}\right].$$

Proposition 5.1 $K(P, Q)$ is well-defined and $K(P, Q) \ge 0$, with $K(P, Q) = 0$ if and only if $P = Q$.

Proof By Jensen's inequality,
$$K(P, Q) = E_P\left[-\log \frac{q(X)}{p(X)}\right] \ge -\log E_P\left[\frac{q(X)}{p(X)}\right] = -\log Q(\Omega) \ge 0.$$
Since $-\log$ is strictly convex, equality holds if and only if $q(x)/p(x)$ is constant almost surely with respect to $P$ and $Q(\Omega) = 1$, which forces $p = q$ a.e. $\mu$, i.e., $P = Q$.

• Why is the MLE consistent?
$\hat\theta_n$ maximizes $l_n(\theta)$, so
$$\frac{1}{n}\sum_{i=1}^n \log p_{\hat\theta_n}(X_i) \ge \frac{1}{n}\sum_{i=1}^n \log p_{\theta_0}(X_i).$$
Suppose $\hat\theta_n \to \theta^*$. Then we would expect both sides to converge, giving
$$E_{\theta_0}[\log p_{\theta^*}(X)] \ge E_{\theta_0}[\log p_{\theta_0}(X)],$$
which implies $K(P_{\theta_0}, P_{\theta^*}) \le 0$. From Proposition 5.1, $P_{\theta_0} = P_{\theta^*}$. From (A0), $\theta^* = \theta_0$. That is, $\hat\theta_n$ converges to $\theta_0$.

• Why is the MLE efficient?
Suppose $\hat\theta_n \to \theta_0$. $\hat\theta_n$ solves the likelihood (or score) equations
$$\dot{l}_n(\hat\theta_n) = \sum_{i=1}^n \dot{l}_{\hat\theta_n}(X_i) = 0.$$
A Taylor expansion at $\theta_0$ gives
$$-\sum_{i=1}^n \dot{l}_{\theta_0}(X_i) = \sum_{i=1}^n \ddot{l}_{\theta^*}(X_i)(\hat\theta_n - \theta_0),$$
where $\theta^*$ is between $\theta_0$ and $\hat\theta_n$. Hence
$$\sqrt{n}(\hat\theta_n - \theta_0) = -\left\{\frac{1}{n}\sum_{i=1}^n \ddot{l}_{\theta^*}(X_i)\right\}^{-1}\left\{\frac{1}{\sqrt{n}}\sum_{i=1}^n \dot{l}_{\theta_0}(X_i)\right\}.$$
Since $n^{-1}\sum_{i=1}^n \ddot{l}_{\theta^*}(X_i) \to -I(\theta_0)$, $\sqrt{n}(\hat\theta_n - \theta_0)$ is asymptotically equivalent to
$$\frac{1}{\sqrt{n}}\sum_{i=1}^n I(\theta_0)^{-1}\dot{l}_{\theta_0}(X_i).$$
Then $\hat\theta_n$ is an asymptotically linear estimator of $\theta_0$ with influence function $I(\theta_0)^{-1}\dot{l}_{\theta_0} = \tilde{l}(\cdot, P_{\theta_0} \mid \theta, \mathcal{P})$.

Consistency Results

Theorem 5.1 (Consistency with dominating function) Suppose that
(a) $\Theta$ is compact;
(b) $\log p_\theta(x)$ is continuous in $\theta$ for all $x$;
(c) there exists a function $F(x)$ such that $E_{\theta_0}[F(X)] < \infty$ and $|\log p_\theta(x)| \le F(x)$ for all $x$ and $\theta$.
Then $\hat\theta_n \to_{a.s.} \theta_0$.

Proof For any sample $\omega \in \Omega$, the sequence $\hat\theta_n$ lies in the compact set $\Theta$, so by choosing a subsequence, $\hat\theta_n \to \theta^*$. If $\frac{1}{n}\sum_{i=1}^n l_{\hat\theta_n}(X_i) \to E_{\theta_0}[l_{\theta^*}(X)]$, then since
$$\frac{1}{n}\sum_{i=1}^n l_{\hat\theta_n}(X_i) \ge \frac{1}{n}\sum_{i=1}^n l_{\theta_0}(X_i),$$
we obtain $E_{\theta_0}[l_{\theta^*}(X)] \ge E_{\theta_0}[l_{\theta_0}(X)]$, hence $\theta^* = \theta_0$. Done!
It remains to show $\mathbb{P}_n[l_{\hat\theta_n}(X)] \equiv \frac{1}{n}\sum_{i=1}^n l_{\hat\theta_n}(X_i) \to E_{\theta_0}[l_{\theta^*}(X)]$. Since $E_{\theta_0}[l_{\hat\theta_n}(X)] \to E_{\theta_0}[l_{\theta^*}(X)]$ by (b), (c), and the dominated convergence theorem, it suffices to show
$$\left|\mathbb{P}_n[l_{\hat\theta_n}(X)] - E_{\theta_0}[l_{\hat\theta_n}(X)]\right| \to 0.$$

We can even prove the following uniform convergence result:
$$\sup_{\theta \in \Theta}\left|\mathbb{P}_n[l_\theta(X)] - E_{\theta_0}[l_\theta(X)]\right| \to 0.$$
Define
$$\psi(x, \theta, \rho) = \sup_{|\theta' - \theta| < \rho}\left(l_{\theta'}(x) - E_{\theta_0}[l_{\theta'}(X)]\right).$$
Since $l_\theta$ is continuous, $\psi(x, \theta, \rho)$ is measurable, and by the dominated convergence theorem, $E_{\theta_0}[\psi(X, \theta, \rho)]$ decreases to $E_{\theta_0}\left[l_\theta(X) - E_{\theta_0}[l_\theta(X)]\right] = 0$ as $\rho \downarrow 0$. Therefore, for any $\epsilon > 0$ and any $\theta \in \Theta$, there exists a $\rho_\theta$ such that
$$E_{\theta_0}[\psi(X, \theta, \rho_\theta)] < \epsilon.$$
The union of the balls $\{\theta' : |\theta' - \theta| < \rho_\theta\}$ covers $\Theta$.
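The uniform convergence claim being proved here can also be checked numerically. The sketch below is illustrative and not part of the notes: it assumes a $N(\theta, 1)$ model with $\theta_0 = 0.5$ and compact $\Theta = [-2, 2]$ (the model, grid, and sample sizes are all my choices), and evaluates $\sup_\theta |\mathbb{P}_n[l_\theta] - E_{\theta_0}[l_\theta]|$ over a grid of $\theta$ for two sample sizes.

```python
import math
import random

# Illustrative model (an assumption, not from the notes): X_i ~ N(theta0, 1),
# with log-density l_theta(x) = -0.5*(x - theta)^2 - 0.5*log(2*pi).
THETA0 = 0.5

def log_density(x, theta):
    return -0.5 * (x - theta) ** 2 - 0.5 * math.log(2 * math.pi)

def expected_log_density(theta):
    # Closed form: E_{theta0}[l_theta(X)] = -0.5*((theta0 - theta)^2 + 1) - 0.5*log(2*pi)
    return -0.5 * ((THETA0 - theta) ** 2 + 1) - 0.5 * math.log(2 * math.pi)

def sup_discrepancy(n, rng, grid_size=201):
    """Approximate sup over Theta = [-2, 2] of |P_n[l_theta] - E_{theta0}[l_theta]|."""
    xs = [rng.gauss(THETA0, 1.0) for _ in range(n)]
    sup = 0.0
    for j in range(grid_size):
        theta = -2.0 + 4.0 * j / (grid_size - 1)
        pn = sum(log_density(x, theta) for x in xs) / n  # empirical measure P_n[l_theta]
        sup = max(sup, abs(pn - expected_log_density(theta)))
    return sup

rng = random.Random(0)
d_small = sup_discrepancy(100, rng)
d_large = sup_discrepancy(10000, rng)
print(f"sup discrepancy: n=100 -> {d_small:.3f}, n=10000 -> {d_large:.3f}")
```

As $n$ grows the supremum shrinks, consistent with $\sup_{\theta \in \Theta}|\mathbb{P}_n[l_\theta(X)] - E_{\theta_0}[l_\theta(X)]| \to 0$ on a compact parameter set.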
By the compactness of $\Theta$, there exist finitely many points $\theta_1, \ldots, \theta_m$ such that
$$\Theta \subset \cup_{i=1}^m \{\theta' : |\theta' - \theta_i| < \rho_{\theta_i}\}.$$
Therefore,
$$\sup_{\theta \in \Theta}\left\{\mathbb{P}_n[l_\theta(X)] - E_{\theta_0}[l_\theta(X)]\right\} \le \max_{1 \le i \le m} \mathbb{P}_n[\psi(X, \theta_i, \rho_{\theta_i})],$$
and by the strong law of large numbers,
$$\limsup_n \sup_{\theta \in \Theta}\left\{\mathbb{P}_n[l_\theta(X)] - E_{\theta_0}[l_\theta(X)]\right\} \le \max_{1 \le i \le m} E_{\theta_0}[\psi(X, \theta_i, \rho_{\theta_i})] \le \epsilon.$$
Hence $\limsup_n \sup_{\theta \in \Theta}\left\{\mathbb{P}_n[l_\theta(X)] - E_{\theta_0}[l_\theta(X)]\right\} \le 0$. Applying the same argument to $-l_\theta$,
$$\limsup_n \sup_{\theta \in \Theta}\left\{-\mathbb{P}_n[l_\theta(X)] + E_{\theta_0}[l_\theta(X)]\right\} \le 0.$$
Therefore
$$\sup_{\theta \in \Theta}\left|\mathbb{P}_n[l_\theta(X)] - E_{\theta_0}[l_\theta(X)]\right| \to 0.$$

Theorem 5.2 (Wald's consistency) Suppose $\Theta$ is compact and $\theta \mapsto l_\theta(x) = \log p_\theta(x)$ is upper-semicontinuous for all $x$, in the sense that $\limsup_{\theta' \to \theta} l_{\theta'}(x) \le l_\theta(x)$. Suppose for every sufficiently small ball $U \subset \Theta$, $E_{\theta_0}[\sup_{\theta' \in U} l_{\theta'}(X)] < \infty$. Then $\hat\theta_n \to_p \theta_0$.

Proof Since $E_{\theta_0}[l_{\theta_0}(X)] > E_{\theta_0}[l_{\theta'}(X)]$ for any $\theta' \ne \theta_0$, there exists a ball $U_{\theta'}$ containing $\theta'$ such that
$$E_{\theta_0}[l_{\theta_0}(X)] > E_{\theta_0}\left[\sup_{\theta^* \in U_{\theta'}} l_{\theta^*}(X)\right].$$
Otherwise, there would exist a sequence $\theta^*_m \to \theta'$ with $E_{\theta_0}[l_{\theta_0}(X)] \le E_{\theta_0}[l_{\theta^*_m}(X)]$. Since $l_{\theta^*_m}(x) \le \sup_{\theta^* \in U'} l_{\theta^*}(x)$, where $U'$ is a ball satisfying the integrability condition, Fatou's lemma and upper-semicontinuity give
$$\limsup_m E_{\theta_0}[l_{\theta^*_m}(X)] \le E_{\theta_0}\left[\limsup_m l_{\theta^*_m}(X)\right] \le E_{\theta_0}[l_{\theta'}(X)],$$
so $E_{\theta_0}[l_{\theta_0}(X)] \le E_{\theta_0}[l_{\theta'}(X)]$, a contradiction!

For any $\epsilon > 0$, the balls $U_{\theta'}$ cover the compact set $\Theta \cap \{\theta' : |\theta' - \theta_0| \ge \epsilon\}$, so there exists a finite subcover $U_1, \ldots, U_m$. Then
$$P(|\hat\theta_n - \theta_0| > \epsilon) \le P\left(\sup_{|\theta' - \theta_0| > \epsilon} \mathbb{P}_n[l_{\theta'}(X)] \ge \mathbb{P}_n[l_{\theta_0}(X)]\right)$$
$$\le P\left(\max_{1 \le i \le m} \mathbb{P}_n\left[\sup_{\theta' \in U_i} l_{\theta'}(X)\right] \ge \mathbb{P}_n[l_{\theta_0}(X)]\right) \le \sum_{i=1}^m P\left(\mathbb{P}_n\left[\sup_{\theta' \in U_i} l_{\theta'}(X)\right] \ge \mathbb{P}_n[l_{\theta_0}(X)]\right).$$
Since
$$\mathbb{P}_n\left[\sup_{\theta' \in U_i} l_{\theta'}(X)\right] \to_{a.s.} E_{\theta_0}\left[\sup_{\theta' \in U_i} l_{\theta'}(X)\right] < E_{\theta_0}[l_{\theta_0}(X)],$$
the right-hand side converges to zero. Hence $\hat\theta_n \to_p \theta_0$.

Asymptotic Efficiency Result

Theorem 5.3 Suppose that the model $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$ is Hellinger differentiable at an inner point $\theta_0$ of $\Theta \subset \mathbb{R}^k$. Furthermore, suppose that there exists a measurable function $F$ with $E_{\theta_0}[F^2] < \infty$ such that for every $\theta_1$ and $\theta_2$ in a neighborhood of $\theta_0$,
$$|\log p_{\theta_1}(x) - \log p_{\theta_2}(x)| \le F(x)|\theta_1 - \theta_2|.$$
If the Fisher information matrix $I(\theta_0)$ is nonsingular and $\hat\theta_n$ is consistent, then
$$\sqrt{n}(\hat\theta_n - \theta_0) = \frac{1}{\sqrt{n}}\sum_{i=1}^n I(\theta_0)^{-1}\dot{l}_{\theta_0}(X_i) + o_P(1).$$
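The conclusion of Theorem 5.3 can be checked by simulation. The sketch below is an illustration rather than part of the notes: it assumes the exponential model $p_\theta(x) = \theta e^{-\theta x}$, for which the MLE has the closed form $\hat\theta_n = 1/\bar{X}_n$ and $I(\theta) = 1/\theta^2$, so $\sqrt{n}(\hat\theta_n - \theta_0)$ should be approximately $N(0, \theta_0^2)$; the values of $\theta_0$, $n$, and the number of replications are arbitrary choices.

```python
import math
import random
from statistics import fmean, pvariance

# Illustrative model (an assumption): X ~ Exponential with rate theta,
# p_theta(x) = theta * exp(-theta * x).  The MLE is hat_theta_n = 1 / mean(X),
# and the Fisher information is I(theta) = 1 / theta^2, so the efficiency
# bound I(theta0)^{-1} equals theta0^2.
THETA0 = 2.0
N = 2000      # sample size per replication
REPS = 500    # Monte Carlo replications

rng = random.Random(1)
scaled_errors = []
for _ in range(REPS):
    sample = [rng.expovariate(THETA0) for _ in range(N)]
    mle = 1.0 / fmean(sample)  # closed-form MLE for the exponential rate
    scaled_errors.append(math.sqrt(N) * (mle - THETA0))

# The empirical variance of sqrt(n)*(mle - theta0) should be near theta0^2 = 4.
var_hat = pvariance(scaled_errors)
print(f"empirical variance of sqrt(n)*(mle - theta0): {var_hat:.2f}")
print(f"efficiency bound I(theta0)^-1 = theta0^2:     {THETA0 ** 2:.2f}")
```

The centered, scaled errors have an empirical variance close to the efficiency bound, matching the asymptotic linearity $\sqrt{n}(\hat\theta_n - \theta_0) \approx n^{-1/2}\sum_i I(\theta_0)^{-1}\dot{l}_{\theta_0}(X_i)$.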

