MIT 18.466 - Lecture Notes

March 22, 2003

2.3 Minimal sufficiency and the Lehmann-Scheffé property.

If a statistic T, for example a real-valued statistic, is sufficient for a family P of laws, then for any other statistic U, say with values in R^k, the statistic (T, U) with values in R^{k+1} is also sufficient. In terms of σ-algebras, if the family P is defined on a σ-algebra B and a sub-σ-algebra A is sufficient for P, then any other σ-algebra C with A ⊂ C ⊂ B is also sufficient. But since the idea of sufficiency is data reduction, one would like to have a sufficient σ-algebra as small as possible, or a sufficient statistic of dimension as small as possible.

A σ-algebra A will be called minimal sufficient for P if it is sufficient and, for any sufficient σ-algebra C and each A ∈ A, there is a C ∈ C such that 1_A = 1_C a.s. for each P ∈ P. So A is included in C up to almost sure equality of sets. Then a statistic T with values in a measurable space (Y, F) will be called minimal sufficient iff T^{-1}(F) is a minimal sufficient σ-algebra.

Example. Let P be a family of symmetric laws on R, such as the set of all normal laws N(0, σ²), σ > 0. Considering n = 1 for simplicity, the identity function x is (always) a sufficient statistic, but it is not minimal sufficient in this case, where |x| is also sufficient.

For dominated families, a minimal sufficient σ-algebra always exists:

2.3.1 Theorem (Bahadur). Let P be a family of laws on a measurable space (S, B), dominated by a σ-finite measure μ. Then there is always a minimal sufficient σ-algebra A for P. Also, there is such a σ-algebra A containing all sets B in B for which P(B) = 0 for all P ∈ P, and such an A is unique.

Proof. Take a law ν equivalent to P from Lemma 2.1.6(d). Choose densities dP/dν for all P ∈ P and let A be the smallest σ-algebra for which all the dP/dν are measurable. Then by Theorem 2.1.4, A is sufficient. Next, let C be any sufficient σ-algebra for P. Let A_1 be the collection of sets A in A for which there exists a C ∈ C with 1_C = 1_A a.s.
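The example above can be checked numerically: the density of N(0, σ²) at x depends on x only through |x| (equivalently x²), so by the factorization criterion |x| carries all the information about σ. The following is a small sketch, not part of the notes; the helper name `normal_density` is an illustrative choice.

```python
import math

def normal_density(x, sigma):
    """Density of N(0, sigma^2) at x."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

# The density is an even function of x for every sigma, so it factors through
# |x|: the statistic |x| is sufficient for the family {N(0, sigma^2) : sigma > 0}.
for sigma in (0.5, 1.0, 2.0):
    for x in (0.3, 1.7):
        assert math.isclose(normal_density(x, sigma), normal_density(-x, sigma))
```

Since the likelihood never distinguishes x from −x, reducing the data from x to |x| loses nothing about σ, which is why the identity statistic is not minimal here.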
for every P ∈ P. Then A_1 is a σ-algebra, since if 1_A = 1_C a.s. for all P ∈ P, the same is true for the complements, with 1 − 1_A = 1 − 1_C, and if 1_{A(j)} = 1_{C(j)} a.s. for all P, the same is true for the unions of the sequences A(j) and C(j). By the proof of Theorem 2.1.4, (c) implies (b), each dP/dν must equal a C-measurable function a.s. (ν). Thus the sets {dP/dν > t} for each P ∈ P and real number t are in A_1. Since these sets generate A (RAP, Theorem 4.1.6), A_1 = A and A is minimal sufficient.

By choice of ν, the collection Z of sets B (in B) with P(B) = 0 for all P ∈ P is the same as {B ∈ B : ν(B) = 0}. The σ-algebra Y generated by Z and A is easily seen to be minimal sufficient. If we start with any other minimal sufficient σ-algebra C in place of A, it follows easily from the minimal sufficiency of both A and C that the resulting Y will be the same. So Y is uniquely determined. □

The σ-algebra Y just treated may be called "the minimal sufficient σ-algebra," although as a collection of sets it is actually the largest of all minimal sufficient σ-algebras. An idea closely related to minimal sufficiency is the Lehmann-Scheffé property, as follows:

Definition. Given a collection P of laws on a measurable space (S, B), a sub-σ-algebra A ⊂ B will be called a Lehmann-Scheffé (LS) σ-algebra for P iff whenever f is an A-measurable function with ∫ f dP = 0 for all P ∈ P, we have f = 0 a.s. for all P ∈ P. A statistic will be called an LS statistic for P iff the smallest σ-algebra for which it is measurable is LS for P.

Lehmann and Scheffé called σ-algebras satisfying their property complete. This is different from the notion of a complete class of decision rules. Also, in measure theory, a σ-algebra S may be called complete for a measure μ if it contains all subsets of sets of μ-measure 0. The Lehmann-Scheffé property is, evidently, quite different. So it seemed appropriate to name it here after its authors. It is equivalent to uniqueness of A-measurable unbiased estimators:

2.3.2 Theorem.
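The LS (completeness) property can fail in a concrete way for the symmetric family of the earlier example: f(x) = x is measurable for the identity statistic and has ∫ f dP = 0 under every N(0, σ²), yet f is not 0 a.s. The numerical sketch below, not from the notes, checks this; the helper names `normal_density` and `integral` are illustrative, and the midpoint-rule integration is a crude stand-in for the exact odd-symmetry argument.

```python
import math

def normal_density(x, sigma):
    """Density of N(0, sigma^2) at x."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def integral(f, sigma, lo=-30.0, hi=30.0, n=100_000):
    """Midpoint-rule approximation of the integral of f dP for P = N(0, sigma^2)."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) * normal_density(lo + (i + 0.5) * h, sigma)
               for i in range(n)) * h

# f(x) = x integrates to 0 under every law in the family, but f is not 0 a.s.,
# so the sigma-algebra generated by the identity statistic is not LS here.
for sigma in (0.5, 1.0, 2.0):
    assert abs(integral(lambda x: x, sigma)) < 1e-6
```

This is exactly why the identity statistic is "too large" for this family: it supports a nonzero unbiased estimator of 0.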
A sub-σ-algebra A is LS for P if and only if for every real-valued function g on P having an unbiased A-measurable estimator, the estimator is unique up to equality a.s. for all P ∈ P.

Proof. The constant function 0 always trivially has an unbiased estimator, namely the statistic which is identically 0 (and so measurable for any A). Uniqueness of this estimator up to equality a.s. for all P ∈ P yields the definition of the LS property. Conversely, if A is LS for P, suppose T and U are both A-measurable and both unbiased estimators of a function g on P. Then T − U has integral 0 for all P ∈ P, so T − U = 0 a.s. and T = U a.s. for all P ∈ P. □

Some σ-algebras are LS just because they are small. For example, the trivial σ-algebra {∅, S} is always LS. For any measurable set A, the σ-algebra {∅, A, A^c, S} is LS for P unless P(A) is the same for all P in P. So LS σ-algebras will be interesting only when they are large enough. One useful measure of being large enough is sufficiency. If a function g on P has an unbiased estimator U and A is a sufficient σ-algebra, then T = E_P(U | A), which doesn't depend on P ∈ P by Theorem 2.1.8, is an unbiased, A-measurable estimator as in Corollary 2.2.3 and Theorem 2.3.2. From here on, the LS property will be considered for sufficient σ-algebras. These must be minimal sufficient:

2.3.3 Theorem. For any collection P of laws on a measurable space (S, B), any LS, sufficient σ-algebra C is minimal sufficient.

Proof. If not, there is a sufficient σ-algebra A and a set C ∈ C such that there is no set A in A for which 1_C = 1_A a.s. for all P ∈ P. Let f := E_P(1_C | A) for all P ∈ P by Theorem 2.1.8. For some P ∈ P, f is not equal to 1_C a.s. (P); otherwise letting A = {f = 1} would give a contradiction.
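The step of replacing an unbiased estimator U by T = E_P(U | A) for a sufficient σ-algebra A can be illustrated by Monte Carlo. In the sketch below (an illustration under assumptions not in the notes), the family is N(μ, 1) with n = 5 observations, U = X_1 is a crude unbiased estimator of μ, and conditioning on the sample mean (a sufficient statistic for this family) yields T = X̄; conditioning preserves unbiasedness and here strictly reduces variance.

```python
import random
import statistics

random.seed(0)
n, reps, mu, sigma = 5, 20_000, 3.0, 1.0
u_vals, t_vals = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    u_vals.append(sample[0])        # U = X_1: unbiased for mu but high-variance
    t_vals.append(sum(sample) / n)  # T = E(U | sample mean) = sample mean

# Both estimators are (approximately) unbiased; conditioning on the
# sufficient statistic does not increase variance.
assert abs(statistics.mean(u_vals) - mu) < 0.05
assert abs(statistics.mean(t_vals) - mu) < 0.05
assert statistics.variance(t_vals) < statistics.variance(u_vals)
```

The estimated variances come out near 1 for U and near 1/5 for T, matching Var(X_1) = 1 and Var(X̄) = 1/n.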
We have ∫ (1_C − f) f dP = 0, as can be seen by taking the conditional expectation of the integrand with respect to A and bringing f outside the conditional expectation by Lemma 2.1.1, as in the proof of Theorem 2.1.4; or see RAP, Theorem 10.2.9 (conditional expectation is an orthogonal projection in L²). It follows from this orthogonality that

P(C) = ∫ 1_C² dP = ∫ (1_C − f)² dP + ∫ f² dP >
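The orthogonality and the resulting Pythagorean decomposition can be verified exactly on a finite probability space, where the conditional expectation with respect to a partition-generated σ-algebra is just a block average. The sketch below is an illustration, not from the notes; the six-point space, the two-block partition, and the set C are arbitrary choices.

```python
from fractions import Fraction

# Finite sample space S = {0,...,5} with the uniform law; the sub-sigma-algebra
# A is generated by the partition {{0,1,2},{3,4,5}}; C is an arbitrary set not
# in A. All of these are illustrative assumptions.
S = range(6)
p = {x: Fraction(1, 6) for x in S}
blocks = [{0, 1, 2}, {3, 4, 5}]
C = {1, 2, 3}

def block_of(x):
    return next(b for b in blocks if x in b)

def cond_exp_indicator(x):
    """f(x) = E(1_C | A)(x): the average of 1_C over the block containing x."""
    b = block_of(x)
    return sum(p[y] for y in b & C) / sum(p[y] for y in b)

indicator = lambda x: 1 if x in C else 0
f = {x: cond_exp_indicator(x) for x in S}

# Orthogonality: the integral of (1_C - f) * f is exactly 0.
assert sum(p[x] * (indicator(x) - f[x]) * f[x] for x in S) == 0

# Pythagoras: P(C) = integral of (1_C - f)^2 dP + integral of f^2 dP.
PC = sum(p[x] for x in C)
assert PC == (sum(p[x] * (indicator(x) - f[x]) ** 2 for x in S)
              + sum(p[x] * f[x] ** 2 for x in S))
```

Exact rational arithmetic via `fractions.Fraction` makes both identities hold with equality rather than up to floating-point error, mirroring the L² projection argument in the proof.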
