March 22, 2003

2.3 Minimal sufficiency and the Lehmann-Scheffé property.

If a statistic T, for example a real-valued statistic, is sufficient for a family P of laws, then for any other statistic U, say with values in R^k, the statistic (T, U) with values in R^{k+1} is also sufficient. In terms of σ-algebras, if the family P is defined on a σ-algebra B and a sub-σ-algebra A is sufficient for P, then any other σ-algebra C with A ⊂ C ⊂ B is also sufficient. But since the idea of sufficiency is data reduction, one would like to have a sufficient σ-algebra as small as possible, or a sufficient statistic of dimension as small as possible.

A σ-algebra A will be called minimal sufficient for P if it is sufficient and, for any sufficient σ-algebra C and each A ∈ A, there is a C ∈ C such that 1_A = 1_C a.s. for each P ∈ P. So A is included in C up to almost sure equality of sets. A statistic T with values in a measurable space (Y, F) will then be called minimal sufficient iff T^{-1}(F) is a minimal sufficient σ-algebra.

Example. Let P be a family of symmetric laws on R, such as the set of all normal laws N(0, σ²), σ > 0. Considering n = 1 for simplicity, the identity function x is (always) a sufficient statistic, but it is not minimal sufficient in this case, since |x| is also sufficient.

For dominated families, a minimal sufficient σ-algebra always exists:

2.3.1 Theorem (Bahadur). Let P be a family of laws on a measurable space (S, B), dominated by a σ-finite measure µ. Then there is always a minimal sufficient σ-algebra A for P. Also, there is such a σ-algebra A containing all sets B in B for which P(B) = 0 for all P ∈ P, and such an A is unique.

Proof. Take a law ν equivalent to P from Lemma 2.1.6(d). Choose densities dP/dν for all P ∈ P and let A be the smallest σ-algebra for which all the dP/dν are measurable. Then by Theorem 2.1.4, A is sufficient. Next, let C be any sufficient σ-algebra for P. Let A₁ be the collection of sets A in A for which there exists a C ∈ C with 1_C = 1_A a.s.
for every P ∈ P. Then A₁ is a σ-algebra: if 1_A = 1_C a.s. for all P ∈ P, the same is true for the complements, since 1 − 1_A = 1 − 1_C; and if 1_{A(j)} = 1_{C(j)} a.s. for all P, the same is true for the unions of the sequences A(j) and C(j). By the proof that (c) implies (b) in Theorem 2.1.4, each dP/dν must equal a C-measurable function a.s. (ν). Thus the sets {dP/dν > t} for each P ∈ P and real number t are in A₁. Since these sets generate A (RAP, Theorem 4.1.6), A₁ = A and A is minimal sufficient.

By the choice of ν, the collection Z of sets B in B with P(B) = 0 for all P ∈ P is the same as {B ∈ B : ν(B) = 0}. The σ-algebra Y generated by Z and A is easily seen to be minimal sufficient. If we start with any other minimal sufficient σ-algebra C in place of A, it follows easily from the minimal sufficiency of both A and C that the resulting Y will be the same. So Y is uniquely determined. □

The σ-algebra Y just treated may be called "the minimal sufficient σ-algebra," although as a collection of sets it is actually the largest of all minimal sufficient σ-algebras.

An idea closely related to minimal sufficiency is the Lehmann-Scheffé property, as follows:

Definition. Given a collection P of laws on a measurable space (S, B), a sub-σ-algebra A ⊂ B will be called a Lehmann-Scheffé (LS) σ-algebra for P iff whenever f is an A-measurable function with ∫ f dP = 0 for all P ∈ P, we have f = 0 a.s. for all P ∈ P. A statistic will be called an LS statistic for P iff the smallest σ-algebra for which it is measurable is LS for P.

Lehmann and Scheffé called σ-algebras satisfying their property complete. This is different from the notion of a complete class of decision rules. Also, in measure theory, a σ-algebra S may be called complete for a measure µ if it contains all subsets of sets of µ-measure 0. The Lehmann-Scheffé property is, evidently, quite different. So, it seemed appropriate to name it here after its authors. It is equivalent to uniqueness of A-measurable unbiased estimators:

2.3.2 Theorem.
A sub-σ-algebra A is LS for P if and only if, for every real-valued function g on P having an unbiased A-measurable estimator, the estimator is unique up to equality a.s. for all P ∈ P.

Proof. The constant function 0 always trivially has an unbiased estimator, namely the statistic which is identically 0 (and so measurable for any A). Uniqueness of this estimator up to equality a.s. for all P ∈ P yields the definition of the LS property. Conversely, if A is LS for P, suppose T and U are both A-measurable unbiased estimators of a function g on P. Then T − U has integral 0 for all P ∈ P, so T − U = 0 a.s. and T = U a.s. for all P ∈ P. □

Some σ-algebras are LS just because they are small. For example, the trivial σ-algebra {∅, S} is always LS. For any measurable set A, the σ-algebra {∅, A, Aᶜ, S} is LS for P unless P(A) is the same for all P in P. So LS σ-algebras will be interesting only when they are large enough. One useful measure of being large enough is sufficiency. If a function g on P has an unbiased estimator U and A is a sufficient σ-algebra, then T = E_P(U | A), which doesn't depend on P ∈ P by Theorem 2.1.8, is an unbiased, A-measurable estimator, as in Corollary 2.2.3 and Theorem 2.3.2. From here on, the LS property will be considered for sufficient σ-algebras. These must be minimal sufficient:

2.3.3 Theorem. For any collection P of laws on a measurable space (S, B), any LS, sufficient σ-algebra C is minimal sufficient.

Proof. If not, there is a sufficient σ-algebra A and a set C ∈ C such that there is no set A ∈ A for which 1_C = 1_A a.s. for all P ∈ P. Let f := E_P(1_C | A) for all P ∈ P by Theorem 2.1.8. For some P ∈ P, f is not equal to 1_C a.s. (P); otherwise, letting A = {f = 1} would give a contradiction.
We have ∫ (1_C − f) f dP = 0, as can be seen by taking the conditional expectation of the integrand with respect to A and bringing f outside the conditional expectation by Lemma 2.1.1, as in the proof of Theorem 2.1.4; or see RAP, Theorem 10.2.9 (conditional expectation is an orthogonal projection in L²). It follows from this orthogonality that

P(C) = ∫ 1_C² dP = ∫ (1_C − f)² dP + ∫ f² dP > ∫ f² dP

for any P ∈ P such that f is not equal to 1_C a.s. (P), since for such a P the first term on the right is strictly positive.
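The Example above can be checked numerically. The sketch below (assuming NumPy is available; the grid sizes and tolerances are illustrative choices, not part of the text) verifies two facts for the family N(0, σ²), σ > 0: the density at x equals the density at −x for every σ, so the likelihood depends on x only through |x| and |x| is sufficient; and the odd function f(x) = x, which is measurable for the σ-algebra generated by the identity statistic, has ∫ f dP = 0 for every member of the family without being 0 a.s., so that larger σ-algebra fails the LS (completeness) property.

```python
import numpy as np

def normal_density(x, sigma):
    """Density of N(0, sigma^2) at x."""
    return np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

xs = np.linspace(-10.0, 10.0, 4001)   # symmetric integration grid
dx = xs[1] - xs[0]
sigmas = [0.5, 1.0, 2.0]

# 1) p_sigma(x) == p_sigma(-x) for every sigma: the density is a function
#    of |x| alone, so the likelihood factors through |x| and |x| is
#    sufficient; the identity statistic x is sufficient but not minimal.
symmetric = all(np.allclose(normal_density(xs, s), normal_density(-xs, s))
                for s in sigmas)

# 2) LS failure for the identity statistic: f(x) = x is not 0 a.s., yet
#    its integral E_sigma[X] (approximated by a Riemann sum) is 0 for
#    every sigma, since each law is symmetric about 0.
means = [(xs * normal_density(xs, s)).sum() * dx for s in sigmas]
mean_zero = all(abs(m) < 1e-8 for m in means)

print(symmetric, mean_zero)   # prints: True True
```

This matches the text's point that LS σ-algebras are only interesting when paired with data reduction: the full Borel σ-algebra here is sufficient but not LS, while the smaller σ-algebra generated by |x| is the one of interest.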