MIT OpenCourseWare
http://ocw.mit.edu

18.443 Statistics for Applications, Spring 2009

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.


18.443 Bayes least-squares estimation

In the lecture on Monday the 9th I gave material on Bayes estimation that I haven't found in Rice, so here is a printed form of it. First, here is a very simple fact.

Proposition 1. For any random variable X with E(X²) < +∞, the unique constant c that minimizes E((X − c)²) is c = E(X).

Proof. E((X − c)²) = E(X²) − 2cE(X) + c² is a quadratic polynomial in c which goes to +∞ as c → ±∞, so it is minimized where its derivative with respect to c, namely 2c − 2E(X), equals 0, that is, at c = E(X). Q.E.D.

Now suppose we are given a prior density π(θ) for a continuous parameter θ, so that π(θ) > 0 and ∫_{−∞}^{∞} π(θ) dθ = 1. (If θ is m-dimensional we would need an m-fold multiple integral, but nothing essential in the following would be changed, so I'll write as if m = 1.) Let f(x, θ) be a likelihood function for one observation, which may be either a probability mass function if x is discrete or a density function if x is continuous. If we have i.i.d. observations X = (X_1, ..., X_n), we get a likelihood function f(X, θ) = ∏_{j=1}^{n} f(X_j, θ). The posterior density π_X(θ) is gotten by multiplying the likelihood function by the prior and then normalizing:

    π_X(θ) = f(X, θ) π(θ) / ∫_{−∞}^{∞} f(X, φ) π(φ) dφ.

Suppose we want to estimate a function g(θ). Then for an estimator V(X), the mean-square error (MSE) is E_θ[(V(X) − g(θ))²]. For a prior π, the risk is the expectation of the MSE with respect to that prior, namely

(1)    ∫_{−∞}^{∞} E_θ[(V(X) − g(θ))²] π(θ) dθ.

A Bayes estimator of g(θ) for the given prior is one that minimizes the risk, provided its risk is finite.

Theorem 2. For a given likelihood function f(X, θ) and prior density π, if there exists some estimator U(X) of the given g(θ) that has finite risk for the given π, then there exists a Bayes estimator, given by the expectation of g(θ) with respect to the posterior distribution,

(2)    T(X) = ∫_{−∞}^{∞} g(θ) π_X(θ) dθ.

The Bayes estimator is essentially unique, in the sense that any Bayes estimator must equal this T(X) with probability 1.

Proof. We would like to minimize (1). Let's write out the E_θ. I'll use ∫ ··· dX as a shorthand for

    ∫_{−∞}^{∞} ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} ··· dX_1 dX_2 ··· dX_n,

where the integrals are replaced by sums in case X is discrete; the method of proof is essentially the same. Then (1) becomes

(3)    ∫_{−∞}^{∞} [ ∫ (V(X) − g(θ))² f(X, θ) dX ] π(θ) dθ.

Since the integrand is nonnegative and the integrals are well-defined (possibly infinite), we can interchange the two integrals, and (3) becomes

(4)    ∫ [ ∫_{−∞}^{∞} (V(X) − g(θ))² f(X, θ) π(θ) dθ ] dX.

We can further write f(X, θ) π(θ) = π_X(θ) ∫_{−∞}^{∞} f(X, φ) π(φ) dφ by definition of the posterior density π_X, and the latter factor does not depend on θ, so we can take it outside the integral with respect to θ. Then (4) becomes

(5)    ∫ [ ∫_{−∞}^{∞} (V(X) − g(θ))² π_X(θ) dθ ] [ ∫_{−∞}^{∞} f(X, φ) π(φ) dφ ] dX.

In the inner integral with respect to θ in (5), X is fixed and g(θ) is a random variable with respect to the posterior density π_X(θ). To minimize this inner integral we need to choose V(X), which is a constant for fixed X. By Proposition 1, the correct constant is V(X) = T(X) as in (2). Since the risk is finite for some estimator by assumption, the minimum risk must be finite, so T(X) in (2) indeed gives a Bayes estimator. The essential uniqueness follows from the uniqueness in Proposition 1. Q.E.D.
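As a rough numerical check of Theorem 2, here is a short Python sketch. The particular model is an illustrative choice of mine, not taken from these notes: a Beta(2, 3) prior, a Binomial(10, θ) likelihood, and g(θ) = θ. The sketch estimates the risk (1) by Monte Carlo for the posterior-mean estimator T(X) from (2) and for the maximum-likelihood estimator X/n; the posterior mean should come out with the smaller risk.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative conjugate setup (my choice, not from the notes):
# theta ~ Beta(a, b) prior, X | theta ~ Binomial(n, theta), g(theta) = theta.
a, b, n = 2.0, 3.0, 10

def posterior_mean(x):
    # With a Beta(a, b) prior, the posterior is Beta(a + x, b + n - x),
    # so T(X) in (2) is its mean (a + x) / (a + b + n).
    return (a + x) / (a + b + n)

def mle(x):
    # A competing estimator: the maximum-likelihood estimate X / n.
    return x / n

def bayes_risk(estimator, reps=200_000):
    # Monte Carlo version of the risk (1): draw theta from the prior,
    # draw X from the model given theta, and average (V(X) - g(theta))^2.
    theta = rng.beta(a, b, size=reps)
    x = rng.binomial(n, theta)
    return np.mean((estimator(x) - theta) ** 2)

print("risk of posterior mean T(X):", bayes_risk(posterior_mean))
print("risk of MLE X/n:            ", bayes_risk(mle))

The posterior mean (a + X)/(a + b + n) shrinks X/n toward the prior mean a/(a + b), and Theorem 2 says that for this prior no estimator can have smaller risk.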
As mentioned in the lecture of 3/9, if θ = λ is a Poisson parameter and the prior π is a gamma density, then the posterior is also in the gamma family and its expectation is known in closed form (a numerical sketch of this case follows the references below). The same thing occurred using a beta prior for the binomial p in a problem set problem.

Some texts give a different formulation of Theorem 2, in which they say that the Bayes estimator is the conditional expectation of g(θ) given X, T(X) = E(g(θ) | X). That is correct in case ∫ |g(θ)| π(θ) dθ < +∞, but integrals with respect to the posterior distributions may be finite even if they are not with respect to the prior. There is more about that in Section 2.6 of the 18.466 OCW notes, but that section is far from self-contained, so otherwise it isn't recommended reading in this course.

REFERENCES

Dudley, R. M. (2003). Mathematical Statistics, 18.466 lecture notes, Spring 2003. On the MIT OpenCourseWare (OCW) website, 2004.
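To make the Poisson/gamma case mentioned above concrete, here is a small Python sketch. The shape and rate values and the sample size are made-up illustrative numbers, and parametrizing the gamma prior by a rate β is my choice: with a Gamma(α, β) prior on λ (shape α, rate β) and n i.i.d. Poisson(λ) observations, the posterior is Gamma(α + ΣX_j, β + n), so the Bayes estimator (2) is its mean (α + ΣX_j)/(β + n).

import numpy as np

rng = np.random.default_rng(1)

# Illustrative numbers (not from the notes): Gamma(alpha, beta) prior on lambda
# with shape alpha and rate beta, and n i.i.d. Poisson(lambda) observations.
alpha, beta, n = 3.0, 2.0, 25

lam = rng.gamma(alpha, 1.0 / beta)   # numpy's gamma takes a scale, so scale = 1 / rate
x = rng.poisson(lam, size=n)

# The posterior is Gamma(alpha + sum(x), beta + n); the Bayes estimator (2) is its mean.
bayes_estimate = (alpha + x.sum()) / (beta + n)

print("lambda drawn from the prior:", lam)
print("sample mean (MLE):          ", x.mean())
print("posterior mean T(X):        ", bayes_estimate)

Note that (α + ΣX_j)/(β + n) is a weighted average of the prior mean α/β and the sample mean, with weights β/(β + n) and n/(β + n), so the Bayes estimator moves toward the maximum-likelihood estimate as n grows.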

