MIT OpenCourseWare http://ocw.mit.edu
18.443 Statistics for Applications, Spring 2009
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

18.443 FACTS ABOUT NORMAL DISTRIBUTIONS AND SAMPLE STATISTICS

First, here is a known fact about normal distributions.

Theorem 1. If $X$ and $Y$ are independent random variables with normal distributions, $X \sim N(\mu, \sigma^2)$ and $Y \sim N(\nu, \tau^2)$, then $X + Y$ is also normal, with $X + Y \sim N(\mu + \nu, \sigma^2 + \tau^2)$.

This is proved in the "addnormals.pdf" handout posted on the course website. Paper copies aren't being distributed in class because we assume many of you know this fact from a probability course.

The next fact is stated early in Section 6.3 of Rice, p. 195. For any $X_1, \ldots, X_n$, $\bar{X}$ is defined as the sample mean $\bar{X} := (X_1 + \cdots + X_n)/n$.

Theorem 2. Let $X_1, \ldots, X_n$ be i.i.d. $N(\mu, \sigma^2)$. Then $\bar{X} \sim N(\mu, \sigma^2/n)$.

Proof. For any distribution $F$ having finite mean $\mu$ and variance $\sigma^2$, if $X_1, \ldots, X_n$ are i.i.d. ($F$), then $\bar{X}$ has mean $\mu$ and variance $\sigma^2/n$. So the only problem is to show that $\bar{X}$ has a normal distribution in this case. Now, $S_n$ defined as $X_1 + \cdots + X_n$ has a normal distribution, specifically $N(n\mu, n\sigma^2)$, by Theorem 1 and induction. Multiplying by the constant $1/n$ gives $\bar{X}$, which then has the stated distribution, Q.E.D.

In statistics, the mean $\mu$ and variance $\sigma^2$ of a distribution may be unknown and can be estimated from the data by the sample mean $\bar{X}$ and the sample variance

$$S^2 = s_X^2 = \frac{1}{n-1} \sum_{j=1}^{n} (X_j - \bar{X})^2,$$

defined for $n \ge 2$, respectively. Here $S^2$ is Rice's notation and $s_X^2$ is my preferred notation. Scientific calculators often use $s_x$ ("sample standard deviation"), whose square is the sample variance $s_X^2$.

The next fact includes Corollary A and Theorem B in Section 6.3 of Rice. It gives the distribution of $s_X^2$ (depending on $\sigma^2$) and its independence of $\bar{X}$ in the normal case. Rice uses the notation $S^2$ instead of $s_X^2$.

Theorem 3. If $X_1, \ldots, X_n$ are i.i.d. $N(\mu, \sigma^2)$, $n \ge 2$, then
(a) $\bar{X}$ and $s_X^2$ are independent random variables;
(b) $(n-1)s_X^2/\sigma^2$ has a $\chi^2(n-1)$ distribution.

Proof. Let $Y_j = X_j - \mu$ for $j = 1, \ldots, n$. Then $\bar{Y} = \bar{X} - \mu$ and $s_Y^2 = s_X^2$. So we can assume $\mu = 0$. It's convenient to make a rotation of coordinates in $n$-space. Let the standard basis vectors be $\delta_i = \{\delta_{ij}\}_{j=1}^{n}$, where $\delta_{ij} = 1$ for $i = j$ and $0$ for $i \ne j$. The first element of the new basis will be $e_1 = (1/\sqrt{n}, \ldots, 1/\sqrt{n})$. This does have length 1. Then we can always find further orthonormal basis vectors $e_2, \ldots, e_n$, for example $e_2 = (1/\sqrt{2}, -1/\sqrt{2}, 0, \ldots, 0)$, $e_3 = (1/\sqrt{6}, 1/\sqrt{6}, -2/\sqrt{6}, 0, \ldots, 0)$, etc. For any two vectors $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ (with respect to the standard basis) we have the usual dot product $x \cdot y = \sum_{j=1}^{n} x_j y_j$, with the squared length given by $|x|^2 = x \cdot x$.

Now, for the random vector $X = (X_1, \ldots, X_n)$ we have $\bar{X} = X \cdot e_1/\sqrt{n}$, and $(\bar{X}, \ldots, \bar{X}) = (X \cdot e_1)e_1$, which is the projection of $X$ to the $e_1$ axis. The lengths of vectors and their dot products are preserved by rotations of coordinates, so

$$\sum_{j=1}^{n} (X_j - \bar{X})^2 = |X - (X \cdot e_1)e_1|^2 = \sum_{i=2}^{n} (X \cdot e_i)^2.$$

Recall that $\exp(y)$ is a notation for $e^y$. Since $X_1, \ldots, X_n$ are i.i.d. $N(0, \sigma^2)$, their joint density is

$$(\sigma\sqrt{2\pi})^{-n} \prod_{j=1}^{n} \exp\left(-x_j^2/(2\sigma^2)\right) = (\sigma\sqrt{2\pi})^{-n} \exp\left(-|x|^2/(2\sigma^2)\right).$$

This distribution is invariant under any rotation of coordinates (change of orthonormal basis), specifically $|x|^2 = (x \cdot e_1)^2 + (x \cdot e_2)^2 + \cdots + (x \cdot e_n)^2$. Thus $X \cdot e_1, \ldots, X \cdot e_n$ are i.i.d. $N(0, \sigma^2)$ and the $X \cdot e_i/\sigma$ are i.i.d. $N(0, 1)$. It follows that $\bar{X} = X \cdot e_1/\sqrt{n}$ is independent of $s_X^2 = (n-1)^{-1} \sum_{i=2}^{n} (X \cdot e_i)^2$, proving (a). Also, $(n-1)s_X^2/\sigma^2 = \sum_{i=2}^{n} (X \cdot e_i)^2/\sigma^2$ has a $\chi^2(n-1)$ distribution, proving (b), Q.E.D.

Here is another way of looking at chi-squared distributions. As noted in the above proof, if $X_1, \ldots, X_d$ are i.i.d. $N(0, 1)$, their joint density is $(2\pi)^{-d/2} \exp(-|x|^2/2)$ on $d$-dimensional space. Let $Y = X_1^2 + \cdots + X_d^2$, so that $Y$ has a $\chi^2(d)$ distribution. We have $P(Y \le t) = 0$ for $t \le 0$. For $t > 0$, $P(Y \le t)$ is the integral of the density over the region where $|x|^2 \le t$, or equivalently $|x| \le \sqrt{t}$. Suppose $d \ge 2$. Using spherical coordinates, the integral becomes

$$A_d (2\pi)^{-d/2} \int_0^{\sqrt{t}} r^{d-1} \exp(-r^2/2)\, dr,$$

where $A_d$ is a constant depending on $d$, the $(d-1)$-dimensional surface area of the unit sphere $|x| = 1$ in $d$-space. By the substitution $x = r^2$, $r = \sqrt{x}$, $dr = dx/(2\sqrt{x})$, the integral becomes

$$A_d (2\pi)^{-d/2} \int_0^{t} x^{(d-2)/2} \exp(-x/2)\, dx/2.$$

Since $(d-2)/2 = (d/2) - 1$, and a probability density has a unique normalizing constant, this gives another proof that the $\chi^2(d)$ distribution is the $\Gamma(d/2, 1/2)$ distribution. Moreover, since we know that the normalizing constant is $(1/2)^{d/2}/\Gamma(d/2)$, we can evaluate $A_d = 2\pi^{d/2}/\Gamma(d/2)$. For example, if $d = 2$, since $\Gamma(1) = 0! = 1$, we get $A_2 = 2\pi$, the circumference of the unit circle as desired. If $d = 3$, then by the recursion formula, $\Gamma(3/2) = \Gamma(1/2)/2 = \sqrt{\pi}/2$, so $A_3 = 4\pi$, which is in fact the area of the unit sphere in 3 dimensions. Also, the volume of the unit ball $\{|x| \le 1\}$ in $d$ dimensions is

$$V_d = A_d \int_0^1 r^{d-1}\, dr = A_d/d = \pi^{d/2}/\Gamma((d/2) + 1),$$

giving $V_2 = \pi$ and $V_3 = 4\pi/3$ as desired.
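The rotation argument in the proof of Theorem 3 and the formula for $A_d$ can be checked numerically. The following is only a minimal sketch in Python using the standard library. The function names (`orthonormal_basis`, `sphere_area`, `ball_volume`, `simulate`), the seeds, and the values of $n$, $\mu$, $\sigma$ are illustrative assumptions and are not part of the handout. The basis built here is one possible choice that extends the $e_2$, $e_3$ given in the proof; any orthonormal basis starting with $e_1$ would do.

```python
import math
import random


def orthonormal_basis(n):
    """Rows e1..en, with e1 = (1/sqrt(n), ..., 1/sqrt(n)).

    For k = 2..n, e_k has k-1 entries 1/sqrt(k(k-1)), then
    -(k-1)/sqrt(k(k-1)), then zeros. This reproduces the e2 and e3
    from the proof; the general pattern is an assumed extension.
    """
    basis = [[1.0 / math.sqrt(n)] * n]
    for k in range(2, n + 1):
        scale = math.sqrt(k * (k - 1))
        basis.append([1.0 / scale] * (k - 1) + [-(k - 1) / scale] + [0.0] * (n - k))
    return basis


def dot(x, y):
    """Usual dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def sphere_area(d):
    """A_d = 2 pi^(d/2) / Gamma(d/2), surface area of the unit sphere in d-space."""
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)


def ball_volume(d):
    """V_d = A_d / d, volume of the unit ball in d dimensions."""
    return sphere_area(d) / d


def simulate(n, mu, sigma, reps, rng):
    """Monte Carlo estimates of E[(n-1)s^2/sigma^2] and corr(Xbar, s^2)."""
    means, variances = [], []
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(x) / n
        means.append(xbar)
        variances.append(sum((xj - xbar) ** 2 for xj in x) / (n - 1))
    m_bar = sum(means) / reps
    v_bar = sum(variances) / reps
    cov = sum((m - m_bar) * (v - v_bar) for m, v in zip(means, variances)) / reps
    sd_m = math.sqrt(sum((m - m_bar) ** 2 for m in means) / reps)
    sd_v = math.sqrt(sum((v - v_bar) ** 2 for v in variances) / reps)
    return (n - 1) * v_bar / sigma ** 2, cov / (sd_m * sd_v)


# Illustrative values only (assumptions, not from the handout).
n, mu, sigma = 5, 1.5, 2.0
rng = random.Random(0)
x = [rng.gauss(mu, sigma) for _ in range(n)]
basis = orthonormal_basis(n)

xbar = sum(x) / n
sum_sq_dev = sum((xj - xbar) ** 2 for xj in x)        # sum_j (X_j - Xbar)^2
rotated_sum = sum(dot(x, e) ** 2 for e in basis[1:])  # sum_{i>=2} (X . e_i)^2
xbar_from_e1 = dot(x, basis[0]) / math.sqrt(n)        # X . e1 / sqrt(n)

mean_q, corr = simulate(n, mu, sigma, 20000, random.Random(1))
print(sum_sq_dev, rotated_sum)
print(xbar, xbar_from_e1)
print(mean_q, corr)
print(sphere_area(2), sphere_area(3), ball_volume(2), ball_volume(3))
```

The first two printed pairs should agree up to rounding, since they are algebraic identities. The simulated mean of $(n-1)s_X^2/\sigma^2$ should be near $n - 1$, the mean of a $\chi^2(n-1)$ variable, and the sample correlation between $\bar{X}$ and $s_X^2$ should be near 0. A correlation near 0 is consistent with independence but does not prove it; the proof above is what establishes (a).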