The Normal Distribution

When can we use the simple mean and SD to summarize a list of numbers? One time is when the data are approximately normal. By this, we mean that the distribution of the data looks roughly like the normal curve. First, a short review of the normal curve.

Normal curve. The standard normal curve is

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}.$$

It is typically denoted $\varphi$. It is symmetric about 0 and has inflection points at $+1$ and $-1$. The area under the normal curve from $-\infty$ to $z$ is written $\Phi(z)$:

$$\Phi(z) = \int_{-\infty}^{z} \varphi(x)\,dx.$$

1. Show that the area below $-z$ (take $z > 0$) under the normal curve is the same as the area above $z$ under the curve. That is, show that $\Phi(-z) = 1 - \Phi(z)$.

2. Show that the area between $-z$ and $z$ (for $z > 0$) under the normal curve can be expressed as $\Phi(z) - \Phi(-z)$. This area is approximately .68 when $z = 1$ and .95 when $z = 2$.

3. Show that the area between $-z$ and $z$ (for $z > 0$) under the normal curve can be expressed as $2\Phi(z) - 1$, or equivalently $1 - 2\Phi(-z)$.

The Normal($\mu$, $\sigma^2$) curve has the same shape as the standard normal curve. However, it is symmetric about $\mu$, with points of inflection at $\mu + \sigma$ and $\mu - \sigma$. The equation for this curve is:

$$\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}(x-\mu)^2/\sigma^2}.$$

The area between $a$ and $b$ under the N($\mu$, $\sigma^2$) curve can be found by using the standard normal curve. In particular, the area between $(a-\mu)/\sigma$ and $(b-\mu)/\sigma$ under the standard normal curve, i.e.

$$\Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right),$$

is the same as the area between $a$ and $b$ under the N($\mu$, $\sigma^2$) curve. This means that we can convert to standard units to find areas under normal curves.

4. Show by a change of variables that the area between $a$ and $b$ under the N($\mu$, $\sigma^2$) curve is $\Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)$.

Normal Approximation. We often find that data that occur in nature, such as the heights of adult men, follow the normal distribution. Adolphe Quetelet (1796-1874) was one of the first scientists to recognize this phenomenon, and he went about fitting normal curves to lots of different datasets to prove his point.

On the accompanying page, you will find data to which Quetelet fitted the normal distribution.
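
As a numerical companion to the standard-units identity reviewed above (not a substitute for the proofs asked for in the exercises), the following Python sketch evaluates $\Phi$ through the error function and compares the standard-units formula with a direct numerical integral of the N($\mu$, $\sigma^2$) curve. The function names and the values $\mu = 10$, $\sigma = 3$, $a = 7$, $b = 16$ are illustrative assumptions, not taken from the handout.

```python
import math

def phi(x):
    """Standard normal curve (density) at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Area under the standard normal curve from -infinity to z,
    using the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_area(a, b, mu, sigma):
    """Area between a and b under N(mu, sigma^2), via standard units."""
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)

def normal_area_numeric(a, b, mu, sigma, steps=20_000):
    """Same area by a midpoint-rule integral of the N(mu, sigma^2) curve."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        # The N(mu, sigma^2) curve is the standard curve rescaled by sigma.
        total += phi((x - mu) / sigma) / sigma
    return total * h

print(round(Phi(1) - Phi(-1), 4))  # 0.6827
print(round(Phi(2) - Phi(-2), 4))  # 0.9545

# Hypothetical values chosen only for illustration.
mu, sigma, a, b = 10.0, 3.0, 7.0, 16.0
print(normal_area(a, b, mu, sigma), normal_area_numeric(a, b, mu, sigma))
```

The two printed areas in the last line should agree to several decimal places, which is what the conversion to standard units predicts.
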
The table gives the chest circumferences of Scottish soldiers (in inches), taken from Stigler's History of Statistics. The mean chest size is 40.5 inches and the SD is 2.0 inches.

One way to fit the normal distribution involves the 68-95-99.7% rule. That is, about 68% of the data should be within one SD of the mean, 95% within 2 SDs, and 99.7% within 3 SDs, if the data are roughly normal. Check the 68-95-99.7% rule for the Scottish soldiers' chest sizes.

Normal Quantiles. Another, better way to check the appropriateness of the normal approximation is to compare the quantiles of the data to those of the normal distribution. Consider our data, $x_1, \ldots, x_n$. Let $x_q$ represent the $q$th sample quantile. That is, at least $nq$ of the $x_i$ are less than or equal to $x_q$, and at least $n(1-q)$ are greater than or equal to $x_q$.

The sample cumulative distribution function, also called the empirical cumulative distribution function, is tied to the quantiles. It is the function $F_n$, where $F_n(x)$ is the fraction of the data less than or equal to $x$, so that

$$F_n(x_q) = q.$$

The inverse of this function gives us the quantiles,

$$F_n^{-1}(q) = x_q.$$

Note that $\Phi$ is the cumulative distribution function for the standard normal, and the $z_q$ are the standard normal quantiles. Fill in the table with the standard normal quantiles:

    z_q |  -2   -1    0    1    2
    q   |

If we plot the pairs of points $(z_q, x_q)$, i.e. plot the standard normal quantiles against the sample quantiles, then if the data are approximately normal, the points should fall on a line. Why? Because $F_n \approx \Phi$, so

$$F_n^{-1}(q) \approx \Phi^{-1}(q), \qquad \text{that is,} \qquad x_q \approx z_q.$$

What if $F_n$ is approximately normal, but not standard normal?

What if $F_n$ isn't normal? Take a look at the pictures in your
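
The two checks described above (counting the data within 1, 2, and 3 SDs of the mean, and pairing sample quantiles with standard normal quantiles) can be sketched in Python as follows. Quetelet's table is not reproduced in this text, so the sketch runs on synthetic numbers simulated with mean 40.5 and SD 2.0 as a stand-in; the helper names, the seed, the sample size, and the choice $q = (i - 0.5)/n$ for the $i$th smallest value are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, pstdev

def fraction_within(data, k):
    """Fraction of the data lying within k SDs of the mean."""
    m, s = mean(data), pstdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)

def qq_pairs(data):
    """Pairs (z_q, x_q): standard normal quantile against sample quantile.
    Assumes q = (i - 0.5) / n for the i-th smallest value (one common convention)."""
    xs = sorted(data)
    n = len(xs)
    std = NormalDist()  # standard normal: mean 0, SD 1
    return [(std.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

def fit_line(pairs):
    """Least-squares intercept and slope of x_q regressed on z_q."""
    zs = [z for z, _ in pairs]
    xs = [x for _, x in pairs]
    zbar, xbar = mean(zs), mean(xs)
    slope = (sum((z - zbar) * (x - xbar) for z, x in pairs)
             / sum((z - zbar) ** 2 for z in zs))
    return xbar - slope * zbar, slope

# Synthetic stand-in data, NOT Quetelet's chest measurements.
rng = random.Random(0)
sample = [rng.gauss(40.5, 2.0) for _ in range(5000)]

for k in (1, 2, 3):
    print(k, "SD:", round(fraction_within(sample, k), 3))

intercept, slope = fit_line(qq_pairs(sample))
print("intercept:", round(intercept, 2), "slope:", round(slope, 2))
```

With real data, the list `sample` would be replaced by the observed values, and the pairs from `qq_pairs` could be plotted directly; the fitted intercept and slope can then be compared with the mean and SD of the data.
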