SMU STAT 6380 - Study Notes

Outline for Class meeting 16 (Chapter 6, Lohr, 4/3/06)
Unequal probability, With Replacement Sampling

I. (Semi-review) One-stage sampling
   We may want to give every unit in the population its own probability of selection.
   The units can be individual observations or psu's.

   A. Example: Library data
      Let $y_i$ = # of inquiries (variable of interest); let $x_i$ = # of circulated items.
      Suppose I sample with replacement, and with probability proportional to $x_i$; i.e., let
      $$\psi_i = \frac{x_i}{\sum_{j=1}^{N} x_j}$$
      at each draw. I make n draws in all, but don't necessarily get n different units.
      (Why do we sample with replacement?)

   B. How do I implement such a design?
      One way is by the cumulative size method:
      Step 1. Compute $x_1,\ x_1 + x_2,\ \ldots,\ x_1 + x_2 + \cdots + x_N = t_x$.
      Step 2. Select a random # between 1 and $t_x$, say $r$.
      Step 3. If $\sum_{j=1}^{i-1} x_j < r \le \sum_{j=1}^{i} x_j$, then select unit i into the sample.
      Step 4. Repeat Steps 2 - 3 until n draws have been made (the cumulative totals from Step 1 need to be computed only once).

II. Estimators for one-stage sampling
   A. Define
      $\pi_i$ = Pr[ith unit is selected in sample]
      and
      $\psi_i$ = Pr[ith unit is selected on each draw].
      The two estimators are
      1. The Horvitz-Thompson Estimator (rarely used)
         $$\hat{t}_{HT} = \sum_{i \in S} \frac{y_i}{\pi_i} = \sum_{i \in S} \frac{y_i}{1 - (1 - \psi_i)^n}$$
         where S is the set of distinct units in the sample (the exponent is 2 when n = 2 draws are made).
      2. Another estimator (usually used)
         $$\hat{t}_\psi = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{\psi_i}$$

   B. Properties of $\hat{t}_\psi$
      1. It is unbiased.
      2. Its variance is
         $$V(\hat{t}_\psi) = \frac{1}{n} \sum_{i=1}^{N} \psi_i \left( \frac{y_i}{\psi_i} - t \right)^2 \qquad (*)$$
      3. Its variance can be estimated by
         $$v(\hat{t}_\psi) = \frac{1}{n} \cdot \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{y_i}{\psi_i} - \hat{t}_\psi \right)^2 \qquad (**)$$

   C. Proving properties of $\hat{t}_\psi$
      When you do sampling with replacement, you are back in the iid world of math stat.
      Define a random variable Z that takes the value $y_i/\psi_i$ with probability $\psi_i$.
      Observe this random variable n times.
      Thinking of the sampling design in this way allows you to use the machinery of iid mathematical statistics.
      1. We know
         $$\mu_z = \sum_{i=1}^{N} \psi_i \frac{y_i}{\psi_i} = \sum_{i=1}^{N} y_i = t$$
      2. We know that the sample mean
         $$\bar{z} = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{\psi_i} = \hat{t}_\psi$$
         is an unbiased estimator of $\mu_z$.
      3. We know that the variance of the sample mean is
         $$V(\bar{z}) = \frac{\sigma_z^2}{n} = \frac{1}{n} E[(Z - \mu_z)^2] = \frac{1}{n} \sum_{i=1}^{N} \psi_i \left( \frac{y_i}{\psi_i} - t \right)^2 = (*)$$
      4. We can estimate the variance of the sample mean by (**):
         $$v(\bar{z}) = \frac{s_z^2}{n} = \frac{1}{n} \cdot \frac{1}{n-1} \sum_{i=1}^{n} (z_i - \bar{z})^2$$

III. What probability of selection should be used?
   A. The variance can be driven to 0 if $\psi_i \propto y_i$ (!), because then every $y_i/\psi_i$ equals t.
      So the goal is to pick probabilities proportional to some characteristic that is as highly correlated as possible with $y_i$.
   B. When sample units are psu's, the probability is chosen proportional to the size of the psu.
      This is called probability-proportional-to-size (pps) sampling. It is also called dollar unit sampling in accounting.

IV. Two-stage sampling
   Sometimes psu's are selected with replacement with unequal probability, but then ssu's within the psu are subsampled.

   A. Estimator
      1. As before, define
         $\psi_i$ = Pr[ith unit (psu) is selected on each draw].
         The most commonly used estimator is $\hat{t}_\psi$. It is similar to the estimator for one-stage designs of the same notation (we've run out of notation apparently!).
         Since we cannot observe the psu total $t_i$, we must estimate it. Any kind of sample design we wish can be used within the psu to estimate this total, but it must be determined in advance and must be the same design each time the psu is sampled. Then the two-stage estimator is
         $$\hat{t}_\psi = \frac{1}{n} \sum_{i=1}^{n} \frac{\hat{t}_i}{\psi_i}$$
         Note that the same psu can be sampled more than once, and unlike the one-stage case, each time it enters the sample its estimated total may be different, because a different sample may be chosen.
         (Note: Lohr writes this estimator differently, using the $Q_i$ notation, where $Q_i$ = # of times psu i is chosen into the sample. But the estimator she writes is the same as ours.)

   B. Properties of $\hat{t}_\psi$
      1. It is unbiased. (You should be able to prove this.)
      2. Its variance is
         $$V(\hat{t}_\psi) = \frac{1}{n} \sum_{i=1}^{N} \psi_i \left( \frac{t_i}{\psi_i} - t \right)^2 + \frac{1}{n} \sum_{i=1}^{N} \frac{V_i}{\psi_i}, \quad \text{where } V_i = \mathrm{Var}(\hat{t}_i \mid i)$$
         This is pretty tedious to prove, but uses the ideas we discussed in class last week.
      3. The good news is that its variance can be estimated by
         $$v(\hat{t}_\psi) = \frac{1}{n} \cdot \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{\hat{t}_i}{\psi_i} - \hat{t}_\psi \right)^2 = \frac{s^2}{n}$$
         where $s^2$ is the same as in the one-stage case. It is just the sample variance of the n values $\hat{t}_i/\psi_i$ computed from the estimated psu totals.

   C. Example
      Telephone surveys are often done using a two-stage, unequal probability with replacement design. See the steps on p. 200 of your text.
      This is called the Waksberg-Mitofsky method. For this design, $\psi_i = M_i/K$, where K is the number of residential numbers in the universe. The cleverness of this design is that it is not necessary to know $\psi_i$ in order to calculate the estimate if the same number of phone #'s are sampled from each
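
The cumulative size method of Section I.B and the one-stage estimator and variance estimate of Section II can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the course or from Lohr: the function names `pps_draws` and `estimate_total`, the made-up data, and the assumption that the sizes are positive integers are all hypothetical.

```python
import bisect
import itertools
import random


def pps_draws(sizes, n, rng):
    """Draw n unit indices (0-based) with replacement, with
    Pr(unit i on a draw) = sizes[i] / sum(sizes).
    Assumes integer sizes, as in the cumulative size method."""
    cum = list(itertools.accumulate(sizes))  # Step 1: x_1, x_1+x_2, ..., t_x
    t_x = cum[-1]
    draws = []
    for _ in range(n):                       # Step 4: repeat n times
        r = rng.randint(1, t_x)              # Step 2: integer in 1..t_x inclusive
        # Step 3: first index i with cum[i] >= r, i.e. cum[i-1] < r <= cum[i]
        draws.append(bisect.bisect_left(cum, r))
    return draws


def estimate_total(y_sampled, psi_sampled):
    """Return (t_hat, v_hat): the mean of y_i/psi_i over the n draws and
    s^2/n, where s^2 is the sample variance of those n ratios."""
    n = len(y_sampled)
    z = [y / p for y, p in zip(y_sampled, psi_sampled)]
    t_hat = sum(z) / n
    v_hat = sum((zi - t_hat) ** 2 for zi in z) / (n * (n - 1))
    return t_hat, v_hat


if __name__ == "__main__":
    # Made-up data for illustration only.
    x = [10, 20, 30, 40]   # size measure (e.g. circulated items)
    y = [3, 5, 9, 11]      # variable of interest (e.g. inquiries)
    psi = [xi / sum(x) for xi in x]
    rng = random.Random(2006)
    draws = pps_draws(x, n=5, rng=rng)
    t_hat, v_hat = estimate_total([y[i] for i in draws], [psi[i] for i in draws])
    print(draws, t_hat, v_hat)
```

A unit drawn more than once simply contributes its ratio more than once. If y were exactly proportional to x, every ratio would equal the population total and the estimated variance would be 0, which is the point of Section III.A.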

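
The two-stage estimator of Section IV can be sketched the same way. The notes allow any within-psu design as long as it is fixed in advance; the sketch below assumes, purely for illustration, a simple random sample of m ssu's without replacement inside each selected psu, with the psu total estimated as (psu size) × (subsample mean). The function name `two_stage_estimate` and the data layout (one list of ssu values per psu) are hypothetical.

```python
import random
import statistics


def two_stage_estimate(psus, psi, n, m, rng):
    """psus: list of lists of ssu values, one list per psu.
    psi:  per-draw selection probability of each psu (sums to 1).
    n:    number of psu draws, with replacement (n >= 2).
    m:    ssu's subsampled each time a psu is drawn.
    Returns (t_hat, v_hat)."""
    ratios = []
    for _ in range(n):
        # Stage 1: pick a psu with probability psi[i] on this draw.
        i = rng.choices(range(len(psus)), weights=psi, k=1)[0]
        # Stage 2: a fresh SRS every time the psu is drawn, so a psu
        # drawn twice can yield two different estimated totals.
        sub = rng.sample(psus[i], min(m, len(psus[i])))
        t_i_hat = len(psus[i]) * statistics.fmean(sub)
        ratios.append(t_i_hat / psi[i])
    t_hat = statistics.fmean(ratios)          # mean of t_i_hat / psi_i
    v_hat = statistics.variance(ratios) / n   # s^2 / n
    return t_hat, v_hat


if __name__ == "__main__":
    # Made-up population of three psu's, for illustration only.
    psus = [[1, 2, 3], [4, 4, 5, 6], [2, 2]]
    sizes = [len(p) for p in psus]
    psi = [s / sum(sizes) for s in sizes]     # pps: proportional to psu size
    print(two_stage_estimate(psus, psi, n=4, m=2, rng=random.Random(16)))
```

The variance estimate uses only the n ratios, so no separate within-psu variance term has to be computed.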