SAMPLING
BIOS 662
Michael G. Hudgens, [email protected]
www.bios.unc.edu/~mhudgens
2007-11-14 16:47
BIOS 662 1 Sampling

Outline
• Preliminaries
• Simple random sampling
  – Population mean
  – Population total
  – Sample size
  – Proportion

Preliminaries: What is “Sampling”?
• Sampling study: selecting some part of a population to be observed so that one may estimate something about the whole of the population
• E.g., to estimate the amount of lichen in a well-defined area, a biologist collects lichen from selected small plots within the study area
• Typically want to estimate a total or a mean
• Observational: does not intentionally perturb or disturb the population (i.e., not experimental)
• However, one does have control over how the sample is selected

Preliminaries: Terminology
• Population: The group of units (e.g., people) we are sampling and studying. Assumed to be of known, finite size.
• Sampling Design: The strategy followed in selecting a sample from a population
• Sampling Unit: Unit designated for listing and selection in a sample survey (e.g., persons, dwellings, households, area units, pharmacies)
• Sampling Frame: List of sampling units from which a sample is drawn
• Variable: Some measurement taken on members of the sample (e.g., number of children ever born to a woman aged 15-49 years); sometimes called the y-variable or x-variable
• Selection probability: Likelihood, over repeated applications of the sampling design, that a particular unit would be chosen
• Probability Sampling
  – Sampling in which the design calls for using random methods to ultimately decide which units are chosen
  – Every unit has a known, nonzero selection probability
• Equal-Probability Sampling
  – Probability sampling in which all units in the population have the same selection probability
  – AKA “self-weighted” sampling or “epsem” (equal probability of selection method) sampling
• Non-probability Sampling
  – Sampling in which subjective judgment (usually by interviewers) is used to ultimately decide who is chosen in the sample
  – Selection probabilities cannot be determined
  – Difficult to determine if the sample is representative (i.e., includes members from all relevant segments of the population)
• Unbiased estimator: An estimator which, if repeated over all possible samples that might be selected using a sampling design, would yield estimates which on average equal the parameter being estimated (e.g., the sample mean from a simple random sample is an unbiased estimator of the population mean)
• AKA design-unbiased
• Key idea: the randomness in the estimator is induced by the sampling design

Preliminaries: Software
• SAS: Proc Surveymeans, Surveyfreq, ...
• R: “survey” package

Preliminaries: Sampling Designs
• Simple Random Sampling
• Stratified Sampling
• Cluster Sampling

Simple Random Sampling (SRS)
• Let $N$ denote the number of units in the population
• Simple random sampling, or random sampling without replacement (SRSWOR), is the sampling design in which $n$ distinct units are selected from the $N$ units in the population in such a way that every possible combination of the $n$ units is equally likely to be the sample selected
• An SRS sample can be obtained through a sequence of independent selections from the whole population where each unit has an equal probability of selection at each step, discarding repeat selections and continuing until $n$ distinct units are obtained
• $f \equiv n/N$ is the sampling rate

Obtaining an SRS sample
• A. Number the units in the population (i.e., sampling frame) from 1 to $N$.
• B. Select and record a random number between 1 and $N$.
• C. Select a second random number between 1 and $N$. If this number is the same as the first selected number, discard it. Otherwise, record it.
• D. Select another random number between 1 and $N$. If this number is the same as a previously selected number, discard it. Otherwise, record it.
• E. Continue in this manner until $n$ different numbers between 1 and $N$ have been chosen.
• F. Population units corresponding to the selected numbers are an SRS sample of size $n$.

Key properties of SRS
• All possible SRS samples have the same chance of being selected.
• The probability that any one population unit will be chosen is $n/N$.
• Selections of individual units in an SRS are not statistically independent.

SRS: Estimating population mean
• Denote the (finite) population mean by
$$\mu = \frac{1}{N} \sum_{i=1}^{N} y_i$$
• Denote the (finite) population variance by
$$\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \mu)^2$$
• Let $Z_i$ indicate whether unit $i$ is in the sample, with $Z_i = 1$ if sampled and $Z_i = 0$ otherwise
• Key: the $y_i$'s are fixed, the $Z_i$'s are random
• Sample mean
$$\bar{y} = \frac{1}{n} \sum_{i=1}^{N} y_i Z_i$$
• Sample variance
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{N} (y_i - \bar{y})^2 Z_i$$
• The sample mean is unbiased: each $Z_i$ is Bernoulli with $E(Z_i) = n/N$, thus
$$E(\bar{y}) = \frac{1}{n} \sum_{i=1}^{N} y_i E(Z_i) = \frac{1}{N} \sum_{i=1}^{N} y_i = \mu$$
• To derive the variance of the sample mean,
$$\mathrm{Var}(\bar{y}) = \frac{1}{n^2} \left( \sum_{i=1}^{N} y_i^2 \, \mathrm{Var}(Z_i) + \sum_{i \neq j} y_i y_j \, \mathrm{Cov}(Z_i, Z_j) \right)$$
we need the variance and covariance terms
• The variance is easy since $Z_i$ is Bernoulli:
$$\mathrm{Var}(Z_i) = \frac{n}{N} \left( 1 - \frac{n}{N} \right)$$
• For SRS, the $Z_i$'s are not independent:
$$\mathrm{Cov}(Z_i, Z_j) = E(Z_i Z_j) - E(Z_i) E(Z_j) = \frac{n}{N} \cdot \frac{n-1}{N-1} - \left( \frac{n}{N} \right)^2 = -\frac{n}{N} \left( 1 - \frac{n}{N} \right) \frac{1}{N-1}$$
• Thus
$$\mathrm{Var}(\bar{y}) = \frac{1}{n^2} \cdot \frac{n}{N} \left( 1 - \frac{n}{N} \right) \left( \sum_{i=1}^{N} y_i^2 - \frac{1}{N-1} \sum_{i \neq j} y_i y_j \right)$$
• Using the identity
$$\sum_{i=1}^{N} (y_i - \mu)^2 = \sum_{i=1}^{N} y_i^2 - \frac{\left( \sum_{i=1}^{N} y_i \right)^2}{N} = \frac{1}{N} \left( (N-1) \sum_{i=1}^{N} y_i^2 - \sum_{i \neq j} y_i y_j \right)$$
we get
$$\mathrm{Var}(\bar{y}) = \frac{1}{n} \left( 1 - \frac{n}{N} \right) \frac{\sum_{i=1}^{N} (y_i - \mu)^2}{N-1} = \left( 1 - \frac{n}{N} \right) \frac{\sigma^2}{n}$$
• The quantity
$$1 - \frac{n}{N} = \frac{N-n}{N}$$
is called the finite population correction factor
• If the population is large relative to the sample size, $n/N$ will be small, such that
$$\mathrm{Var}(\bar{y}) \approx \frac{\sigma^2}{n}$$
• On the other hand, $\mathrm{Var}(\bar{y}) \to 0$ as $n \to N$

SRS: Estimating population variance
• Homework problem? Show $E(s^2) = \sigma^2$, i.e., the sample variance is an unbiased estimator of the finite population variance
• From this fact, it follows that an unbiased estimator of $\mathrm{Var}(\bar{y})$ is given by
$$\widehat{\mathrm{Var}}(\bar{y}) = \left( 1 - \frac{n}{N} \right) \frac{s^2}{n}$$

SRS: Estimating population total
• Define the population total
$$\tau = \sum_{i=1}^{N} y_i = N\mu$$
• Unbiased estimator
$$\hat{\tau} = N \bar{y}$$
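
The SRS selection steps A-F and the estimators of the mean, its variance, and the total can be sketched in Python as follows. This is only an illustrative sketch, not code from the course: the function names (draw_srs, srs_estimates, check_by_enumeration) and the five population values are hypothetical. The last function enumerates every possible sample of a tiny population to check numerically that the sample mean is design-unbiased and that its variance matches the finite population correction formula. Exact fractions are used so the comparisons do not depend on rounding.

```python
from fractions import Fraction
from itertools import combinations
import random


def draw_srs(N, n, rng):
    """Steps A-F: draw labels in 1..N, discard repeats, stop at n distinct labels."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    chosen = []
    while len(chosen) < n:
        k = rng.randint(1, N)      # each label is equally likely at every step
        if k not in chosen:        # a repeat selection is discarded
            chosen.append(k)
    return chosen


def mean(ys):
    """Arithmetic mean as an exact fraction."""
    return Fraction(sum(ys)) / len(ys)


def variance(ys):
    """Variance with divisor len(ys) - 1, as in both sigma^2 and s^2 above."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys) / (len(ys) - 1)


def srs_estimates(sample_y, N):
    """Estimates from an SRS of size n >= 2 drawn from a population of size N."""
    n = len(sample_y)
    ybar = mean(sample_y)
    fpc = 1 - Fraction(n, N)                      # finite population correction
    return {
        "mean": ybar,                             # estimates mu
        "var_mean": fpc * variance(sample_y) / n, # estimates Var(ybar)
        "total": N * ybar,                        # estimates tau
    }


def check_by_enumeration(population, n):
    """Average ybar and s^2 over all equally likely samples of size n."""
    N = len(population)
    means = [mean(s) for s in combinations(population, n)]
    s2s = [variance(s) for s in combinations(population, n)]
    e_ybar = mean(means)
    var_ybar = sum((m - e_ybar) ** 2 for m in means) / len(means)
    return {
        "mu": mean(population),
        "E_ybar": e_ybar,
        "Var_ybar": var_ybar,
        "formula": (1 - Fraction(n, N)) * variance(population) / n,
        "sigma2": variance(population),
        "E_s2": mean(s2s),
    }


if __name__ == "__main__":
    y = [3, 7, 8, 12, 15]          # hypothetical population values, N = 5
    labels = draw_srs(len(y), 2, random.Random(662))
    print(labels, srs_estimates([y[k - 1] for k in labels], N=len(y)))
    print(check_by_enumeration(y, 2))
```

In practice, Python's random.sample(range(1, N + 1), n) draws without replacement directly; the explicit loop is kept here only because it mirrors steps B through E. The enumeration check is feasible only for very small N and n, since the number of possible samples grows combinatorially.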



UNC-Chapel Hill BIOS 662 - Sampling
