UNC-Chapel Hill BIOS 662 - Sampling II


SAMPLING II
BIOS 662
Michael G. Hudgens
http://www.bios.unc.edu/~mhudgens
2008-11-17 14:37

Outline
• Stratified sampling
  – Introduction
  – Notation and Estimands
  – Estimators
  – Allocation Strategies
  – Example

Stratified Sampling
• Stratification: the process of dividing a population of units into distinct sub-populations called strata. Strata are formed so that each population unit is assigned to exactly one stratum.
• To draw a sample of US counties, we might stratify by region (NE, SE, NW, SW, ...)
• How is stratification used in sample surveys?
• The population is divided into $H$ strata so that each population unit is a member of only one stratum.
• Let $N_h$ denote the number of population units in stratum $h$ for $h = 1, \dots, H$.
• Thus the total number of units in the population is
$$N = \sum_{h=1}^{H} N_h$$
• Let $n_h$ denote the sample size for stratum $h$, so that the total sample size is
$$n = \sum_{h=1}^{H} n_h$$
• A sample of size $n_h$ is selected by some probability design (e.g., SRS) from each of the $H$ strata, independently across strata
• Stratum-specific parameters (e.g., means, totals) are estimated separately using data from each of the $H$ strata
• An estimate of the population parameter is produced by appropriately combining the $H$ individual stratum estimates
• If SRS is used within each stratum, the design is called stratified random sampling

Notation and Estimands
• Let $y_{hi}$ denote the variable of interest associated with unit $i$ of stratum $h$ ($i = 1, \dots, N_h$; $h = 1, \dots, H$)
• Let $Z_{hi} = 1$ if the corresponding unit is in the sample, 0 otherwise
• Stratum total
$$\tau_h = \sum_{i=1}^{N_h} y_{hi}$$
• Population total
$$\tau = \sum_{h=1}^{H} \tau_h = \sum_{h=1}^{H} \sum_{i=1}^{N_h} y_{hi}$$
• Stratum mean
$$\mu_h = \frac{\tau_h}{N_h} = \frac{\sum_{i=1}^{N_h} y_{hi}}{N_h}$$
• Population mean
$$\mu = \frac{\tau}{N} = \frac{\sum_h \sum_i y_{hi}}{N} = \sum_h W_h \mu_h$$
where $W_h = N_h / N$ is the proportion of population units which belong to stratum $h$

Population Mean Estimator
• Estimator of population mean
$$\bar{y} = \sum_h W_h \bar{y}_h$$
where $\bar{y}_h$ is an estimator of the stratum $h$ mean $\mu_h$
• $E(\bar{y}_h) = \mu_h$ for all $h$ implies $E(\bar{y}) = \mu$
• Estimator of variance of $\bar{y}$
$$\widehat{\mathrm{Var}}(\bar{y}) = \sum_h W_h^2 \, \widehat{\mathrm{Var}}(\bar{y}_h)$$
• $E(\widehat{\mathrm{Var}}(\bar{y}_h)) = \mathrm{Var}(\bar{y}_h)$ for all $h$ implies $E(\widehat{\mathrm{Var}}(\bar{y})) = \mathrm{Var}(\bar{y})$
• If stratified random sampling, then
$$\bar{y}_h = \frac{\sum_i y_{hi} Z_{hi}}{n_h}$$
and
$$\widehat{\mathrm{Var}}(\bar{y}) = \sum_h W_h^2 \, \frac{1 - f_h}{n_h} \, s_h^2$$
where $f_h = n_h / N_h$ is the stratum-specific sampling rate and $s_h^2$ is the within-stratum sample variance
• CIs
$$\bar{y} \pm t_{1-\alpha/2,\, df} \sqrt{\widehat{\mathrm{Var}}(\bar{y})}$$
where
$$df = \frac{\left(\sum_h a_h s_h^2\right)^2}{\sum_h (a_h s_h^2)^2 / (n_h - 1)}$$
and
$$a_h = N_h (N_h - n_h) / n_h$$
• If all $N_h$ are equal and all $n_h$ are equal, then $df = n - H$

Population Total Estimator
• Estimator of population total
$$\hat{\tau} = N \bar{y} = \sum_h N_h \bar{y}_h$$
• $E(\bar{y}_h) = \mu_h$ for all $h$ implies $E(\hat{\tau}) = \tau$
• Estimator of variance
$$\widehat{\mathrm{Var}}(\hat{\tau}) = N^2 \, \widehat{\mathrm{Var}}(\bar{y}) = \sum_h N_h^2 \, \frac{1 - f_h}{n_h} \, s_h^2$$
with the second equality holding for stratified random sampling
• $E(\widehat{\mathrm{Var}}(\bar{y}_h)) = \mathrm{Var}(\bar{y}_h)$ for all $h$ implies $E(\widehat{\mathrm{Var}}(\hat{\tau})) = \mathrm{Var}(\hat{\tau})$
• CIs
$$\hat{\tau} \pm t_{1-\alpha/2,\, df} \sqrt{\widehat{\mathrm{Var}}(\hat{\tau})}$$
where $df$ is as specified above

Population Proportion Estimator
• Estimator of population proportion
$$\hat{p} = \sum_h W_h \hat{p}_h$$
where $\hat{p}_h$ are the stratum-specific estimators; $\hat{p}$ is a special case of $\bar{y}$
• Estimator of variance for stratified random sampling
$$\widehat{\mathrm{Var}}(\hat{p}) = \sum_h W_h^2 \, \frac{1 - f_h}{n_h - 1} \, \hat{p}_h (1 - \hat{p}_h)$$

Stratification Principle
• Variances depend on within-stratum population variance terms only
• Thus estimators will be more precise the smaller
$$\sigma_h^2 = \sum_i (y_{hi} - \mu_h)^2 / (N_h - 1)$$
• I.e., estimation of the population mean or total will be most precise if the population is partitioned into strata in such a way that, within each stratum, the units are as similar as possible
• E.g., in a survey of a plant or animal population, the study area might be stratified into regions of similar habitat or elevation, since we expect abundances to be more similar within strata than between strata

Stratification Principle: Example
• Suppose $N = 6$; $H = 2$; $N_h = 3$ for $h = 1, 2$
• Stratum 1: 0, 1, 2 and Stratum 2: 4, 5, 9
• Population variance $\sigma^2 = 10.7$; strata variances $\sigma_1^2 = 1$, $\sigma_2^2 = 7$
• For SRS with $n = 4$,
$$\mathrm{Var}(\bar{y}) = \left(1 - \frac{n}{N}\right) \frac{\sigma^2}{n} = \left(1 - \frac{4}{6}\right) \frac{10.7}{4} = 0.89$$
• For stratified random sampling with $n_1 = n_2 = 2$,
$$\mathrm{Var}(\bar{y}) = \left(\frac{3}{6}\right)^2 \frac{1 - 2/3}{2} \cdot 1 + \left(\frac{3}{6}\right)^2 \frac{1 - 2/3}{2} \cdot 7 = 0.33$$

Allocation Strategies
• How to choose the sample size $n_h$ for each stratum?
• Four strategies
  – Proportionate: same sampling rates
  – Optimum: most cost efficient
  – Balanced: equal sample sizes
  – Disproportionate: unequal sampling rates (to oversample important domains)

Proportionate stratified sampling
• Same sampling rate $f_h$ for all strata:
$$f_h = \frac{n_h}{N_h} = \frac{n}{N} = f$$
• Equivalently
$$W_h = \frac{N_h}{N} = \frac{n_h}{n} = w_h$$
• The proportion of the sample chosen from any given stratum will be the same as the proportion of the population in that stratum
• Each unit in the population has the same probability of selection. This type of design is called a self-weighting design, since sample estimates of the population mean and proportion are simple arithmetic means
• E.g., the population mean estimator for proportionate stratified random sampling equals
$$\bar{y} = \frac{\sum_{h=1}^{H} \sum_{i=1}^{N_h} y_{hi} Z_{hi}}{n}$$

Stratification Principle
• Claim: the variance of estimators from proportionate stratified random sampling is always less than or equal to the variance of estimators from SRS
• Sketch of proof (see Cochran 1977, pages 99-100):
$$(N - 1)\sigma^2 = \sum_h \sum_i (y_{hi} - \mu)^2 = \sum_h \sum_i (y_{hi} - \mu_h)^2 + \sum_h N_h (\mu_h - \mu)^2 = \sum_h (N_h - 1)\sigma_h^2 + \sum_h N_h (\mu_h - \mu)^2$$
implying
$$\sigma^2 \approx \sum_h W_h \sigma_h^2 + \sum_h W_h (\mu_h - \mu)^2$$
• Thus
$$\mathrm{Var}_{SRS}(\bar{y}) = (1 - f)\sigma^2 / n \approx (1 - f) \sum_h W_h \sigma_h^2 / n + (1 - f) \sum_h W_h (\mu_h - \mu)^2 / n$$
• Under proportionate stratified random sampling
$$\mathrm{Var}_{pro}(\bar{y}) = (1 - f) \sum_h W_h^2 \sigma_h^2 / n_h = (1 - f) \sum_h W_h \sigma_h^2 / n$$
• Therefore
$$\mathrm{Var}_{SRS}(\bar{y}) \approx \mathrm{Var}_{pro}(\bar{y}) + (1 - f) \sum_h W_h (\mu_h - \mu)^2 / n$$
• For this reason, proportionate stratified random sampling is often considered the default
• Gains over SRS will be greatest if strata are internally homogeneous

General Guidelines
• The stratification variable should be highly correlated with the principal characteristic being measured in the survey (e.g., age would be a
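
The estimators and the small N = 6 example from the stratification principle slides can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names (`stratified_mean`, `var_srs`, `var_stratified`) and the illustrative sample are assumptions made here, and SRS within each stratum is assumed throughout.

```python
from statistics import mean, variance


def stratified_mean(samples, stratum_sizes):
    """Estimate the population mean and its variance under stratified
    random sampling (SRS within each stratum, assumed here).

    samples: list of lists, the sampled y values for each stratum
    stratum_sizes: list of population stratum sizes N_h
    """
    N = sum(stratum_sizes)
    estimate, var_hat = 0.0, 0.0
    for y, N_h in zip(samples, stratum_sizes):
        n_h = len(y)
        W_h = N_h / N          # stratum weight W_h = N_h / N
        f_h = n_h / N_h        # stratum sampling rate f_h = n_h / N_h
        estimate += W_h * mean(y)
        # variance() is the sample variance s_h^2 (divisor n_h - 1)
        var_hat += W_h ** 2 * (1 - f_h) / n_h * variance(y)
    return estimate, var_hat


def var_srs(population, n):
    """True variance of the sample mean under SRS of size n."""
    N = len(population)
    return (1 - n / N) * variance(population) / n


def var_stratified(strata, sample_sizes):
    """True variance of the stratified mean, given full strata."""
    N = sum(len(s) for s in strata)
    total = 0.0
    for s, n_h in zip(strata, sample_sizes):
        N_h = len(s)
        total += (N_h / N) ** 2 * (1 - n_h / N_h) / n_h * variance(s)
    return total


# The example population from the slides: N = 6, H = 2, N_h = 3.
strata = [[0, 1, 2], [4, 5, 9]]
population = strata[0] + strata[1]

print(round(var_srs(population, 4), 2))            # 0.89
print(round(var_stratified(strata, [2, 2]), 2))    # 0.33

# A hypothetical sample of two units per stratum (illustration only).
est, var_hat = stratified_mean([[0, 2], [4, 9]], [3, 3])
```

Because `statistics.variance` uses the divisor n - 1, applying it to a complete stratum gives the finite-population variance as defined on the slides, which is why the same call serves for both the population and the sample quantities.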

