UW STAT 220 - Sample Surveys

Sample Surveys
Stat 220 Part I: Design of Experiments and Studies
Lecture 5
Chapter 19: Sample Surveys

Introduction

Researchers often want to know the truth about a large population:
• Who is really leading in the US Presidential race;
• How many people really died in a war or crisis situation, when the local government is unable to provide a reliable estimate (e.g., Iraq, Congo).

Usually, reaching everybody in the population is either too expensive or impossible. Only part of it can be examined. This is where sample surveys can be useful.

Definition
Parameters: Numbers representing “the truth” we want to know about the population.
Sample: The part of the population we actually get to examine (a relatively small part).
Summary statistics, or in short statistics: The numbers computed from the sample, as an estimate for the parameters.

A statistic is what investigators know; a parameter is what they want to know.

Example 1 (imaginary)

Suppose we are interested in the percentage of U.S. households with access to the Internet.

Population: all U.S. households
Parameter: true percentage of U.S. households with Internet access
Sample: the homes of students in today’s lecture (i.e., YOU)
Statistic: percentage of you who have Internet access

Is this a good sample for estimating the population parameter? Why or why not?

The Literary Digest Poll

United States Presidential Election, 1936
F.D. Roosevelt (Democrat, incumbent) vs. A. Landon (Republican)

The magazine Literary Digest had correctly predicted the winner of every election from 1916 to 1932.
Their prediction for 1936 was: Landon wins, and Roosevelt gets 43% of the votes.

Their method:
• Questionnaires were sent out to 10 million people, with names taken from telephone books and club membership lists
• 2.4 million responses (> 5% of the total actual number of votes!)
• 43% of the respondents planned to vote for Roosevelt

Population: all people who actually voted in 1936
Parameter: percentage of votes for Roosevelt
Sample: those 2.4 million people who responded
Statistic: percentage of votes for Roosevelt in the sample

The results of the elections

What went wrong?

Bias Sources

There were two main problems with the Digest sampling methods:
1 Selection bias: The Digest’s 10-million-people list was weighted against the poor, because they had no phone or club membership. The poor voted overwhelmingly for Roosevelt; the rich tended to support Landon. (In previous elections, the poor voted similarly to the rich.)
2 Non-response bias: Only about 1 in 4 households that received a questionnaire returned it. Non-responders may have had different voting patterns from the people who responded.

The lesson: When a selection procedure is biased, taking a large sample does not help. This just repeats the mistake on a larger scale.

1948 Presidential elections

George Gallup correctly predicted the 1936 elections. His representative quota sampling became the standard.

Definition
In quota sampling, the sample is hand-picked by interviewers to resemble the population with respect to some key characteristics (e.g. residence, sex, age, income, ...). Interviewers are free to choose anybody they like as long as they keep certain quotas.

Then came 1948.

                                Predictions
Candidates                      Crossley   Gallup   Roper   Election results
H.S. Truman (D, Incumbent)      45%        44%      38%     50%
T.E. Dewey (R)                  50%        50%      53%     45%

What went wrong with the polls?
• Republicans were wealthier than Democrats.
• Hence, Republicans were more likely to have phones, nicer houses, permanent addresses, better education.
• Within each demographic group, the Republicans (on average) were a bit easier and “more fun” to interview.
• That’s why all samples included too many Republicans, and predicted the Republican candidate to win.

Just like with controlled experiments, when you introduce human choice to sampling, you get bias.

Probability methods

Definition
A probability method has the following two properties:
1 The procedure for selecting the sample incorporates the planned use of chance
2 It leaves no room for personal choice of the investigators/interviewers, regarding the final list of people in the sample

Examples:
1 Simple random sampling
2 (Multistage) cluster sampling
3 Stratified sampling

Simple random sampling

Method for simple random sampling: say we want to conduct a survey of 100 voters in a city with 10,000 eligible voters.
• Write the name of each voter on a ticket
• Put all the tickets in a box
• Draw 100 tickets at random without replacement (there is no point in interviewing the same person twice)
• Survey the people whose names have been drawn from the box

Interviewers may not choose whom they interview, and the procedure is impartial - everybody has the same chance to get into the sample.

Definition
Simple random sampling means drawing at random without replacement.
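
The box-of-tickets procedure for the city of 10,000 eligible voters can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the voter IDs, the helper name simple_random_sample, and the seed are made up, and a list of IDs stands in for the tickets in the box.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so nobody is picked twice.
    return rng.sample(population, n)


# Hypothetical voter list: each ID plays the role of a name on a ticket.
voters = [f"voter-{i:05d}" for i in range(10_000)]

# Draw 100 tickets from the box. The seed is arbitrary; it only makes
# the draw reproducible.
sample = simple_random_sample(voters, 100, seed=220)

print(len(sample))       # 100 people to interview
print(len(set(sample)))  # 100, so no voter appears twice
```

The computer makes the selection, so the interviewer has no say in who ends up on the list.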
Whenever possible, use this technique to draw your sample.

Simple random sampling: limitations

Sometimes simple random sampling is not possible:
• Elections:
  • In the 1930s there was no list of all eligible voters, and there were no computers to draw a random sample from these voters.
  • Moreover, there were many people without phones. Sending interviewers to people all over the US would be very expensive.
• Estimating casualties in Iraq:
  • Investigators wanted to interview a sample of people and ask them about family members who died.
  • There is no accurate list of all people in Iraq.
  • Interviewing people all over the country would involve a lot of travel, and that is dangerous.

Multistage cluster sampling

• Divide population into clusters, for example: regions with similar properties
• Divide each cluster further into sub-clusters
• Within each cluster, randomly select a number of sub-clusters
• In each sub-cluster, interview a random sample of people.
(You can repeat the cluster/sub-cluster hierarchy as many times as needed.)

The final estimate is weighted by the relative population of each cluster.

Advantage of this method: interviewers only have to be stationed in, or travel to, the selected clusters.

Stratified Sampling

Sometimes you have no technical problem drawing a simple random sample, but you know that there are groups in the population with very different properties.
Example: sex, ethnic groups, party affiliation.

The problem? Survey estimates

