Chapter 7: Statistical Inference

Populations and samples
Social scientists use two types of data:
  o Data about the population: data for every possible relevant case
      Ex: the US Census
      It is rare to use data from an entire population
  o Data drawn from a sample: a subset of cases drawn from the underlying population
      Samples aren't always random; some are samples of convenience
      Most analyses are based on samples
We aren't interested in the properties of the sample for their own sake, only in what the sample helps us learn about the underlying population
Statistical inference: using what we know to be true about one thing (the sample) to infer what is likely to be true about something else (the population)
Implications for using sample data
  o Uncertainty

Learning about the population from a sample: the central limit theorem
Ex: we want to know the outcome of a poll, but it doesn't make sense to perform a census
  News organizations survey a sample and generalize to the population
The central limit theorem explains how we can collect a few samples and allow them to represent the general public

The normal distribution ("normal" doesn't mean typical)
The normal distribution is the bell curve
  o Symmetrical about the mean
  o Mode, mean, and median are all equal
  o Predictable area under the curve within specified distances of the mean
      Start from the mean and go one standard deviation either way: about 68% of the area under the curve
      One additional standard deviation in each direction brings the total to about 95%
      A third standard deviation captures about 99% of the area (99.7%, to be more precise)
  o The 68-95-99 rule applies only to the normal distribution

Frequency distribution: the distribution of actual scores in a sample; it represents the frequency of each value of a particular variable
Ex: rolling a six-sided die 600 times; the sample mean is a bit lower than what you would expect (3.47 vs. 3.5)
  o The sampling distribution would be normally shaped
  o The underlying frequency distribution would not be normally shaped

Sampling distribution: the hypothetical distribution of sample means
Hypothetical because scientists don't normally draw more than one sample from a population at one time
Envision an infinite number of random samples and plot the mean of each sample: the sample means would be normally distributed
  o Mean of the sampling distribution = true population mean
Standard error: the standard deviation of the sampling distribution of sample means
  o Equal to the sample standard deviation divided by the square root of the sample size

Confidence intervals
The mean of a sample of 600 might differ a little from the true population mean, either too high or too low
Use the knowledge that the sampling distribution is normally shaped and invoke the 68-95-99 rule to create a confidence interval
  o Choose a degree of confidence; scientists usually choose 95%, which has us start at our sample mean (3.47) and move 2 standard errors in each direction (to be precisely 95% confident, move 1.96 standard errors)
  o It is possible that we are wrong and the population mean lies outside this interval: a 2.5% chance that it is higher than the upper bound and a 2.5% chance that it is lower than the lower bound
  o Want to be 99% confident? Move about 3 standard errors in each direction (2.58 is the more precise figure)
Presidential approval ratings: the best guess for the population mean is the sample mean, plus or minus 2 standard errors
  The plus-or-minus figures we see are typically built on the 95% interval
The central limit theorem applies only to samples that are selected randomly
  o A nonrandom sample of convenience doesn't help much in making connections between the sample and the population under study
The smaller the standard errors, the tighter the resulting confidence intervals; the larger the standard errors, the wider the confidence intervals
  o Since you want a tighter interval, a larger sample size will benefit you
  o A larger sample reduces the size of the standard errors
  o This is easier said than done because of cost and time
  o If your interval is too wide, it becomes uninformative
This is how we use samples to learn something about the underlying population

Chapter 8: Bivariate Hypothesis Testing

Bivariate hypothesis tests and establishing causal relationships
Not used very often at present
Help us answer the question "Are x and y related?"
Bivariate: two variables
Not helpful in answering the question "Is there some confounding variable z that is related to both x and y and makes the observed association between x and y spurious?"

Choosing the right bivariate hypothesis test
Consider the nature of the independent and dependent variables before choosing
  Tabular analysis: when both the independent and dependent variables are categorical
  Difference of means test: when the independent variable is categorical and the dependent variable is continuous
  Binomial probit / binomial logit model: when the independent variable is continuous and the dependent variable is categorical
  Correlation coefficient: when both the independent and dependent variables are continuous

All roads lead to p
p-value: a probability
The common element among hypothesis tests; the bottom line
Its value ranges between 0 and 1

The logic of p-values
Third causal hurdle: is there covariation between x and y?
Each test follows common steps
  o Compare the actual relationship between x and y in the sample data with what we would expect to find if x and y weren't related in the underlying population
The more the observed relationship differs from what we would expect to find if there were no relationship, the more confident we are that x and y are related in the population
p-value: the probability that we would see the observed relationship between the variables in our sample data because of random chance, that is, if there were no relationship between them in the unobserved population
  o The lower the p-value, the greater our confidence that there is a systematic relationship
  o The more data on which the measurement is based, the lower the p-value

Limitations of p-values
A p-value does NOT:
  o Reverse. Ex: p = .001 doesn't mean that there is a .999 chance of something happening
  o Tell us whether a relationship is causal
  o Mean that the relationship between x and y is strong when the p-value is close to 0
  o Mean that a larger sample size makes the relationship stronger; a larger sample does, however, increase our confidence that the observed relationship accurately represents the underlying population
  o Directly reflect the quality of the measurement procedure for our variables
      If you aren't confident in your measurements, you should be less confident in the p-value
p-values are always based on the assumption that you are drawing a perfectly random sample from the underlying population
Truly random sample: the probability of an individual case from our population ending up in our sample (p_i) is assumed to equal P for all of the ...
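
Returning to the Chapter 7 die example, the standard error and confidence interval steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name confidence_interval, the random seed, and the simulated rolls are all made up for this sketch, so the simulated sample mean will not match the 3.47 quoted in the notes. The multipliers 1.96 and 2.576 are the usual normal-curve values for 95% and 99% confidence.

```python
import math
import random
import statistics


def confidence_interval(sample, z=1.96):
    """Return (mean, standard_error, lower, upper) for a sample mean.

    Hypothetical helper for illustration; z is the number of standard
    errors to move in each direction (1.96 for roughly 95% confidence).
    """
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample standard deviation
    se = s / math.sqrt(n)             # standard error = s / sqrt(n)
    return mean, se, mean - z * se, mean + z * se


random.seed(7)  # arbitrary seed, chosen only so the run is repeatable

# Simulated data (an assumption): 600 rolls of a fair six-sided die.
rolls = [random.randint(1, 6) for _ in range(600)]

mean, se, lo95, hi95 = confidence_interval(rolls, z=1.96)
_, _, lo99, hi99 = confidence_interval(rolls, z=2.576)

# A sample ten times larger, to show the standard error shrinking.
more_rolls = [random.randint(1, 6) for _ in range(6000)]
_, se_big, _, _ = confidence_interval(more_rolls)

print(f"sample mean        = {mean:.3f}")
print(f"standard error     = {se:.3f}")
print(f"95% interval       = ({lo95:.3f}, {hi95:.3f})")
print(f"99% interval       = ({lo99:.3f}, {hi99:.3f})")
print(f"standard error at n=6000 = {se_big:.3f}")
```

The 99% interval comes out wider than the 95% interval, and the 6,000-roll sample has a smaller standard error than the 600-roll sample, which is the trade-off between confidence, interval width, and sample size described in the notes.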


FSU POS 3713 - Chapter 7 Statistical Inference
