POS3713 Political Science Research Midterm 3

Five-Step Model for Hypothesis Testing

1. Make assumptions and determine if they are met. Typically we assume random sampling and the central limit theorem.
2. State the null hypothesis. Typically: there is no difference (the exception is when we have a directional research hypothesis).
3. Determine the sampling distribution and critical region.
4. Calculate the test statistic. The sampling distribution and the test statistic will depend on the particular hypothesis you are testing.
5. Interpret the results. Do we reject the null hypothesis? Is there a statistically significant relationship? Compare your test statistic to the critical value from the sampling distribution. Remember, you need to calculate the degrees of freedom to determine the critical value. If the test statistic is larger than the critical value, reject the null.

Samples vs. Population

Sample (known data) to population (unknown data). The population is all occurrences of your phenomenon of interest. The sample is a subset within the population of interest, hopefully chosen randomly. We use the sample to infer things about the population. We take samples because of feasibility, cost, and practicality, and because, if done well, sample traits (statistics) will accurately reflect population traits (parameters).

Statistical Inference

The bridge from what we know about the sample to what we believe, probabilistically, to be true about the broader population. We have a sample (and a sample mean), and we make inferences about the population based on that sample. We use what we know to be true about one thing (the sample) to infer what is likely to be true about another thing (the population). We use data gathered about the sample (a subset of cases drawn from an underlying population) to infer characteristics of the population (data for every possible relevant case).

Simple Random Sample

The ideal standard for a sample: every case in your population of interest has an equal chance of being selected as part of the sample. On
average, parameters from a simple random sample will reflect the population parameters, because every case has an equal chance of appearing in the sample.

Sampling Distribution

The result of taking a random sample an infinite number of times. It is a hypothetical distribution of sample means, because scientists almost never actually draw more than one sample from an underlying population at a given point in time. If we took those sample means and plotted them, the key outcome is that the sampling distribution would be normally shaped even when the underlying frequency distribution is clearly not normally shaped. This sets up the Central Limit Theorem. In other words, it is the distribution of a calculated statistic taken from repeated samples within a population, an infinite number of times.

Central Limit Theorem

For any trait or variable, even those that are not normally distributed in the population: if repeated random samples of size N are drawn from any population with mean $\mu$ and standard deviation $\sigma$, then as N becomes large, the sampling distribution of sample means will approach normality, with mean $\mu$ and standard deviation $\sigma / \sqrt{N}$. For "N becomes large," a rule of thumb is an N of roughly 100 or more. The central limit theorem says that the sampling distribution will be normally distributed. In other words:

1. The sampling distribution is normally distributed.
2. The sample mean $\bar{X}$ is a good estimate of the true mean.
3. The standard deviation of the sampling distribution gets smaller as the sample size gets bigger.

Standard Error of the Mean

Our measure of uncertainty: it is the standard deviation of our sampling distribution. It tells us the uncertainty in our estimate of the mean without pulling hundreds of repeated samples. The standard error of the mean equals the standard deviation of the sample over the square root of the sample size: $SE = s / \sqrt{N}$. The formula has the sample size in the denominator, and as the denominator increases the fraction becomes smaller. This means the standard error decreases as the sample size increases.
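To make the standard error and the central limit theorem concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than course material: the function names, the choice of an exponential population (mean 1, standard deviation 1, clearly not normal), and the number of repeated draws. It simulates the "repeated samples" that the sampling distribution describes and compares the spread of the sample means with $\sigma / \sqrt{N}$.

```python
import math
import random
import statistics


def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def simulate_sample_means(n, draws=2000, seed=0):
    """Return the means of `draws` repeated random samples of size n.

    The population is exponential with mean 1 and standard deviation 1
    (a hypothetical, skewed population chosen only for illustration).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(draws):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


if __name__ == "__main__":
    for n in (25, 100, 400):
        means = simulate_sample_means(n)
        observed_spread = statistics.stdev(means)   # spread of the sample means
        predicted_spread = 1.0 / math.sqrt(n)       # sigma / sqrt(N), with sigma = 1
        print(n, round(observed_spread, 3), round(predicted_spread, 3))
```

The observed spread of the sample means should track the predicted value closely and shrink as N grows, which is the same pattern the standard error formula describes for a single sample.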
A smaller standard error means greater confidence in our estimate.

95% Confidence Interval of the Mean

95% of the sampling distribution lies within plus or minus 1.96 standard deviations of the sampling distribution (that is, 1.96 standard errors) of its mean; this assumes a large sample size. To calculate the confidence interval of the mean, we need to know the mean, the standard deviation, and the sample size (N or n), which together give the standard error:

$C.I. = \bar{X} \pm 1.96 \times SE$, where $SE = s / \sqrt{N}$

The 95% confidence interval tells us the range of values in which the population mean is likely to fall: essentially, the lowest and highest values the mean can plausibly take. The 1.96 in the equation comes from the critical values: the critical value for .05 is 1.96, and the critical value for .10 is 1.64 (this is assuming a large sample size).

The Different Distributions

1. Standard Normal Distribution. Bell-shaped and symmetric; the mean, median, and mode are all the same; predictable area under the curve. 68-95-99 Rule: in a normal distribution, 68% of cases fall within one standard deviation of the mean, 95% of cases fall within two standard deviations of the mean, and 99% of cases (more precisely, 99.7%) fall within three standard deviations of the mean.
2. t Distribution. Like a normal distribution, except that it corrects for small sample sizes. It comes from quality-control tests for beer. Assuming a large sample size, the t critical values match the normal ones: at the .05 and .10 levels, the critical values are 1.96 and 1.64.
   Critical values: the threshold your test statistic (for example, your $\chi^2$) needs to pass for you to conclude that there is a relationship and that it was not caused by chance.
3. $\chi^2$ Distribution. Unlike the normal and t distributions, it is not symmetric (it is right-skewed). It goes along with tabular analysis (the table is given to you on the test, if needed).

The Research Hypothesis (H1)

The research hypothesis is the difference we expect to see: (a) the difference we expect to see, or (b) the difference between our estimate and the hypothesized parameter. It is the relationship we expect to see between two variables. Up to now, all the research hypotheses
we've done have been that two variables are related.

The Null Hypothesis (H0)

The null hypothesis: any difference observed is just the result of random chance. Our goal is to reject the null hypothesis; that is, we want to eliminate the possibility that any difference we see is just the result of chance. Up to now, all the null hypotheses we've done have been that the two variables are not related.

P-values

P is a probability: how likely a difference this large would be if there were really no relationship (that is, if the null hypothesis were true). It ranges between 0 and 1. Lower p-values mean more confidence that the relationship is real; we want less than or
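As a rough illustration of the confidence-interval formula and of comparing a test statistic with a critical value, here is a minimal Python sketch. The function names, the made-up sample values, and the form of the test statistic (sample mean minus hypothesized mean, divided by the standard error) are assumptions for illustration, not taken from the course materials. The 1.96 cutoff assumes a large sample and the .05 level.

```python
import math
import statistics


def standard_error(sample):
    """Sample standard deviation divided by the square root of N."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def confidence_interval_95(sample):
    """Sample mean plus or minus 1.96 standard errors (large-sample case)."""
    mean = statistics.fmean(sample)
    margin = 1.96 * standard_error(sample)
    return mean - margin, mean + margin


def reject_null(sample, hypothesized_mean, critical_value=1.96):
    """Compare the test statistic with the critical value.

    Returns (decision, z): decision is True when |z| exceeds the
    critical value, meaning we reject the null hypothesis.
    """
    z = (statistics.fmean(sample) - hypothesized_mean) / standard_error(sample)
    return abs(z) > critical_value, z


if __name__ == "__main__":
    # Hypothetical data: 121 scores spread evenly between 45 and 55.
    scores = [50 + (i % 11) - 5 for i in range(121)]
    print(confidence_interval_95(scores))
    print(reject_null(scores, hypothesized_mean=52))
```

With a hypothesized mean far outside the interval, the test statistic passes the critical value and the null is rejected; with a hypothesized mean inside the interval, it is not.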