Sample proportion: $\hat{p} = \dfrac{\text{number of successes in the sample}}{n}$

The mean value of $\hat{p}$ is denoted by $\mu_{\hat{p}}$, and the standard deviation of $\hat{p}$ is denoted by $\sigma_{\hat{p}}$.

Rule 1: $\mu_{\hat{p}} = p$. This means that the $\hat{p}$ values from many different random samples will tend to cluster around the actual value of the population proportion.
Rule 2: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$
Rule 3: When n is large and p is not too near 0 or 1, the sampling distribution of $\hat{p}$ is approximately normal.

Central Limit Theorem: When n is sufficiently large, the sampling distribution of $\bar{x}$ is well approximated by a normal curve, even when the population distribution is not normal. The Central Limit Theorem can safely be applied if n > 30.

A confidence interval estimate specifies a range of plausible values for a population characteristic.
The confidence level associated with a confidence interval is the success rate of the method used to construct the interval.
A statistic that is unbiased and has a small standard error is likely to result in an estimate that is close to the actual value of the population characteristic.

Margin of error: The margin of error of a statistic is the maximum likely estimation error. It is unusual for an estimate to differ from the actual value of the population characteristic by more than the margin of error.

Margin of error (M): $M = z_{\text{crit}} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$, where $z_{\text{crit}} = 1.96$ for 95% confidence. Solving for n: $n = p(1-p)\left(\dfrac{z_{\text{crit}}}{M}\right)^2$

If the sample size is larger than 10% of the population size, M is adjusted by the finite population correction factor $\sqrt{\dfrac{N-n}{N-1}}$, where N is the population size. Since this correction factor is always less than 1, the adjusted margin of error will be smaller.

Confidence interval for a population proportion: $\hat{p} \pm z_{\text{crit}} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$, that is, $\hat{p} \pm$ margin of error.

Interpretation of Confidence Interval: You can be 95% confident that the actual value of the population proportion is included in the computed interval.
Interpretation of 95% Confidence Level: A method has been used to produce the confidence interval that is successful in capturing the actual population proportion approximately 95% of the time.

An alternative to the large-sample z interval: replace $\hat{p}$ with $\hat{p}_{\text{mod}} = \dfrac{\text{number of successes} + 2}{n + 4}$

Hypotheses are always statements about population characteristics and never about sample statistics.
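The large-sample z interval and the sample-size formula for a proportion can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names and the counts (120 successes out of 400) are hypothetical and not part of the original notes.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence=0.95):
    """Two-sided z critical value (about 1.96 for 95% confidence)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def proportion_interval(successes, n, confidence=0.95):
    """Large-sample z interval: p_hat +/- z_crit * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size(margin, p_guess=0.5, confidence=0.95):
    """Smallest n with margin of error at most `margin`.

    p_guess = 0.5 is the conservative choice because p(1 - p) is largest there.
    """
    z = z_critical(confidence)
    return ceil(p_guess * (1 - p_guess) * (z / margin) ** 2)


# Hypothetical data: 120 successes in a random sample of 400.
low, high = proportion_interval(successes=120, n=400)
print(f"95% interval: ({low:.3f}, {high:.3f})")

# Sample size needed for a margin of error of 0.03 at 95% confidence.
print("required n:", sample_size(margin=0.03))
```

Rounding the sample size up rather than to the nearest integer keeps the margin of error at or below the target.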
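Never state a null or alternative hypothesis using sample statistics.

A hypothesis test uses sample data to choose between two competing hypotheses about a population characteristic. If the null hypothesis is not rejected, the conclusion is "fail to reject H0" rather than "accept H0", because you don't want to imply that you have evidence that the null hypothesis is true.

P-value: specifies how likely it is that a sample would be as or more extreme than the one observed if H0 were true.
Test statistic: Knowing the value of the test statistic allows calculation of the corresponding P-value.

Test statistic for a proportion: $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$, where $p_0$ is the hypothesized value.

| | H0 is true | H0 is false |
|---|---|---|
| Reject H0 | Type I error (α) | Power (1 − β) |
| Fail to reject H0 | Correct decision | Type II error (β) |

As α increases, power increases and β decreases.
As n increases, power increases and β decreases.
As the actual value of μ moves farther from the hypothesized value, power increases and β decreases.
The power of a test is the probability of rejecting the null hypothesis; when H0 is false, power = 1 − β.

Upper-tailed test: Ha: p > hypothesized value
Lower-tailed test: Ha: p < hypothesized value
Two-tailed test: Ha: p ≠ hypothesized value. If z is positive, P-value = 2 (area to the right of z); if z is negative, P-value = 2 (area to the left of z).

Difference between two population proportions
Rule 1: $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$
Rule 2: $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$
Confidence interval for the difference in population proportions: $(\hat{p}_1 - \hat{p}_2) \pm z_{\text{crit}} \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Test statistic: $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_c(1-\hat{p}_c)}{n_1} + \dfrac{\hat{p}_c(1-\hat{p}_c)}{n_2}}}$
If H0: $p_1 - p_2 = 0$ is true, a combined estimate of the common population proportion is $\hat{p}_c = \dfrac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$

The result of a hypothesis test can never show strong support for the null hypothesis. In two-sample situations, you can't be convinced that there is no difference between two population proportions based on the outcome of a hypothesis test.

Difference between two population means
Rule 1: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
Rule 2: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
Test statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - \text{hypothesized value}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$, $df = \dfrac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$, $V_1 = \dfrac{s_1^2}{n_1}$, $V_2 = \dfrac{s_2^2}{n_2}$

If the population variances are equal, the pooled t test has a slightly better chance of detecting departures from the null hypothesis than does the two-sample t test of this section.

Paired-samples confidence interval for a mean difference: $\bar{x}_d \pm t^* \left(\dfrac{s_d}{\sqrt{n}}\right)$

Random assignment to treatments is critical.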
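The z test statistics for proportions, the tail-specific P-values, and the two-sample degrees-of-freedom formula can be sketched as follows. This is a minimal standard-library illustration; the function names and the example counts are hypothetical, and the upper- and lower-tailed P-values use the usual "area to the right" and "area to the left" of z.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal curve used for all z P-values


def p_value(z, tail):
    """P-value for a z statistic; tail is 'upper', 'lower', or 'two'."""
    if tail == "upper":
        return 1 - STD_NORMAL.cdf(z)   # area to the right of z
    if tail == "lower":
        return STD_NORMAL.cdf(z)       # area to the left of z
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


def one_proportion_z(successes, n, p0):
    """z = (p_hat - p0) / sqrt(p0 (1 - p0) / n)."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def two_proportion_z(x1, n1, x2, n2):
    """z for H0: p1 - p2 = 0, using the combined estimate p_c."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # same as (n1*p1 + n2*p2) / (n1 + n2)
    return (p1 - p2) / sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)


def two_sample_df(s1, n1, s2, n2):
    """df = (V1 + V2)^2 / (V1^2/(n1-1) + V2^2/(n2-1)), with Vi = si^2/ni."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5.
z = one_proportion_z(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, two-tailed P-value = {p_value(z, 'two'):.4f}")
```

The computed df is usually not a whole number; a common convention is to round it down before looking up a t critical value.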
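| Question type | Study type | Categorical or numerical | Number of samples or treatments | Method |
|---|---|---|---|---|
| Estimation | Sample | Categorical variable | 1 | One-Sample z Confidence Interval for a Proportion |
| Hypothesis | Sample | Categorical variable | 1 | One-Sample z Test for a Proportion |
| Estimation | Sample | Categorical variable | 2 | Two-Sample z Confidence Interval for a Difference in Proportions |
| Hypothesis | Sample | Categorical variable | 2 | Two-Sample z Test for a Difference in Proportions |
| Estimation | Sample | Numerical variable | 1 | One-Sample t Confidence Interval for a Mean |
| Hypothesis | Sample | Numerical variable | 1 | One-Sample t Test for a Mean |
| Estimation | Sample | Numerical variable | 2 | Two-Sample t Confidence Interval for a Difference in Means |
| Hypothesis | Sample | Numerical variable | 2 | Two-Sample t Test for a Difference in Means |
| Hypothesis | Sample | Numerical variable | More than 2 | ANOVA F test |
| Estimation | Sample | Numerical variable | More than 2 | Multiple comparisons |

Probability wording
At least x: $P(X \ge x) = 1 - P(X < x)$
More than x: $P(X > x) = 1 - P(X \le x)$
Less than x: $P(X < x) = 1 - P(X \ge x)$
At most x: $P(X \le x) = 1 - P(X > x)$

ANOVA
$SS_{\text{group}} = \sum_i n_i(\bar{x}_i - \bar{x})^2$, df = a − 1
$SS_{\text{total}} = \sum_i \sum_j (x_{ij} - \bar{x})^2$, df = N − 1
$SS_{\text{total}} = SS_{\text{model}} + SS_{\text{error}}$, where the model sum of squares is $SS_{\text{group}}$
$MS_{\text{group}} = \dfrac{SS_{\text{group}}}{a - 1}$, $MS_{\text{error}} = \dfrac{SS_{\text{error}}}{N - a}$
$F_{\text{obs}} = \dfrac{MS_{\text{group}}}{MS_{\text{error}}}$, p-value $= P(F(a-1, N-a) > F_{\text{obs}})$. If $F_{\text{obs}} > F_{\text{crit}}$, reject H0.
Model 1 (grand mean model): $x_{ij} = \mu + \varepsilon_{ij}$
Model 2 (group mean model): $x_{ij} = \mu_i + \varepsilon_{ij}$

Two-way ANOVA
$SS_{\text{total}} = \sum (y_i - \bar{y})^2$, df = n − 1
$SS_{\text{model}} = \sum (\hat{y}_i - \bar{y})^2$, df = 1
$SS_{\text{error}} = \sum (y_i - \hat{y}_i)^2$, df = n − 2
$SS_{\text{total}} = SS_{\text{model}} + SS_{\text{error}}$
$MS_{\text{model}} = \dfrac{SS_{\text{model}}}{1}$, $MS_{\text{error}} = \dfrac{SS_{\text{error}}}{n - 2}$
$F_{\text{obs}} = \dfrac{MS_{\text{model}}}{MS_{\text{error}}}$, p-value $= P(F(1, n-2) > F_{\text{obs}})$. If $F_{\text{obs}} > F_{\text{crit}}$, reject H0.
Model 1 (overall mean): $y_i = \mu + \varepsilon_i$
Model 2 (slope model): $y_i = \alpha + \beta x_i + \varepsilon_i$
Model 3 (treatment): $y_{ij} = \mu_i + \varepsilon_{ij}$
Model 4 (treatment + slope): $y_{ij} = \alpha_i + \beta x_{ij} + \varepsilon_{ij}$
Model 5 (interaction): $y_{ij} = \alpha_i + \beta_i x_{ij} + \varepsilon_{ij}$

Correlation Coefficient Strength: −1 < strong < −0.8 < moderate < −0.5 < weak < 0.5 < moderate < 0.8 < strong < 1.
Least Squares Regression Line: The line that minimizes the sum of squared deviations.
Residual: The difference between an observed y value and the corresponding predicted y value.
Coefficient of Determination ($r^2$): The proportion of variability in y that can be attributed to an approximate linear relationship between x and y.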
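The one-way ANOVA sums of squares, mean squares, and observed F statistic listed in this sheet can be computed directly from grouped data. The sketch below is a minimal standard-library illustration; the function name, the dictionary keys, and the three small groups are hypothetical. It stops at Fobs, which would then be compared with Fcrit from an F table with (a − 1, N − a) degrees of freedom.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of groups of numbers."""
    a = len(groups)                           # number of groups
    n_total = sum(len(g) for g in groups)     # N, total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # SSgroup = sum over groups of n_i * (group mean - grand mean)^2
    ss_group = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # SStotal = sum over all observations of (x_ij - grand mean)^2
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    # SStotal = SSgroup + SSerror, so SSerror follows by subtraction
    ss_error = ss_total - ss_group

    ms_group = ss_group / (a - 1)
    ms_error = ss_error / (n_total - a)
    return {
        "ss_group": ss_group,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "df_group": a - 1,
        "df_error": n_total - a,
        "f_obs": ms_group / ms_error,
    }


# Hypothetical data: three treatment groups with three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```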
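Standard Deviation about the Least Squares Regression Line ($s_e$): The typical amount by which an observation deviates from the least squares regression line.
$r = \dfrac{\sum z_x z_y}{n - 1}$
$s_e = \sqrt{\dfrac{\text{SSResid}}{n - 2}}$
Sum of Squared Deviations: $\text{SSResid} = \sum [y - (a + bx)]^2$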
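The correlation coefficient, the least squares line, the coefficient of determination, and the standard deviation about the line can be tied together in one short sketch. This is a minimal standard-library illustration; the function names and the five data points are hypothetical, and the slope and intercept use the usual least squares formulas b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b x̄, which the notes do not state explicitly.

```python
from math import sqrt
from statistics import mean, stdev


def correlation(x, y):
    """r = sum(z_x * z_y) / (n - 1), using sample standard deviations."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    return sum(((xi - mx) / sx) * ((yi - my) / sy) for xi, yi in zip(x, y)) / (
        len(x) - 1
    )


def least_squares(x, y):
    """Intercept a and slope b of the line minimizing squared deviations."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b


def fit_summary(x, y):
    """Return (SSResid, r^2, s_e) for the least squares line."""
    a, b = least_squares(x, y)
    my = mean(y)
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - my) ** 2 for yi in y)
    r_squared = 1 - ss_resid / ss_total
    s_e = sqrt(ss_resid / (len(x) - 2))
    return ss_resid, r_squared, s_e


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = least_squares(x, y)
r = correlation(x, y)
ss_resid, r_squared, s_e = fit_summary(x, y)
print(f"y-hat = {a:.2f} + {b:.2f}x, r = {r:.3f}, r^2 = {r_squared:.2f}, s_e = {s_e:.3f}")
```

For a line with a single predictor, the printed r² equals the square of r, which is a quick consistency check on the calculation.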