Review of Chapters 1-6

2. Sampling and Measurement
• Randomization: the mechanism for achieving reliable data by reducing potential bias
• Experimental vs. observational studies
3. Descriptive Statistics
• Sample statistics / Population parameters
4. Probability Distributions
• Like frequency dist’s, probability distributions have mean and standard deviation
• Normal distribution
• Notes about z-scores
• Sampling dist. of sample mean
• Central Limit Theorem: For random sampling with “large” n, sampling dist of sample mean is approximately a normal distribution
5. Statistical Inference: Estimation
• Confidence Interval for a Proportion (in a particular category)
• Finding a CI in practice
• CI for a population mean
6. Statistical Inference: Significance Tests
• Five Parts of a Significance Test
• Significance Test for Mean
• Significance Test for a Proportion
• Error Types
• Limitations of significance tests

We review some important themes from the first 6 chapters

1. Introduction
• Statistics: set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods for
  - Design: planning/implementing a study
  - Description: graphical and numerical methods for summarizing the data
  - Inference: methods for making predictions about a population (total set of subjects of interest), based on a sample

2. Sampling and Measurement
• Variable: a characteristic that can vary in value among subjects in a sample or a population.
Types of variables
• Categorical
• Quantitative
• Categorical variables can be ordinal (ordered categories) or nominal (unordered categories)
• Quantitative variables can be continuous or discrete
• Classifications affect the analysis; e.g., for categorical variables we make inferences about proportions and for quantitative variables we make inferences about means (and use t instead of normal dist.)

Randomization: the mechanism for achieving reliable data by reducing potential bias
Simple random sample: In a sample survey, each possible sample of size n has the same chance of being selected.
Randomization in a survey is used to get a good cross-section of the population. With such probability sampling methods, standard errors are valid for telling us how close sample statistics tend to be to population parameters. (Otherwise, the sampling error is unpredictable.)

Experimental vs. observational studies
• Sample surveys are examples of observational studies (merely observe subjects without any experimental manipulation)
• Experimental studies: Researcher assigns subjects to experimental conditions.
  - Subjects should be assigned at random to the conditions (“treatments”)
  - Randomization “balances” treatment groups with respect to lurking variables that could affect response (e.g., demographic characteristics, SES), makes it easier to assess cause and effect

3. Descriptive Statistics
• Numerical descriptions of center (mean and median), variability (standard deviation: typical distance from mean), position (quartiles, percentiles)
• Bivariate description uses regression/correlation (quantitative variables), contingency table analysis such as chi-squared test (categorical variables), analyzing difference between means (quantitative response and categorical explanatory)
• Graphics include histogram, box plot, scatterplot
• Mean drawn toward longer tail for skewed distributions, relative to median.
• Properties of the standard deviation s:
  - s increases with the amount of variation around the mean
  - s depends on the units of the data (e.g., measuring in euros vs. dollars)
  - Like mean, affected by outliers
• Empirical rule: If distribution approx. bell-shaped,
  - about 68% of data within 1 std. dev. of mean
  - about 95% of data within 2 std. dev. of mean
  - all or nearly all data within 3 std. dev. of mean

Sample statistics / Population parameters
• We distinguish between summaries of samples (statistics) and summaries of populations (parameters). Denote statistics by Roman letters, parameters by Greek letters:
• Population mean = µ, standard deviation = σ, proportion = π are parameters. In practice, parameter values are unknown; we make inferences about their values using sample statistics.

4. Probability Distributions
Probability: With random sampling or a randomized experiment, the probability an observation takes a particular value is the proportion of times that outcome would occur in a long sequence of observations.
Usually corresponds to a population proportion (and thus falls between 0 and 1) for some real or conceptual population.
A probability distribution lists all the possible values and their probabilities (which add to 1.0)

Like frequency dist’s, probability distributions have mean and standard deviation
Mean: $\mu = E(Y) = \sum y\,P(y)$
Standard Deviation: measure of the “typical” distance of an outcome from the mean, denoted by σ
If a distribution is approximately normal, then:
• all or nearly all the distribution falls between µ - 3σ and µ + 3σ
• Probability about 0.68 falls between µ - σ and µ + σ

Normal distribution
• Symmetric, bell-shaped (formula in Exercise 4.56)
• Characterized by mean (µ) and standard deviation (σ), representing center and spread
• Prob. within any particular number of standard deviations of µ is same for all normal distributions
• An individual observation from an approximately normal distribution satisfies:
  - Probability 0.68 within 1 standard deviation of mean
  - 0.95 within 2 standard deviations
  - 0.997 (virtually all) within 3 standard deviations

Notes about z-scores
• z-score represents number of standard deviations that a value falls from mean of dist.
• A value y is z = (y - µ)/σ standard deviations from µ
• The standard normal distribution is the normal dist with µ = 0, σ = 1 (used as sampling dist. for z test statistics in significance tests)
• In inference we use z to count the number of standard errors between a sample estimate and a null hypothesis value.

Sampling dist. of sample mean
• ȳ is a variable, its value varying from sample to sample about population mean µ.
• Sampling distribution of a statistic is the probability distribution for the possible values of the statistic
• Standard deviation of sampling dist of ȳ is called the standard error of ȳ
• For random sampling, the sampling dist of ȳ has mean µ and standard error
$\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}} = \frac{\text{popul. std. dev.}}{\sqrt{\text{sample size}}}$

Central Limit Theorem: For random sampling with “large” n, sampling dist of sample mean is approximately a normal distribution
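
The standard error formula and the Central Limit Theorem can be checked with a small simulation. The sketch below is not from the slides: it assumes a skewed exponential population with µ = 1 and σ = 1, and the function names, number of replications, and seed are illustrative choices. It draws many random samples of size n, then compares the mean and standard deviation of the sample means with µ and σ/√n, and reports the share of sample means within 2 standard errors of µ.

```python
import random
import statistics


def simulate_sample_means(n, reps=5000, seed=0):
    """Return the means of `reps` random samples of size n.

    Assumption (not from the slides): the population is exponential
    with rate 1, which is right-skewed and has mu = 1, sigma = 1.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


def summarize(n, reps=5000, seed=0):
    """Compare the simulated sampling distribution with theory."""
    mu, sigma = 1.0, 1.0                 # parameters of the assumed population
    means = simulate_sample_means(n, reps, seed)
    center = statistics.fmean(means)     # should be close to mu
    spread = statistics.pstdev(means)    # should be close to sigma / sqrt(n)
    theory_se = sigma / n ** 0.5         # standard error from the formula
    # Share of sample means within 2 standard errors of mu; roughly 0.95
    # if the sampling distribution is approximately normal.
    within_2se = sum(abs(m - mu) <= 2 * theory_se for m in means) / reps
    return center, spread, theory_se, within_2se


if __name__ == "__main__":
    for n in (5, 50, 500):
        center, spread, theory_se, within_2se = summarize(n)
        print(f"n={n}: mean of ybar={center:.3f}, sd of ybar={spread:.3f}, "
              f"sigma/sqrt(n)={theory_se:.3f}, within 2 SE={within_2se:.3f}")
```

Even though the assumed population is far from bell-shaped, the simulated standard deviation of ȳ should track σ/√n closely, and the share within 2 standard errors should approach 0.95 as n grows.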

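The z-score formula z = (y - µ)/σ and the 0.68 / 0.95 / 0.997 probabilities quoted for the normal distribution can be reproduced with Python's standard library. This is a minimal sketch: the helper names and the example values (y = 130, µ = 100, σ = 15) are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def z_score(y, mu, sigma):
    """Number of standard deviations that y falls from the mean mu."""
    return (y - mu) / sigma


def prob_within(k):
    """Probability that a normal observation falls within k std. dev. of its mean.

    This is the same for every normal distribution, so the standard
    normal (mu = 0, sigma = 1) is used.
    """
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


if __name__ == "__main__":
    # Hypothetical example values, not taken from the slides.
    print(z_score(130, 100, 15))          # 2.0 standard deviations above the mean
    for k in (1, 2, 3):
        print(k, round(prob_within(k), 4))
```

The three probabilities come out near 0.68, 0.95, and 0.997, matching the empirical rule for bell-shaped data.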

UF STA 6126 - Review of Chapters 1-6
