
STATS 250 NOTES (condensed): Exam 1

Reporting decimal places: 3-4 places after the decimal point.

Chapter 1

Statistical investigation process (ADEUFR):
1. Ask a research question.
2. Design a study and collect data.
3. Explore the data: provide graphical displays and numerical summaries.
4. Use statistical analysis methods to draw inferences from the data.
5. Formulate conclusions: communicate findings and answer research questions.
6. Reflect and look forward: point out limitations and suggest further studies.

When the data were not collected for the research question (ITEUFR):
1. Import the data.
2. Tidy the data.
3. Explore the data: provide graphical displays and numerical summaries.
4. Use statistical analysis methods to draw inferences from the data.
5. Formulate conclusions: communicate findings and answer research questions.
6. Reflect and look forward: point out limitations and suggest further studies.

Categorical variables (or qualitative variables): place an individual or item into one of several groups or categories, which are called levels.
- Nominal vs. ordinal variables: the levels of an ordinal variable follow a specific order (small, medium, large).
- Summaries: frequency (counts) and relative frequency (decimals or percents).
- Graphs used to display: bar charts, pie charts.
- Example: ice cream flavor is a categorical variable. The flavors are the levels, the number of people who like each flavor is the frequency count, and reporting those counts as decimals or percentages gives the relative frequency.

Numerical variables (or quantitative or measurement variables): take on a wide range of numerical values, and it is sensible to do math with them.
- Discrete: numerical variables with jumps. Ex: 1, 2, 3, 4, 5.
- Continuous: can take any value in an interval or collection of intervals. Ex: 2.3, 6.7, 7.808, 8.9999.
- Numerical variables can have subtypes. Ex: ages 18-25, 25-34, 35-44.
- It's possible to turn numerical values into categorical values by grouping them, but not categorical values into numerical values.
- Graphs: histograms, boxplots, scatterplots.

Contingency tables: one way to summarize data in a table.

Data matrix: a common way to organize raw, unprocessed data, with rows and columns.

Population: the entire group we are interested in learning about. Ex: all undergraduate students in the US.
- We don't observe every case in a study. It would be time consuming and cost too much, and sometimes the process of measurement destroys the item.

Sample: a subset of the cases, often a small fraction of the overall population. Ex: undergraduate students at UMICH.
- It provides an estimate for the overall population and is less time consuming and less costly.
- A sample can be biased, so the way we sample is important.
- Biases: convenience sampling, response bias, non-response bias.

Anecdotal evidence: typically composed of unusual cases that are recalled based on their striking characteristics.

Sampling from a population. To draw inferences about a population:
1. The sample must be representative of the entire population.
2. Use random sampling: the subjects of a study or experiment should be selected randomly to ensure the sample is representative.

Explanatory and response variables:
- Explanatory: a variable that predicts the outcome.
- Response: a variable that is the outcome; it responds to the explanatory variable.

Two types of data collection: observational and experimental.

Observational studies: researchers collect data in a way that does not directly interfere with how the data arise. They simply observe.
- Interested in looking at the relationship between two or more variables.
- Data are usually collected only by monitoring what occurs.
- Making causal inferences based on observational studies is difficult but not impossible.

Experiments: the researcher directly influences the process by which data arise.
- Subjects are usually assigned to one or more treatments RANDOMLY.
- There is usually a control group, which may receive a placebo.
- Require that the primary explanatory variable in a study be assigned to each subject by the researchers.
- Making causal conclusions is reasonable, depending on the way the explanatory variable is assigned.

Simple random sample (SRS), for observational studies: an SRS of n observations from a population is one in which each possible sample of that size has the same chance of being the sample that is selected.
- So every member (case) in the population has an equal chance of being included, and there is no implied connection between the members (cases) in the sample.
- The best way of ensuring that the sample is representative of the population it is chosen from.

Stratified sampling, for experiments: makes groups as similar as possible, dispersing confounding factors evenly between groups, with the only difference being due to the treatment.
- Strata: groups of individuals or cases in a population who share characteristics thought to be associated with the variable we want to measure.
- A divide-and-conquer sampling strategy: the population is divided into non-overlapping strata, and an SRS is taken from each stratum.
- Works when there is variability between strata but not much variability within each stratum.

Convenience sampling: samples obtained by measuring whatever or whoever is available to be measured, so the nearest, most convenient cases and subjects.
- Rarely representative of a larger population.
- The sampling method is often biased.

BIAS in sampling: results obtained from a survey are biased if the method used to obtain them would consistently produce values that are either too high or too low.
- Selection bias: occurs when the method for selecting participants produces a sample that does not represent the population of interest. Ex: convenience sampling.
- Nonresponse bias: occurs when a representative sample is chosen for a survey but a subset cannot be contacted or does not respond. Ex: a voluntary response sample, or surveying only a specific group of people and not the entire population.
- Response bias: occurs when participants respond differently from how they truly feel. The way questions are worded, the way the interviewer behaves, and many other factors might lead an individual to provide false information.
- Sampling bias: occurs when the method for selecting participants causes some individuals in the population to be more or less likely to be included in the sample than others. Ex: convenience sampling, or any type of sampling that is not random.

Limits of observational studies:
- Confounding variables are variables that are associated with both the explanatory and response variables. These are hidden variables that affect the conclusions we make.
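
The frequency and relative frequency summaries for a categorical variable, as described in these notes, can be sketched in Python. This is only an illustration: the ice cream responses are made-up data, and `frequency_table` is a hypothetical helper name, not something from the course.

```python
from collections import Counter

# Made-up survey responses (illustration only): each value is one person's
# favorite flavor, so the distinct flavors are the levels of the variable.
responses = [
    "vanilla", "chocolate", "vanilla", "mint",
    "chocolate", "vanilla", "mint", "vanilla",
]


def frequency_table(values):
    """Return {level: (count, relative_frequency)} for a categorical variable."""
    counts = Counter(values)   # frequency: how many times each level occurs
    total = len(values)        # number of cases
    return {level: (count, count / total) for level, count in counts.items()}


table = frequency_table(responses)
for level, (count, rel) in sorted(table.items()):
    # Relative frequency printed with 3 places after the decimal point.
    print(f"{level:10s} count={count}  relative frequency={rel:.3f}")
```

The relative frequencies always add up to 1 (or 100%), which is a quick check that the table was built correctly.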
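
Turning a numerical variable into a categorical one by grouping, as in the age-range example from these notes, might look like the sketch below. The function name `age_group` is hypothetical, and the handling of the boundaries is an assumption: the notes list 18-25 and 25-34, so here an age of exactly 25 is placed in the second group.

```python
def age_group(age):
    """Map a numerical age to a categorical level.

    Assumption (not from the notes): an age of exactly 25 goes in "25-34",
    and ages outside all listed ranges are labeled "other".
    """
    if 18 <= age < 25:
        return "18-25"
    if 25 <= age < 35:
        return "25-34"
    if 35 <= age < 45:
        return "35-44"
    return "other"


ages = [19, 25, 31, 44, 52]               # made-up numerical data
groups = [age_group(a) for a in ages]     # now a categorical variable
print(groups)
```

Going the other way is not possible: once only the group label is kept, the exact age cannot be recovered.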
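
The difference between a simple random sample and a stratified sample, as described in these notes, can be sketched with Python's standard `random` module. The population of students, the function names, and the choice of class year as the stratum are all assumptions made for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw an SRS: every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)   # sampling without replacement


def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Split the population into non-overlapping strata, then take an SRS
    of size n_per_stratum from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for case in population:
        strata.setdefault(stratum_of(case), []).append(case)
    return {name: rng.sample(cases, n_per_stratum)
            for name, cases in strata.items()}


# Hypothetical population: 100 students, 25 in each class year.
years = ("freshman", "sophomore", "junior", "senior")
population = [(f"{year}-{i}", year) for year in years for i in range(25)]

srs = simple_random_sample(population, 10, seed=1)
by_year = stratified_sample(population, lambda case: case[1], 3, seed=1)

print(len(srs), "students in the SRS")
for year, cases in by_year.items():
    print(year, [student_id for student_id, _ in cases])
```

An SRS can, by chance, leave a class year out entirely; the stratified sample guarantees that each stratum is represented.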
