Chapter 1-2: Statistics, Variables, Data

Statistics
  Descriptive: graphical and numerical summaries of data
  Inferential: how to generalize from a sample to the population

Variables
  Quantitative (numerical): numbers, measurements
    Discrete: gaps between possible values
    Continuous: any value in a range, can use decimals
  Qualitative (categorical): classifies observations
    Ordinal: categories can be ranked (GPA)
    Nominal: no order

Data
  o Interval: no meaningful zero (temperature: 0 still has a value)
  o Ratio: meaningful zero (income: 0 means no income)
  o Time series: measures one item over time (one stock)
  o Cross-sectional: multiple items at one time (compare stocks)

Chapter 3: Sampling, Bias

Statistical sampling
  SRS (simple random sample): everyone has an equal chance of selection
  Stratified: groups are different from each other; take an SRS from each group and combine the data
  Cluster: groups are similar to each other; take a census of one or a few groups
  Systematic: every kth person, where k = population size / sample size

Non-statistical sampling
  Voluntary: people don't have to respond (email survey)
  Convenience: ask whoever is around

Bias
  Selection: certain people are excluded
  Non-response: a large number of people don't respond
  Voluntary response: mostly people with strong opinions answer (advertisement)
  Undercoverage: part of the population is left out of, or underrepresented in, the sample

Chapter 4: Tables, Charts

  Explanatory variable: explains or causes changes in another variable (x value)
  Response variable: affected by the explanatory variable (y value)
  Marginal distribution: distribution of one variable on its own, ignoring the other (the table totals)
  Conditional distribution: distribution of one variable within each category of the other; if the relative-frequency bars have equal heights across categories, the variables are independent
  Simpson's paradox: a trend that holds within each group can reverse or disappear when the groups are combined, so the combined data give the wrong answer

Chapter 5: Box Plots, Histograms

Shape
  Unimodal: one mode
  Bimodal: 2 modes
  Tail: part of the graph that trails off
  Skewed: some extreme values don't follow the bulk of the data
    Left: tail on the left, mode on the right
    Right: tail on the right, mode on the left
  Symmetric: mode in the middle
  Uniform: flat, no mode

Center, Spread of Distribution
  Mean: average of the data; use with symmetric data; not robust (influenced by outliers)
  Median: middle of the data; use when skewed; not affected by outliers
  Skewed: use median, IQR
  Symmetric: use mean, SD

Box plot
  o Plot Q1, median, Q3, max, min
  o Calculate and plot fences: upper $= Q_3 + 1.5\,(Q_3 - Q_1)$, lower $= Q_1 - 1.5\,(Q_3 - Q_1)$
  o Determine outliers: any value above the upper fence or below the lower fence

  IQR: $Q_3 - Q_1$; middle 50% of the data; ignores outliers
  Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$; standard deviation squared; in units$^2$
  St dev: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$; square root of the variance; roughly the typical distance from the mean; in the original units
  Z-score: $z = \frac{\text{value} - \text{mean}}{\text{st dev}}$; how many st devs a value is from the mean

Chapter 6: Scatter Plots, Correlation

Scatter plots
  o Correlation does not imply causation

  Residual: observed $-$ predicted; vertical distance of a point from the regression line
    Negative: the line overestimates
    Positive: the line underestimates
  Correlation coefficient $r$: strength and direction of a linear relationship
    $r = \frac{\sum z_x z_y}{n - 1}$, with $z_x = \frac{x - \bar{x}}{s_x}$ and $z_y = \frac{y - \bar{y}}{s_y}$
    $s_x$ = st dev of x, $s_y$ = st dev of y
  $r^2$: correlation squared; shows the % of variation in y explained by x (changes in x explain $r^2$% of the variation in y)
  Lurking variable: a variable left out of the analysis that affects both variables, so 2 otherwise unrelated variables look related
  Regression line: $\hat{y} = b_0 + b_1 x$, with $b_1 = r\,\frac{s_y}{s_x}$
    In z-scores: $\hat{z}_y = r\,z_x$, that is $\frac{\hat{y} - \bar{y}}{s_y} = r\,\frac{x - \bar{x}}{s_x}$
  Standard error: $s_e = \sqrt{\frac{\sum \text{residuals}^2}{n - 2}}$; tells the spread of the points about the regression line
  Sum of residuals $= 0$

Chapter 7: Probability

  Multiplication rule: $P(A \cap B) = P(A) \times P(B)$ if A and B are independent
  Addition rule: $P(A \text{ or } B) = P(A) + P(B)$ if A and B are mutually exclusive
  Complement rule: $P(A^c) = 1 - P(A)$
  General addition rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \cap B)$; use when A and B are not mutually exclusive (can happen at the same time)
  General multiplication rule: $P(A \cap B) = P(A) \times P(B \mid A)$; use when A affects B (picking cards)
  Conditional: $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$
  Independent: occurrence of A does not affect B (rolling dice)
  Dependent: occurrence of A DOES affect B (dealing cards)
  Mutually exclusive (AKA disjoint): if A occurs then B cannot (getting 2 grades on 1 test)
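Worked example for Chapter 5 (box plot fences, variance, st dev, z-score). This is only a sketch: the data set is made up, the helper names box_plot_summary and z_score are hypothetical, and the quartiles use Python's "inclusive" method, which is an assumption. Textbooks and calculators may use a slightly different quartile rule and give slightly different fences.

```python
import statistics


def box_plot_summary(data):
    """Return quartiles, IQR, fences, and outliers for a list of numbers."""
    # Quartile convention is an assumption; other rules exist.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1                      # IQR = Q3 - Q1
    lower_fence = q1 - 1.5 * iqr       # lower = Q1 - 1.5(Q3 - Q1)
    upper_fence = q3 + 1.5 * iqr       # upper = Q3 + 1.5(Q3 - Q1)
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "min": min(data), "q1": q1, "median": median, "q3": q3,
        "max": max(data), "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "outliers": outliers,
    }


def z_score(value, data):
    """How many sample st devs a value lies from the mean."""
    return (value - statistics.mean(data)) / statistics.stdev(data)


data = [2, 4, 4, 5, 6, 7, 8, 9, 30]    # made-up sample with one extreme value
summary = box_plot_summary(data)
print(summary)

# statistics.variance and statistics.stdev both divide by n - 1.
print("variance:", statistics.variance(data))
print("st dev:", statistics.stdev(data))
print("z-score of 30:", z_score(30, data))

# The mean is pulled toward the extreme value; the median is not.
print("mean:", statistics.mean(data), "median:", statistics.median(data))
```

The last line shows why the notes recommend median and IQR for skewed data: the single extreme value moves the mean well above the median.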
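Worked example for Chapter 6 (correlation and regression). A minimal sketch with made-up data; fit_line is a hypothetical helper name. The intercept formula b0 = ybar - b1 * xbar is not in the notes above; it is the standard least-squares intercept and is added here so the line can be evaluated.

```python
import math
import statistics


def fit_line(xs, ys):
    """Compute r, the least-squares line, its residuals, and the standard error."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)

    # r = sum(zx * zy) / (n - 1)
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)

    b1 = r * s_y / s_x            # slope from the notes
    b0 = y_bar - b1 * x_bar       # intercept (assumption: standard formula)

    # residual = observed - predicted
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

    # standard error = sqrt(sum of squared residuals / (n - 2))
    s_e = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return r, b0, b1, residuals, s_e


xs = [1, 2, 3, 4, 5]              # made-up data
ys = [2, 4, 5, 4, 5]
r, b0, b1, residuals, s_e = fit_line(xs, ys)

print("r:", r, "r squared:", r ** 2)
print("line: y-hat =", b0, "+", b1, "* x")
print("residuals:", residuals)
print("sum of residuals:", sum(residuals))   # zero up to rounding error
print("standard error:", s_e)
```

A negative residual marks a point the line overestimates, and a positive residual marks one it underestimates.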
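Worked example for Chapter 7 (probability rules). This sketch checks each rule by counting outcomes in a standard 52-card deck. The events chosen (heart, spade, king, two aces) and the helper name prob are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction
from itertools import permutations, product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = list(product(RANKS, SUITS))     # 52 (rank, suit) tuples


def prob(event, outcomes):
    """Probability of an event when all outcomes are equally likely."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


def is_heart(card):
    return card[1] == "hearts"


def is_spade(card):
    return card[1] == "spades"


def is_king(card):
    return card[0] == "K"


p_heart = prob(is_heart, DECK)
p_spade = prob(is_spade, DECK)
p_king = prob(is_king, DECK)

# Complement rule: P(not A) = 1 - P(A)
p_not_heart = prob(lambda c: not is_heart(c), DECK)

# Addition rule (mutually exclusive: a card cannot be a heart and a spade)
p_heart_or_spade = prob(lambda c: is_heart(c) or is_spade(c), DECK)

# General addition rule (not mutually exclusive: the king of hearts is both)
p_heart_and_king = prob(lambda c: is_heart(c) and is_king(c), DECK)
p_heart_or_king = prob(lambda c: is_heart(c) or is_king(c), DECK)

# Conditional probability: P(king | heart) = P(heart and king) / P(heart)
p_king_given_heart = p_heart_and_king / p_heart

# General multiplication rule (dependent events): deal two cards
# without replacement and ask for two aces.
two_card_deals = list(permutations(DECK, 2))
p_two_aces = prob(lambda d: d[0][0] == "A" and d[1][0] == "A", two_card_deals)

print("P(not heart):", p_not_heart)
print("P(heart or spade):", p_heart_or_spade)
print("P(heart or king):", p_heart_or_king)
print("P(king | heart):", p_king_given_heart)
print("P(two aces):", p_two_aces)
```

Suit and rank turn out to be independent for a single draw, since P(heart and king) equals P(heart) x P(king), while the two-card deal is dependent because the first card changes what is left in the deck.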