
Ch 1.1 Populations, Samples, and Processes

Data: information that is analyzed. Contains individuals and variables.
Population: entire group of individuals of interest (often massive).
Sample: smaller portion of individuals drawn from the population.
Parameter: a numeric trait of the population. Usually unknown (ex: average and percent).
Statistic: a numeric trait of the sample. Usually used to estimate the parameter.
Quantitative variables: numeric values on which arithmetic can be performed.
    Continuous: measured on a continuous scale (interval).
    Discrete: possible values can be listed.
Categorical variables: placed in groups (categories).
    Ordinal: can be arranged in a rank order.
    Nominal: no meaningful rank or order.
Distribution: describes what values a variable takes and how often it takes them.

Ch 1.2 Visual Displays for Univariate Data

Descriptive Statistics
Categorical: pie charts and bar graphs.
Quantitative: histograms, stem plots, dot plots, and box plots.
Shape: overall pattern of the data (symmetric vs. skewed), center, spread.
    Positive skew: long tail to the right; most observations are on the left.
    Negative skew: long tail to the left; most observations are on the right.
    Symmetric: most observations are in the middle and evenly spaced on both sides.
Center: point that splits the data in half, with half on either side.
Spread: how far apart the data values reach (min to max).
Outliers: observations outside of the general pattern.
Frequency: number of times the value occurs in the data set.
Relative frequency: proportion of times the value occurs in the data set.

Ex C:
    Interval      Count (Frequency)    Proportion (Relative Frequency)
    180 to 200    1                    1/25
    200 to 220    8                    8/25

Ch 1.3 Describing Distributions

Density function $f(x)$: describes the population or process distribution of a continuous variable $x$; its curve is called a density curve.
Probability mass function $p(x)$: describes the distribution of a discrete variable $X$.

Continuous distributions (ex: $X$ = random number between 0 and 1, $0 \le X \le 1$):
    $f(x) \ge 0$
    $\int_{-\infty}^{\infty} f(x)\,dx = 1$
    Proportion of values with $a \le x \le b$: $\int_a^b f(x)\,dx$
    $f(c)$ = height of $f$ when $x = c$

Discrete distributions:
    $0 \le f(x) \le 1$
    $\sum_x f(x) = 1$
    Proportion of values with $a \le x \le b$: $f(a) + f(a+1) + \dots + f(b)$
    $f(x)$ = long-run proportion of the time that $X = x$

Exponential distribution:
    $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ (with $\lambda > 0$), and $f(x) = 0$ otherwise

Ch 1.4 The Normal Distribution

    $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$, $-\infty < x < \infty$, $\sigma > 0$
    $\mu$ = mean (center), $\sigma$ = standard deviation (spread)
    $z = \frac{x - \mu}{\sigma}$: number of $\sigma$ that $x$ lies away from $\mu$

With $\Phi(z)$ denoting the proportion of standard normal values less than $z$:
    Proportion of values less than $x$: $\Phi\!\left(\frac{x-\mu}{\sigma}\right)$
    Proportion of values with $a < x < b$: $\Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)$
    Proportion of values greater than $x$: $1 - \Phi\!\left(\frac{x-\mu}{\sigma}\right)$

Empirical Rule: about 68% of values lie within $1\sigma$ of $\mu$, 95% within $2\sigma$, and 99.7% within $3\sigma$.

Weibull Distribution:
    $f(x) = \frac{\alpha}{\beta^{\alpha}}\, x^{\alpha-1} e^{-(x/\beta)^{\alpha}}$ for $x \ge 0$ (with $\alpha, \beta > 0$), and $f(x) = 0$ otherwise

Ch 1.6 Several Useful Discrete Distributions

Binomial Distribution
    1. $n$ fixed trials
    2. Each trial results in a success or a failure (discrete)
    3. Trials are independent
    4. Proportion of success $\pi$ remains the same from trial to trial

    $p(x) = \frac{n!}{x!\,(n-x)!}\, \pi^x (1-\pi)^{n-x}$, $x = 0, 1, \dots, n$

Ex: Five fair coins flipped independently.
    $P(x = 2) = \frac{5!}{2!\,(5-2)!}\,(0.5)^2 (1-0.5)^{5-2} = 0.3125$

    x    p(x)
    0    0.03125
    1    0.15625
    2    0.3125
    3    0.3125
    4    0.15625
    5    0.03125

Ex: Products are defective with probability 0.22 and sold in packages of 10.
    $P(x = 1) = \frac{10!}{1!\,(10-1)!}\,(0.22)^1 (1-0.22)^{10-1} = 0.2351$
    $P(x \ge 1) = 1 - p(0) = 1 - \frac{10!}{0!\,10!}\,(0.22)^0 (1-0.22)^{10} = 1 - 0.08336 = 0.9166$
    $P(x \le 2) = p(0) + p(1) + p(2)$

Poisson Distribution
    $p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$, $x = 0, 1, 2, \dots$
    $\sum_{x=0}^{\infty} p(x) = 1$

Ch 2.1 Measures of Center

Mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Median $\tilde{x}$: midpoint of the data.
Median of a continuous distribution $\tilde{\mu}$: $\int_{-\infty}^{\tilde{\mu}} f(x)\,dx = \frac{1}{2}$
Outliers pull the mean but not the median.
Mean and median measure the center in different ways.
Trimmed mean: ignores extreme outliers.
    A 10% trimmed mean drops the top 10% and the bottom 10% of the observations, then averages what remains.

Measures for Center of a Distribution
    Discrete: $E(x) = \mu = \sum_x x\, f(x)$
    Continuous: $E(x) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$
    Binomial: $E(x) = n\pi$
    Poisson: $E(x) = \lambda$
    Exponential: $E(x) = 1/\lambda$

Ch 2.2 Measures of Variability

Range: difference between min and max.
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
    Roughly the average squared deviation from the mean.
    $n - 1$ = degrees of freedom.
    $s^2 = 0$ if all observations equal the mean.
    Larger spread about $\bar{x}$ gives larger $s^2$.
    Not resistant to outliers.
Sample standard deviation: $s = \sqrt{s^2}$. Measures the spread.

Variance of a distribution
    Discrete: $\sigma^2 = \sum_x (x - \mu)^2 p(x)$
    Binomial: $\sigma^2 = n\pi(1 - \pi)$
    Poisson: $\sigma^2 = \lambda$
    Continuous: $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$

                                 Mean         Variance
    Data (Sample)                $\bar{x}$    $s^2$
    Distribution (Population)    $\mu$        $\sigma^2$

Ch 2.3 More Detailed Summary Quantities

Lower quartile $q_1$: cuts off the lower 25% of the data. $\int_{-\infty}^{q_1} f(x)\,dx = 0.25$
Upper quartile $q_3$: cuts off the upper 25% of the data (75% of the data lie below). $\int_{q_3}^{\infty} f(x)\,dx = 0.25$
Interquartile range: $IQR = Q_3 - Q_1$
    Measure of variability that is resistant to the effect of outliers.
    If $n$ is odd, include the middle value in both halves when finding $Q_1$ and $Q_3$.
Five-number summary: Min, $Q_1$, Median, $Q_3$, Max.
Box plots: show the five-number summary.
Modified box plots: show the five-number summary and outliers.
    Mild outliers: beyond $1.5 \cdot IQR$ but within $3 \cdot IQR$ of the nearest quartile.
    Extreme outliers: beyond $3 \cdot IQR$ from the nearest quartile.
Percentiles: the $100p$th percentile $\eta_p$ separates the smallest $100p\%$ of values from the remaining values. $\int_{-\infty}^{\eta_p} f(x)\,dx = p$
Normal quantile plot: plot of the (z quantile, observation) pairs.
    Linearity supports the assumption of normality.

Ch 3.1 Scatter Plots

Scatterplot: representation of two-variable data.
    Establishes the relationship between the explanatory and response variables.
    Some identical $x$ values have different $y$ values, so $y$ is not solely determined by $x$.
    The value of $y$ may be predicted by fitting a line in $x$.

Correlation coefficient $r$: assessment of the strength of the linear relationship between $x$ and $y$.
    $-1 \le r \le 1$
    $r = -1$: strong negative linear relationship
    $r = 0$: no linear relationship
    $r = 1$: strong positive linear relationship
    $r$ does not depend on the units of measurement of either variable.
    The value of $r$ does not depend on which variable is explanatory.

Ch 3.3

Can predict $y$ from $x$ with a linear relationship:
    $\hat{y} = a + bx$
    $a = \bar{y} - b\bar{x}$
    $b = r\,\frac{s_y}{s_x}$
Residual: $y - \hat{y}$, the difference between the actual and predicted values.
    $\hat{y}$ minimizes $SSResid = \sum_i (y_i - \hat{y}_i)^2$, the variation in $y$ not described by $\hat{y}$.
    $SSTo = \sum_i (y_i - \bar{y})^2$, the total variation in $y$.
Coefficient of determination: $r^2 = 1 - \frac{SSResid}{SSTo}$
    Proportion of the variation in $y$ described by $\hat{y}$.
    $0 \le r^2 \le 1$
    $r^2 = 1$: the linear relationship describes the data extremely well
    $r^2 = 0$: the linear relationship does not describe the data
Residual plot: plot of the (x, residual) pairs.
    Random scatter suggests linearity.
Extrapolation: use of $\hat{y}$ to make predictions outside of the observed data.
    Can be inaccurate, as the pattern may not hold true there.
Interpolation: use of $\hat{y}$ to make predictions within the observed data.

Ch 3.4 Nonlinear Relationships

Power transformation: $x \to x^p$, $y \to y^p$
    Linearize the situation by transforming $x$ and/or $y$.
[Figure: four panels numbered 1 to 4, each with x and y axes; details not recoverable from the preview]
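
The normal-distribution proportions in Ch 1.4 can be checked numerically. The sketch below is an illustration only, not part of the original notes: it uses Python's standard-library `statistics.NormalDist` for the standard normal cumulative proportion, and the helper name `proportion_between` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)


def proportion_between(a: float, b: float, mu: float, sigma: float) -> float:
    """Proportion of normal(mu, sigma) values with a < x < b.

    Standardizes both endpoints with z = (x - mu) / sigma, then takes the
    difference of the standard normal cumulative proportions.
    """
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)


# Empirical Rule: proportion within 1, 2, and 3 standard deviations of the mean.
within = {k: proportion_between(-k, k, mu=0.0, sigma=1.0) for k in (1, 2, 3)}

for k, prop in within.items():
    print(f"within {k} standard deviation(s): {prop:.4f}")
```

The three proportions come out near 0.68, 0.95, and 0.997, which is where the Empirical Rule percentages come from.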
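
The binomial examples in Ch 1.6 (five fair coins, and packages of 10 with a 0.22 defect probability) can be reproduced with a few lines of Python. This is a minimal sketch for checking the arithmetic; the function name `binomial_pmf` and the variable names are hypothetical, not from the notes.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for n independent trials, each with success proportion p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Five fair coins flipped independently: full table of p(x) for x = 0..5.
coin_table = [binomial_pmf(x, n=5, p=0.5) for x in range(6)]

# Packages of 10 products, each defective with probability 0.22.
p_exactly_one = binomial_pmf(1, n=10, p=0.22)
p_at_least_one = 1 - binomial_pmf(0, n=10, p=0.22)
p_at_most_two = sum(binomial_pmf(x, n=10, p=0.22) for x in range(3))

for x, prob in enumerate(coin_table):
    print(f"coins: p({x}) = {prob}")
print(f"P(x = 1)  = {p_exactly_one:.4f}")
print(f"P(x >= 1) = {p_at_least_one:.4f}")
print(f"P(x <= 2) = {p_at_most_two:.4f}")
```

The coin table sums to 1, as any probability mass function must, and the first two package probabilities round to the 0.2351 and 0.9166 given in the notes.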
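
The least-squares quantities in Ch 3.3 can be traced step by step on a small data set. The sketch below is illustrative only: the five (x, y) pairs are made-up values, and the sample correlation is computed with the usual formula $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x s_y}$, which the notes do not state explicitly.

```python
from statistics import mean, stdev

# Hypothetical data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (n - 1 divisor)

# Sample correlation coefficient (standard formula, an assumption here).
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

# Slope and intercept as given in the notes: b = r * s_y / s_x, a = y_bar - b * x_bar.
b = r * s_y / s_x
a = y_bar - b * x_bar

# Predicted values and residuals (actual minus predicted).
y_hat = [a + b * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

# Coefficient of determination: r^2 = 1 - SSResid / SSTo.
ss_resid = sum(e**2 for e in residuals)
ss_to = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - ss_resid / ss_to

print(f"y_hat = {a:.3f} + {b:.3f} x")
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

For a least-squares line, `r_squared` computed from the sums of squares agrees with the square of `r`, and the residuals sum to zero up to rounding. Both make quick sanity checks.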



Cal Poly STAT 314 - STAT 314 Notes
