UC STAT 2037 - 3. Descriptive Statistics 2

Descriptive Statistics, Part II
Laura Portell

Normal Distribution
The normal distribution is a probability distribution that is symmetric about the mean: data near the mean occur more frequently than data far from the mean.

Normal Distribution: Empirical Rule
- About 68.2% of the data lies within ±1 standard deviation of the mean.
- About 95.4% of the data lies within ±2 standard deviations of the mean.
- About 99.7% of the data lies within ±3 standard deviations of the mean.

Normal Distribution: Example
Suppose that the heights of a sample of women are normally distributed.
- Mean height is 166 cm.
- Standard deviation is 6 cm.
We can generalize that about 68.2% of the population are between 160 cm and 172 cm tall (the mean ± 1 standard deviation).

Normal Distribution
- Can have any mean and any positive standard deviation.
- The mean gives the location of the line of symmetry.
- The standard deviation describes the spread of the data.

Normal Distribution
[Figure: two panels of normal curves, "Same mean, different standard deviation" and "Different mean, same standard deviation".]
All normal distributions can be described by just two parameters: the mean and the standard deviation.

Normal Distribution: Example
Which normal curve has the greater mean? Which has the greater standard deviation?

Standard Normal Distribution
- A normal distribution with mean 0 and standard deviation 1.
- The standard normal distribution (Z distribution) is a way of standardizing the normal distribution.
- A common practice is to convert any normal distribution to the standardized form and then use the standard normal table to find probabilities.

Standard Normal Distribution
Any x value can be transformed into a z-score by using the formula
$z = \frac{x - \mu}{\sigma}$
where $\mu$ is the mean and $\sigma$ is the standard deviation of the distribution.

Normal Distribution: Z Table
Used to find probabilities associated with the standard normal curve.

Standard Normal Distribution
With this information you can determine the area under the curve that is:
- to the right of your data point
- to the left of the data point
- between two data points
- outside of two data points

Normal Distribution: Example
The scores on a certain college entrance exam are normally distributed with mean 82 and standard deviation 8. Approximately what percentage of students score less than 84 on the exam?
First, we find the z-score associated with an exam score of 84:
$z = \frac{x - \mu}{\sigma} = \frac{84 - 82}{8} = \frac{2}{8} = 0.25$
Next, we look up the value 0.25 in the z table, which gives 0.5987.
Approximately 59.87% of students score less than 84 on this exam.

Normal Distribution: Example
The weight of a certain species of dolphin is normally distributed with a mean of 400 pounds and a standard deviation of 25 pounds. Approximately what percentage of dolphins weigh between 410 and 425 pounds?
First, we find the z-scores associated with 410 pounds and 425 pounds:
$z_1 = \frac{x - \mu}{\sigma} = \frac{410 - 400}{25} = \frac{10}{25} = 0.4$
$z_2 = \frac{x - \mu}{\sigma} = \frac{425 - 400}{25} = \frac{25}{25} = 1$
Next, we look up the values 0.4 and 1 in the z table, which gives 0.6554 and 0.8413.
Lastly, we subtract the smaller value from the larger value:
$0.8413 - 0.6554 = 0.1859$
Thus, approximately 18.59% of dolphins weigh between 410 and 425 pounds.

Normal Distribution: Example
What is $P(Z \le 1.5)$?
$P(Z \le 1.5) = P(Z < 1.5) = 0.9332$
What is $P(Z > 1.5)$?
$P(Z > 1.5) = 1 - P(Z \le 1.5) = 1 - 0.9332 = 0.0668$
What is $P(-0.5 \le Z \le 1.0)$?
$P(Z \le 1.0) - P(Z \le -0.5) = 0.8413 - 0.3085 = 0.5328$

Shape: Skewness and Kurtosis
Understanding the shape of the data is crucial. It shows where most of the information lies and helps to analyze the outliers in a given data set. Skewness and kurtosis both describe the shape of a data set.

Skewness
Skewness is the degree of asymmetry observed in a probability distribution, that is, how far it deviates from the symmetrical normal distribution.

Skewness
1. Positively skewed (right-skewed): the right tail is longer and the measures of central tendency spread apart, typically with mean > median > mode.
2. Negatively skewed (left-skewed): the reverse of a positively skewed distribution. The left tail is longer, typically with mean < median < mode.

Skewness
Pearson's second coefficient of skewness:
$Sk_2 = \frac{3(\bar{x} - \text{median})}{s}$
where $\bar{x}$ is the mean and $s$ is the standard deviation.
Value of skewness (rule of thumb):
- skew = 0: perfectly symmetric
- skew between -0.5 and 0.5: approximately symmetric
- skew between -1 and -0.5 or between 0.5 and 1: slightly skewed
- skew less than -1 or more than 1: highly skewed

Kurtosis
Kurtosis is a statistical measure used to describe a distribution. It measures extreme values in either tail, that is, the degree to which outliers are present in the distribution.

Kurtosis
There are three types of distributions:
- Leptokurtic: kurtosis is greater than that of the normal distribution (K > 3). Sharply peaked with fat tails and less variable.
- Mesokurtic: kurtosis is comparable to that of the normal distribution (K = 3). Medium peaked.
- Platykurtic: kurtosis is less than that of the normal distribution (K < 3). Flattest peak and highly dispersed.

Bivariate Descriptive Statistics
Bivariate analysis is performed to determine the relationship between two variables. We often want to study the effect of one variable on another. For example, you might want to test whether students who spend more time studying get better exam scores.
The variables in a study of a cause-and-effect relationship are called the independent and dependent variables.
- The independent variable is the cause: the variable manipulated by an experimenter.
- The dependent variable is the effect: the event expected to change when the independent variable is manipulated.

Bivariate Descriptive Statistics
Common types of bivariate analysis include:
1. Scatter plots
A scatter plot is a type of data display that shows the relationship between two numerical variables. It gives you a visual idea of the pattern that your variables follow.

Bivariate Descriptive Statistics
2. Regression analysis
Regression analysis is a set of statistical methods used to estimate relationships between a dependent variable and one or more independent variables. It includes several variations, such as linear, multiple linear, and nonlinear regression. We only cover simple linear regression.

Bivariate Descriptive Statistics: Simple Linear Regression
Simple linear regression is a model that assesses the relationship between a dependent variable and an independent variable. The simple linear model is expressed using the following equation:
$Y = a + bX + \epsilon$
where
- Y = dependent variable
- X = independent variable
- a = intercept (the value of Y when X is 0)
- b = slope (regression coefficient)
- $\epsilon$ = error

Bivariate Descriptive Statistics
3. Correlation
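The z-table lookups in the normal distribution examples (exam scores, dolphin weights, and the P(Z) questions) can be checked numerically. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist` in place of a printed z table; the helper name `z_score` and the variable names are illustrative and do not come from the slides.

```python
from statistics import NormalDist

# Standard normal (Z) distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)


def z_score(x, mean, sd):
    """Standardize x with z = (x - mean) / sd (hypothetical helper)."""
    return (x - mean) / sd


# Empirical rule: area within k standard deviations of the mean.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

# Exam example: mean 82, standard deviation 8, P(score < 84).
p_exam = Z.cdf(z_score(84, 82, 8))

# Dolphin example: mean 400, standard deviation 25, P(410 < weight < 425).
p_dolphin = Z.cdf(z_score(425, 400, 25)) - Z.cdf(z_score(410, 400, 25))

# Areas to the left, to the right, and between two z values.
p_left = Z.cdf(1.5)                   # P(Z <= 1.5)
p_right = 1 - Z.cdf(1.5)              # P(Z > 1.5)
p_between = Z.cdf(1.0) - Z.cdf(-0.5)  # P(-0.5 <= Z <= 1.0)

for name, p in [("exam", p_exam), ("dolphin", p_dolphin),
                ("left", p_left), ("right", p_right),
                ("between", p_between)]:
    print(f"{name}: {p:.4f}")
```

Rounded to four decimals, these values agree with the table lookups used in the examples. The empirical-rule areas in `within` are the exact counterparts of the rounded percentages quoted on the slide.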
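Pearson's second coefficient of skewness is commonly defined as 3 × (mean − median) / standard deviation, and the rule of thumb from the skewness slide can be applied to it directly. The sketch below assumes the sample standard deviation and treats the boundary values ±0.5 and ±1 as belonging to the less extreme category; the slides do not specify either choice. The data sets and function names are hypothetical.

```python
from statistics import mean, median, stdev


def pearson_second_skewness(data):
    """Pearson's second coefficient: 3 * (mean - median) / sample std dev."""
    return 3 * (mean(data) - median(data)) / stdev(data)


def describe_skew(skew):
    """Apply the rule of thumb; boundary handling is an assumption."""
    size = abs(skew)
    if size == 0:
        return "perfectly symmetric"
    if size <= 0.5:
        return "approximately symmetric"
    if size <= 1:
        return "slightly skewed"
    return "highly skewed"


# Hypothetical data: a long right tail versus a symmetric sample.
right_tailed = [2, 3, 3, 4, 4, 4, 5, 6, 9, 15]
symmetric = [1, 2, 3, 4, 5]

skew_right = pearson_second_skewness(right_tailed)
skew_sym = pearson_second_skewness(symmetric)

print(f"right-tailed: {skew_right:.3f} ({describe_skew(skew_right)})")
print(f"symmetric: {skew_sym:.3f} ({describe_skew(skew_sym)})")
```

In the right-tailed sample the mean (5.5) sits above the median (4), so the coefficient is positive, which matches the description of a positively skewed distribution.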
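The simple linear model Y = a + bX + error can be fitted to paired data. The sketch below assumes ordinary least squares, which the slides do not name explicitly, and uses made-up study-hours and exam-score values purely for illustration; `fit_simple_linear` is a hypothetical helper.

```python
def fit_simple_linear(xs, ys):
    """Least-squares estimates of intercept a and slope b in Y = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar  # the fitted line passes through (x_bar, y_bar)
    return a, b


# Hypothetical data: hours studied (X) and exam score (Y).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

a, b = fit_simple_linear(hours, scores)

# Residuals are the estimated error terms, one per observation.
residuals = [y - (a + b * x) for x, y in zip(hours, scores)]

print(f"intercept a = {a:.2f}, slope b = {b:.2f}")
```

Here the slope b estimates the change in exam score per additional hour of study, and the intercept a is the predicted score at zero hours, matching the definitions of a and b given for the model.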

