STAT 110 1st Edition Lecture 8

Outline of Previous Lecture
I. Types of Variables
II. Distributions of Variables

Outline of Current Lecture
I. Shapes of Distributions
II. Stemplots
III. Measures of Center

Current Lecture
I. Shapes of Distributions
   a. A distribution is symmetric if the left and right sides of the histogram look approximately the same. An example of a symmetric distribution is a bell curve.
   b. A distribution is skewed to the right if the tail of the distribution points to the right. Right-skewed data typically has unusually large values (possible outliers) on the high end of the distribution. Example: income.
   c. A distribution is skewed to the left if the tail of the distribution points to the left. Left-skewed data typically has unusually small values (possible outliers) on the low end of the distribution. Example: birth weight.
   d. Overall Patterns of a Distribution
      i. Shape, center, and spread together describe the overall pattern of a distribution.
      ii. An outlier is a single observation or value that falls outside the overall pattern of the data.
         1. We must perform calculations before we can say that an extreme value is indeed an outlier. An outlier can be a genuine extreme value, or it can indicate a mistake in our measurements.
II. Stemplots
   a. A stemplot is a way to display data that is similar to a histogram, except that a stemplot preserves the individual data values.
      i. To create a stemplot:
         1. Separate each data point into a stem and a leaf. For example, for the data points 12, 14, 22, 35, the stems are 1, 2, and 3, and the leaves are 2, 4, 2, and 5. A quick sample of a stemplot with this data is shown below:
            1 | 2 4
            2 | 2
            3 | 5
         2. Write the stem values in a vertical column from smallest to largest.
         3. Write each leaf, in increasing numerical order, in the row next to the appropriate stem.
      Note: Stemplots should not contain decimal points, commas, or any other punctuation. If a stem has no leaves in your data set, still write the stem and leave its row empty as a placeholder. Line the leaves up vertically so that the shape of the distribution is visible.
III. Measures of Center
   a. The median is the midpoint of a distribution: half of the data points are higher than the median, and half are lower. To find the median of a distribution:
      i. Arrange all of the data in order from smallest to largest.
      ii. If the number of data points n is odd, the median is the center value. If n is even, the median is the average of the two center values.
      iii. The median is largely unaffected by outliers.
   b. The mean is the simple arithmetic average of the distribution. To find the mean of a distribution:
      i. Add up all of the values in the distribution, and then divide by the total number of data points. The mean is also known as the “balancing point” of the distribution. For a variable x, the mean is written “x-bar”: x-bar = (x₁ + x₂ + … + xₙ)/n, where n is the number of observations.
      ii. The mean is very strongly affected by outliers.
   c. The mode is the value that occurs most often in the data set. A data set does not have to have a mode, and it can have more than one. For example, in the data set 1, 2, 2, 2, 3, 4, 5, the mode is 2, because it occurs more frequently than any of the other data values.
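The three measures of center can be checked with a short Python sketch. It uses the standard library's statistics module on the mode example data set from these notes. The helper name center_summary and the extreme value 100 are made up for illustration and are not from the lecture.

```python
from statistics import mean, median, multimode


def center_summary(values):
    """Return the mean, median, and list of modes for a data set."""
    return {
        "mean": mean(values),        # sum of observations divided by n
        "median": median(values),    # center value, or average of the two center values
        "modes": multimode(values),  # every value tied for most frequent
    }


# Example data set from the notes.
data = [1, 2, 2, 2, 3, 4, 5]
print(center_summary(data))

# Hypothetical outlier (100) added to show how the mean and median react.
with_outlier = data + [100]
print(center_summary(with_outlier))
```

Comparing the two printed summaries shows the mean moving much further than the median once the extreme value is added, which matches the points about outliers above. multimode returns a list, so a data set with several equally common values reports all of them.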
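The stemplot steps from Section II can also be sketched in Python. This is a minimal illustration that assumes non-negative whole numbers and takes the last digit as the leaf. The function name stemplot and the vertical bar used to separate stems from leaves are choices made here, not part of the lecture.

```python
from collections import defaultdict


def stemplot(values):
    """Build stemplot rows for non-negative integers (last digit is the leaf)."""
    if not values:
        return []

    # Step 1: separate each value into a stem and a leaf.
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):  # sorting keeps leaves in increasing order
        stem, leaf = divmod(value, 10)
        leaves_by_stem[stem].append(leaf)

    # Step 2: list every stem from smallest to largest, including stems
    # with no leaves, so that gaps in the data remain visible.
    smallest, largest = min(leaves_by_stem), max(leaves_by_stem)
    width = len(str(largest))  # right-align stems so the leaves line up

    # Step 3: write the leaves next to the appropriate stem.
    rows = []
    for stem in range(smallest, largest + 1):
        leaves = " ".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        rows.append(f"{stem:>{width}} | {leaves}".rstrip())
    return rows


# Data points from the notes.
for row in stemplot([12, 14, 22, 35]):
    print(row)
```

A stem with no leaves still gets its own row, which follows the note about leaving a placeholder for empty stems.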