
Chapter 4
Box-and-Whisker Plots

©2011 Kris H. Green and W. Allen Emerson

What is this chapter about? It’s about taking data - possibly thousands of numbers - and finding a few measures (values) that help you make sense of the data and represent it effectively. You are probably already familiar with many of these tools, but may not have used them in the way that we describe here.

• Section 4.1 of the chapter shows you how to reduce the data to a single number representing the central tendency of the data.
• Section 4.2 of the chapter shows you how to reduce the data to several numbers and then represent these numbers in a graph.
• As a result of this chapter, students will learn
  √ What a statistic is and what it is used for
  √ What an average is and what the common ways of determining an average are
  √ What quartiles are and what they tell you about data
  √ What an outlier is
  √ What a boxplot is, how to read the information in a boxplot, and how to interpret boxplots
  √ How to compare data sets in order to answer real-world problems
• As a result of this chapter, students will be able to
  √ Compute various summary statistics by hand, with Excel, and with add-ins like StatPro
  √ Make a boxplot by hand or with StatPro
  √ Incorporate graphs made in Excel into a Word document effectively to support your work
  √ Refer to cells in Excel in order to use them in calculations
  √ Explain what happens to various statistics if the data is increased by a constant amount or by a fixed percentage

4.1 What Does “Typical” Mean?

So far, we’ve got a lot of information: spreadsheets filled with data that we arranged into variables and observations. But what do we do with all this? Unless you’re really special, you probably can’t learn a lot from looking at a list of one thousand numbers.
You probably know even less from looking at a thousand observations for each of four different variables. Sets of data in business and science are usually larger than this, so we need to think of something fast.

The key is to take it slowly. Rather than look at the entire set of data, we want to look at the data one variable at a time in order to find out what that one variable tells us about the situation the data describes. To make things even easier, we want to reduce the data down to one number that represents the “typical” data point for that variable. In general, a number used to represent an entire variable is called a statistic. If that statistic is meant to represent the typical data point, we call it an average.

Watch out, though: the word “average” doesn’t really mean what you probably think it does. It has a much more general meaning than “add up the data and divide by the number of data points.” That’s only one method of computing an average. There are many others. In this chapter, we’re interested in the three most common averages: the mean, the median, and the mode.

Another way to think of an average comes from the phrase central tendency. This refers to the middle of the data. You’ll always have some data above the average and some below it. The average is a way of talking about the middle of the data. The three described here (mean, median, mode) are the most commonly used ways to compute the middle. Each has a different meaning and has different applications. All are correct ways to compute the middle; it’s just that sometimes one is more appropriate than the others.
When you go about computing an average, you may need to check all three statistics (mean, median, and mode) in order to determine which will be the most appropriate measure of the typical data point.

If you’ve understood the ideas above, you might be amused by the statement below, which was issued by Joan Barb Briggs, the president of Generic University, in a moment of administrative desperation:

By the end of the next academic year, I want all of our instructors to have above average course evaluations.

4.1.1 Definitions and Formulas

Statistic: Any number used to represent many observations of a single variable or that relates several variables together.

Average: A statistic that is intended to provide a measure of what a “typical” data point is for a single variable.

Mean: An average computed by adding all the observations of a variable together and then dividing by the number of observations. In symbols, the mean of the data $x_i$ is $\bar{x} = \sum x_i / n$. This is more properly called the arithmetic mean. Excel uses AVERAGE for the mean, which is the most commonly used average, and it is the most robust average (it will change the least under repeated sampling of the population).

Median: An average computed by first ordering the observations from smallest to largest and then finding the number that splits the observations in half. Observe that this number may or may not be a data point, depending on whether there are an even or odd number of observations. At least 50% of the observations are less than or equal to the median and at least 50% are greater than or equal to the median. If there are an even number of points, the median is in between the two center numbers (see example 2 (page 105)).

Mode: An average computed by determining which observation(s) is repeated most often (or most frequently). The mode is not necessarily unique, nor is it guaranteed to even exist.
This is really only useful for discrete numerical data with a few possible values or for categorical data.

4.1.2 Worked Examples

Example 4.1. Computing Mean and Median with an Odd Number of Data Points

For this example, we want to compute the mean, median and mode of a set of test scores:

55, 60, 67, 70, 78, 81, 84, 88, 90, 95, 99

The mode is the most frequently occurring observation. Since none of the test scores is repeated, there is no mode. We computed the mean of this data in example 1 (page 66) and found it to be about 78.82. Computing the median of the data requires us to put the data in order (this has been done already) and identify the data point in the middle of the ordered list. There are 11 points, so we want the 6th data point (that leaves five numbers less than that observation and five greater than that observation). This makes the median 81, which is slightly higher than the mean, indicating that many students did “above average” on the test. We call a distribution like this “skewed to the left”, since the mean is smaller than (to the left of) the median.

55, 60, 67, 70, 78, 81, 84, 88, 90, 95, 99
Lowest
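
The arithmetic in Example 4.1 can be checked with a short script. The chapter itself works by hand, in Excel, and with StatPro, so the Python below is only an illustrative sketch and not the authors' method. The helper name `modes` is hypothetical; it is written so that a data set with no repeated value reports no mode, matching the definition used in this chapter.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """Return the most frequent value(s), or an empty list if nothing repeats.

    Hypothetical helper for illustration; assumes `values` is non-empty.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Every observation occurs once, so there is no mode.
        return []
    # The mode need not be unique, so return every value tied for most frequent.
    return sorted(v for v, c in counts.items() if c == highest)


# Test scores from Example 4.1 (already in increasing order).
scores = [55, 60, 67, 70, 78, 81, 84, 88, 90, 95, 99]

score_mean = mean(scores)      # sum of the scores divided by the 11 observations
score_median = median(scores)  # the 6th of the 11 ordered scores
score_modes = modes(scores)    # no score is repeated

print(round(score_mean, 2))  # 78.82
print(score_median)          # 81
print(score_modes)           # []
```

The mean comes out below the median, which is the comparison the example uses to describe the scores as skewed to the left.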



SJFC MSTI 130 - Box-and-Whisker Plots
