Chapter 5: Understanding and Comparing Distributions
Chapter05 Presentation 1213
Copyright 2009 Pearson Education, Inc.

The Big Picture
Below is a histogram of the Average Wind Speed for every day in 1989.

The Big Picture (cont.)
Which direction is this distribution skewed?
The high value (8.67) may be an outlier.
Median daily wind speed is 1.90 mph and the IQR is 1.78 mph.
Can we say more?

The Five-Number Summary
The five-number summary of a distribution reports its median, quartiles, and extremes (maximum and minimum).

Max     8.67
Q3      2.93
Median  1.90
Q1      1.15
Min     0.20

Making Boxplots
[Figure label: Daily Wind Speed]
A boxplot is a graphical display of the five-number summary of a data set and makes a judgment about whether or not there are potential outliers in the data set.
Boxplots are particularly useful when comparing groups.

Constructing Boxplots
1. Draw a single vertical axis spanning the range of the data.
2. Draw short horizontal lines at the lower and upper quartiles and at the median.
3. Then connect them with vertical lines to form two boxes.
4. Erect fences around the main part of the data. The upper fence is 1.5 IQRs above the upper quartile. The lower fence is 1.5 IQRs below the lower quartile.
5. Use the fences to grow whiskers. Draw lines from the ends of the box up and down to the most extreme data values found within the fences.
6. Add the outliers by displaying any data values beyond the fences with special symbols. Some software packages use a different symbol for far outliers, those farther than 3 IQRs from the quartiles.
The final boxplot does not display the fences.

Making Boxplots (cont.)
[Figure label: Wind Speed]
Compare the histogram and boxplot for daily wind speeds. How does each display represent the distribution?

Class Activity: Drawing a Box Plot
On the next page you will see a sample of some data collected from a previous Stat 201 class. The data represent the number of Facebook friends these 32 students reported they had (the data are already sorted).
You will also find some summary statistics needed to construct the box plot.
After drawing your box plot, compare your work with your teammate's.

IQR = Q3 - Q1
Upper Fence = Q3 + 1.5 IQR
Lower Fence = Q1 - 1.5 IQR
[Figure: axis from 0 to 3500 in steps of 500]

Comparing Groups
Boxplots can be plotted side by side for groups or categories we wish to compare.
What do these boxplots tell you?

Comparing Groups (cont.)
When making comparisons with histograms, make sure the horizontal scales are the same.
The data used for the example on the next page represent the number of cigarettes (in hundreds) made per day by 2 different machines over a 30-day period.
Which set of histograms makes the differences between these two machines clear?
[Figure panels: Default Output; Common Horizontal Scales]

Timeplots: Order, Please!
For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data.

Smoothing Timeplots
When a timeplot has lots of point-to-point variation, it is difficult to see the overall trends in the data.
A smooth trace of the data can be added to help see the overall trends that exist.

Smoothing Timeplots (cont.)
A moving average of the original data is one way to smooth the data.
[Figure panels: Original Data; 5-Item Moving Average; 15-Item Moving Average]

Other Statistical Topics Related to Time-Ordered Data
Time Series Analysis: looking for patterns in time-ordered data. Issues such as the existence of seasonality, long-term trends, and the impact of the economy are addressed to allow for making reasonable forecasts of the future. At UT, Statistics 475 is devoted to this topic.
Statistical Process Control (SPC): using time-ordered data to help businesses improve the quality of their services and/or products. At UT, Statistics 340 contains material on this topic.

Other Statistical Topics Related to Time-Ordered Data (cont.)
The primary tool of SPC is the Control Chart.
A control chart is a timeplot with the average and Control Limits reported.
The control limits define the amount of variation in the data that can be attributed to chance variation. Points outside the control limits probably have some sort of explanation for their behavior.

Beware of Misleading Timeplots
Time is on the x-axis in this image. What is on the y-axis?

Histograms vs. Timeplots
Winning Times in the Kentucky Derby (in Seconds) from 1896 to 2008.
What does the timeplot (run chart) reveal that the histogram does not?
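
The fence rule from the boxplot construction steps can be checked numerically against the wind speed summary quoted earlier (Q1 = 1.15, Q3 = 2.93, Max = 8.67, Min = 0.20). The sketch below is only an illustration: the function names `fences` and `classify` are hypothetical, and the "far outlier" cutoff of 3 IQRs follows the convention the slides attribute to some software packages.

```python
# Quartiles and extremes quoted on the slides for daily wind speed (mph).
Q1, Q3 = 1.15, 2.93
MINIMUM, MAXIMUM = 0.20, 8.67


def fences(q1, q3, k=1.5):
    """Return (lower, upper) fences placed k IQRs beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def classify(value, q1, q3):
    """Label a value as 'inside', 'outlier', or 'far outlier'.

    'outlier' means beyond the 1.5 IQR fences; 'far outlier' means
    beyond 3 IQRs from the nearer quartile (an assumed convention).
    """
    lower, upper = fences(q1, q3, k=1.5)
    far_lower, far_upper = fences(q1, q3, k=3.0)
    if value < far_lower or value > far_upper:
        return "far outlier"
    if value < lower or value > upper:
        return "outlier"
    return "inside"


lower, upper = fences(Q1, Q3)
print(f"IQR = {Q3 - Q1:.2f}")          # IQR = 1.78
print(f"lower fence = {lower:.2f}")    # lower fence = -1.52
print(f"upper fence = {upper:.2f}")    # upper fence = 5.60
print("max:", classify(MAXIMUM, Q1, Q3))
print("min:", classify(MINIMUM, Q1, Q3))
```

With these numbers the upper fence sits at about 5.60 mph, so the maximum of 8.67 lies beyond it, which agrees with the slides' remark that the high value may be an outlier. The minimum of 0.20 is inside the fences, so the lower whisker would reach all the way to it.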
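
The moving-average smoothing mentioned in the Smoothing Timeplots slides can be sketched as follows. The slides do not say how the 5-item and 15-item averages were computed, so this assumes a simple unweighted average over each run of consecutive values; the function name `moving_average` and the sample series are made up for illustration and are not the data from the slides.

```python
def moving_average(values, window):
    """Return the unweighted mean of every run of `window` consecutive values.

    The result has len(values) - window + 1 entries, one per full window.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Made-up series with a lot of point-to-point variation (illustration only).
series = [3, 9, 2, 8, 4, 10, 5, 11, 6, 12]

smoothed = moving_average(series, 5)
print(smoothed)
```

A wider window averages over more points, so it gives a smoother trace but also fewer smoothed values and a slower response to real changes in the series.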