9/1/09 Lecture 2-2
STOR 155 Introductory Statistics
Lecture 2-2: Displaying Distributions with Graphs
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Recall
• Data:
  – Individuals
  – Variables
    • Categorical variables
    • Quantitative variables
• Distribution of variables
• Graphical tools for categorical data
  – Bar graph
  – Pie chart
• Graphical tools for quantitative data
  – Stemplot

Example: A study on litter size
• Data: (170 observations)
4 6 5 6 7 3 6 4 4 6 4 4 9 5 10 6 6 5 6 8
2 7 7 7 9 3 7 5 7 7 4 5 5 6 7 6 7 8 6 6
7 6 6 7 5 4 5 6 6 1 3 4 7 5 4 7 5 8 8 5
6 8 5 5 4 9 6 7 3 7 7 5 4 6 9 6 7 7 5 7
3 7 6 5 3 7 10 5 6 8 7 5 5 7 5 5 8 9 7 5
7 5 5 5 6 3 7 8 7 7 6 3 4 4 4 7 2 7 8 5
8 6 6 5 6 4 7 5 5 6 9 3 5 4 8 3 9 8 3 6
5 4 7 8 4 8 6 8 5 6 4 3 8 8 6 9 5 5 6 6
7 6 8 6 11 6 5 6 6 3

Stem-and-leaf plot for pups
0|122333333333333344… (35)
0|555555555555555555555555... (132)
1|001

Histogram
• Breaks the range of values of a quantitative variable into intervals and displays only the count or percent of the observations that fall into each interval.
• You can choose any convenient number of intervals.
• Intervals must be of equal width (except at the two ends?)

Example: A study on litter size

Data analysis in action: show steps in making a histogram …

Data analysis in action: count

Example: Call Center Data
• Financial firm call center
• Calls handled by Avi within 60 seconds
  – October: 666
  – December: 523

October
[Figure: histogram; x-axis: calling time, 6 to 60 in steps of 6; y-axis: Frequency, 0 to 120]

December
[Figure: histogram; x-axis: calling time, 6 to 60 in steps of 6; y-axis: Frequency, 0 to 120]

Notes for Making Histogram
• Choose the number of classes sensibly (Fig 1.4, 1.8).
• Intervals must be of equal width.
• Areas of the bars are proportional to the frequency.

Examining Distributions
• Overall Pattern
  – Shape
  – Center (numerical, Lecture 3)
    • midpoint
  – Spread (numerical, Lecture 3)
    • range
• Deviations
  – Outliers: values that fall outside the overall pattern.

Shapes of Distributions
• Graphs can help to determine shapes.
  – Modes: local peaks of a distribution.
    • Unimodal: one peak
    • Bimodal: two peaks
  – Symmetric or skewed?

Shakespeare’s Words: Unimodal

Tuition and fees: bimodal or trimodal

A bimodal histogram
[Figure: histogram with two peaks, each labeled "A modal class"]

Right skewed
Left skewed

Iowa Test of Basic Skills vocabulary scores

A study on litter size

Bell-shaped Histograms

Summary: Shapes of Distributions
• Symmetric:
  – a histogram in which the right half is a mirror image of the left half.
• Skewed to the right:
  – a histogram in which the right tail is more stretched out than the left (long tail to the right).
• Skewed to the left:
  – a histogram in which the left tail is more stretched out than the right (long tail to the left).
• Number of modal classes:
  – the number of distinct peaks in a histogram.
• Bell-shaped:
  – a histogram that looks like a bell.

Time plots
• A time plot of a variable plots each observation against the time at which it was measured.
  – Time: x-axis
  – Variable: y-axis
  – Examples: stock price, unemployment rate, daily temperature
  – Great for identifying changing patterns over time.
• What to look for
  – Trend
  – Seasonal variations
  – Major deviations

Example: Number of Suicides in USA (1900-1970)

Call Center: Daily Call Volume in Sep. 2002
[Figure: Time Plot of # of Calls for Agent By Date (in September); x-axis: Date (in September), 0 to 30; y-axis: # of Calls for Agent, 0 to 70000]

Outliers
• Observations that lie outside the overall pattern of a distribution.
• Possible reasons:
  – error in data entry (most likely reason)
    • Equipment failure
    • Human error
    • Missing value code
  – extraordinary individuals (Jordan’s salary)

Handling Outliers
• Detect them using graphical and numerical methods.
• Check the data to make sure they were entered correctly.
• Reduce the influence of an outlier:
  – Delete the observation (BE CAREFUL!)
  – Use transformations or robust methods.

Call Center: Daily Call Volume in Sep. 2002
[Figure: Time Plot of # of Calls for Agent By Date (in September); x-axis: Date (in September), 0 to 30; y-axis: # of Calls for Agent, 0 to 70000]

Take Home Message
• Examine distributions:
  – Overall pattern
    • Shape
      – Symmetric or skewed
      – How many modes?
      – Bell-shaped
  – Outliers
• Graphical tools for quantitative data
  – Histograms
  – Time plots
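
Worked example: counting observations into classes
The histogram recipe from this lecture (equal-width intervals, then a count or percent per interval) can be sketched in a few lines of Python using the litter-size data from the example slide. This is only a minimal sketch: the function names `class_counts` and `print_histogram`, and the convention that each class includes its left endpoint but not its right, are assumptions made for illustration and are not part of the lecture.

```python
# Litter sizes copied from the "A study on litter size" slide (170 observations).
LITTER_SIZES = [
    4, 6, 5, 6, 7, 3, 6, 4, 4, 6, 4, 4, 9, 5, 10, 6, 6, 5, 6, 8,
    2, 7, 7, 7, 9, 3, 7, 5, 7, 7, 4, 5, 5, 6, 7, 6, 7, 8, 6, 6,
    7, 6, 6, 7, 5, 4, 5, 6, 6, 1, 3, 4, 7, 5, 4, 7, 5, 8, 8, 5,
    6, 8, 5, 5, 4, 9, 6, 7, 3, 7, 7, 5, 4, 6, 9, 6, 7, 7, 5, 7,
    3, 7, 6, 5, 3, 7, 10, 5, 6, 8, 7, 5, 5, 7, 5, 5, 8, 9, 7, 5,
    7, 5, 5, 5, 6, 3, 7, 8, 7, 7, 6, 3, 4, 4, 4, 7, 2, 7, 8, 5,
    8, 6, 6, 5, 6, 4, 7, 5, 5, 6, 9, 3, 5, 4, 8, 3, 9, 8, 3, 6,
    5, 4, 7, 8, 4, 8, 6, 8, 5, 6, 4, 3, 8, 8, 6, 9, 5, 5, 6, 6,
    7, 6, 8, 6, 11, 6, 5, 6, 6, 3,
]


def class_counts(values, low, width, n_classes):
    """Count observations in n_classes equal-width classes.

    Class k covers [low + k*width, low + (k+1)*width). The left-closed
    convention is a choice made for this sketch, not the lecture's.
    Values outside all classes are ignored.
    """
    counts = [0] * n_classes
    for v in values:
        k = int((v - low) // width)
        if 0 <= k < n_classes:
            counts[k] += 1
    return counts


def print_histogram(values, low, width, n_classes):
    """Print one row per class: interval, count, percent, and a bar of stars."""
    counts = class_counts(values, low, width, n_classes)
    total = len(values)
    for k, c in enumerate(counts):
        left = low + k * width
        percent = 100 * c / total
        # One star per two observations keeps the bars short.
        print(f"[{left:>2}, {left + width:>2})  {c:>3}  {percent:5.1f}%  {'*' * (c // 2)}")


if __name__ == "__main__":
    # One class per litter size, from 1 to 11.
    print_histogram(LITTER_SIZES, low=1, width=1, n_classes=11)
```

Calling `class_counts(LITTER_SIZES, 0, 5, 3)` groups the same data into classes of width 5 and gives 35, 132, and 3, the counts shown beside the three rows of the stem-and-leaf plot.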
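
Worked example: counting modal classes
The lecture describes modes as local peaks of a distribution and asks how many a histogram has. One rough way to make that concrete is to scan the class counts for bars taller than both neighbors. The sketch below assumes this strict rule, and the class counts in it are invented for illustration, not taken from any figure in the lecture. Real histograms have ties and small bumps, so the result still needs judgment.

```python
def modal_classes(counts):
    """Return the indices of classes whose count exceeds both neighbors.

    A missing neighbor at either end is treated as a count of 0.
    This strict rule is an assumption of the sketch.
    """
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < len(counts) - 1 else 0
        if c > left and c > right:
            peaks.append(i)
    return peaks


def describe_modes(counts):
    """Name the shape by its number of local peaks."""
    n = len(modal_classes(counts))
    return {1: "unimodal", 2: "bimodal"}.get(n, f"{n} modal classes")


# Hypothetical class counts, invented for illustration (not lecture data).
one_peak = [1, 4, 9, 5, 2]
two_peaks = [2, 9, 4, 3, 8, 1]

print(describe_modes(one_peak))
print(describe_modes(two_peaks))
```

Changing the number of classes can create or remove peaks, which is one reason the lecture asks for a sensible choice of classes before reading off the shape.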