UNC-Chapel Hill STOR 155 - Lecture 4- Displaying Distributions with Numbers (II)

1/25/11 Lecture 4

STOR 155 Introductory Statistics
Lecture 4: Displaying Distributions with Numbers (II)
The University of North Carolina at Chapel Hill

Numerical Summary for Distributions
• Center
  – Mean
  – Median
  – Mode
• Spread
  – Quartiles, IQR, five-number summary and boxplot
  – Standard deviation

Examples: 2004 Two-Seater Cars
• Highway mileages of the 21 two-seater cars:
  13 15 16 16 17 19 20 22 23 23 23 24 25 25 26 28 28 28 29 32 66
• Q1 = 18
• Q3 = 28
• IQR = Q3 - Q1 = 10
• 1.5*IQR = 15
• Q3 + 1.5*IQR = 43
• Q1 - 1.5*IQR = 3
• 66 is a suspected outlier.

The five-number summary
• To get a quick summary of both center and spread, use the following five-number summary:
  Minimum, Q1, M, Q3, Maximum

Example: HWY Gas Mileage of 2004 Two-seater/Mini Cars
• Two-seater
  – Five-number summary: 13, 18, 23, 27, 32
• Mini-compact
  – Five-number summary: 19, 23, 26, 29, 32

Boxplots
• A boxplot is a visual representation of the five-number summary.
• A boxplot consists of
  – a central box that spans the quartiles Q1 and Q3;
  – a line inside the box that marks the median M;
  – lines that extend from the box out to the smallest and largest observations.

[Figure: Boxplots of highway/city gas mileages (two-seaters/minicompacts)]

Pros and cons of Boxplots
• The location of the median line in the box indicates symmetry/asymmetry.
• Best used for side-by-side comparison of more than one distribution at a glance.
• Less detailed than histograms or stem plots.
• The box focuses attention on the central half of the data.

[Figure: Income for different Education Level]

Modified Boxplot
• The boxplot described so far cannot reveal possible outliers.
• To modify it,
  – the two lines extend out from the central box only to the smallest and largest observations that are not suspected outliers.
  – Observations more than 1.5*IQR outside the box are plotted as individual points.

[Figure: Call length (seconds)]

[Figure: HG for count in a given time interval]

Sample Variance $s^2$
• Deviation from mean: the difference between an observation and the sample mean: $x_i - \bar{x}$
• Sample variance $s^2$: the average of squares of the deviations of the observations from their mean
  $$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Sample Standard Deviation $s$
• Sample standard deviation $s$: the square root of the sample variance
  $$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Toy Examples
• Data: -2, -1, 0, 1, 2
• What is the sample variance and the standard deviation?
• How about this? 40, 40, 40, 40, 40

Remarks on the definition of Standard Deviation (S.D.)
• The sum of the deviations of the observations from their mean is always 0.
• Why “square the deviations” rather than “absolute deviations”?
  – Mean is a natural center under the “squaring”.
  – S.D. is a natural measure of spread for the normal distributions.

Remarks on S.D.
• Why “S.D.” rather than “variance”?
  – S.D. is natural for measuring spread for normal distributions.
  – S.D. is in the original scale.
• Why “n-1” rather than “n”?
  – Intuitively speaking, S.D. is not defined for n=1.
  – Sum of deviations is always 0, which means “if we know (n-1) of them, we know the last one”.
  – Only (n-1) deviations can change freely.
  – n-1: degrees of freedom.

Properties of the standard deviation (S.D.) s
• s measures the spread about the mean;
• s should be used only when the mean is chosen to measure the center;
• s=0 if and only if there is no spread;
  – When?
• s>0 almost always, increases with more spread;
• s, like the mean, is not resistant, i.e.
sensitive to outliers.

Examples: 2004 Two-seater Cars
Highway mileages of the 21 two-seater cars:
13 15 16 16 17 19 20 22 23 23 23 24 25 25 26 28 28 28 29 32 66
• Gasoline-powered cars
  – Mean: 22.6
  – S.D. = 5.3
• All cars
  – Mean: 24.7
  – S.D. = 10.8

Three measures of spread
• The range is the spread of all the observations;
• The interquartile range is the spread of (roughly) the middle 50% of the observations;
• S.D. is a measure of the distance from sample mean. S.D. can be regarded as a “typical” distance of the observations from their mean.

The five-number summary vs Mean and S.D.
• The five-number summary is preferred for a skewed distribution or a distribution with strong outliers.
• $\bar{x}$ and s are preferred for reasonably symmetric distributions that are free of outliers.
• Always plot your data first.
• Use boxplots.

Changing the unit of measurement
• The same variable can be recorded in different units of measurement.
• Distance:
  – Miles (US) vs Kilometers (Elsewhere)
  – 1 mile = 1.6 km
  – 1 km = ? mile
• Temperature
  – Fahrenheit (US) vs Celsius (Elsewhere)
  – 0 F = -17.8 C
  – 100 F = 37.8 C
  – 212 F = 100 C

Boiled Billy
• An Australian student Billy has recently been on a trip to the States. Soon after he arrived there, he caught a cold and had a fever.
• He went to see Doctor Z. Doctor Z measured his body temperature and told Billy, “Just relax! No big deal! It’s only a little above 100 degree!”
• “100!!!”, Billy yelled, “How can you say it’s not a big deal? I am boiled…”

Linear Transformation
• A linear transformation changes the original variable $x$ into a new variable $x_{new}$ according to the following equation,
  $$x_{new} = a + bx.$$
• Temperature: Celsius vs Fahrenheit
  – $x$ in Celsius, $x_{new}$ in Fahrenheit,
    $$x_{new} = 32 + \frac{9}{5}x.$$
  – How about the inverse transformation?

Effects of Linear Transformation
• The shape of a distribution remains unchanged, except that the direction of the skewness might change.
  – When?
• Measures of center and spread change.
  – Multiplying each observation by a positive number b multiplies both measures of center and spread by b;
  – Adding the same number a to each observation adds a to measures of center and to percentiles, but does not change measures of spread.

Example:

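
The sample variance and standard deviation formulas can also be checked by hand-rolled code. The sketch below is an illustration under the lecture's definitions (divide by n-1); the function names are hypothetical. It works through the two toy data sets and then the two-seater mileages with and without the value 66, which shows how sensitive s is to a single outlier.

```python
from math import sqrt

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_sd(xs):
    """Square root of the sample variance."""
    return sqrt(sample_variance(xs))

toy = [-2, -1, 0, 1, 2]
constant = [40, 40, 40, 40, 40]
print(sample_variance(toy), sample_sd(toy))            # variance 2.5, s about 1.58
print(sample_variance(constant), sample_sd(constant))  # no spread, so both are 0.0

# Two-seater highway mileages from the slides; 66 is the suspected outlier.
all_cars = [13, 15, 16, 16, 17, 19, 20, 22, 23, 23, 23,
            24, 25, 25, 26, 28, 28, 28, 29, 32, 66]
gasoline = all_cars[:-1]  # assumption: the gasoline-powered cars are all but the 66
print(round(sample_sd(gasoline), 1))  # 5.3
print(round(sample_sd(all_cars), 1))  # 10.8
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which use the same n-1 divisor, so in practice there is no need to write these functions by hand.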

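
The effects of a linear transformation on center and spread can be verified numerically. The sketch below uses the Celsius-to-Fahrenheit conversion from the lecture (a = 32, b = 9/5) and its inverse, x = (x_new - 32) * 5/9. The temperature readings are invented purely for illustration and do not come from the slides.

```python
from math import isclose
from statistics import mean, stdev

def to_fahrenheit(c):
    """Linear transformation x_new = 32 + (9/5) x."""
    return 32 + 9 * c / 5

def to_celsius(f):
    """Inverse transformation x = (x_new - 32) * 5/9."""
    return (f - 32) * 5 / 9

# Hypothetical body temperatures in Celsius (made-up data).
celsius = [36.5, 37.0, 37.8, 38.2, 39.0]
fahrenheit = [to_fahrenheit(c) for c in celsius]

# The mean is shifted by a and scaled by b; the S.D. is only scaled by b.
print(isclose(mean(fahrenheit), 32 + 9 * mean(celsius) / 5))  # True
print(isclose(stdev(fahrenheit), 9 * stdev(celsius) / 5))     # True

# Conversions quoted in the lecture.
print(round(to_celsius(0), 1))    # -17.8
print(round(to_celsius(100), 1))  # 37.8
print(to_celsius(212))            # 100.0
```

Adding a = 32 moves every observation by the same amount, so it changes the mean but leaves the deviations from the mean, and hence s, untouched. Only the multiplier b = 9/5 rescales the spread.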