
Introductory Statistics Lectures
Measures of Center
Descriptive Statistics II

Anthony Tanbakuchi
Department of Mathematics
Pima Community College
MAT167

Redistribution of this material is prohibited without written permission of the author
© 2009
(Compile date: Tue May 19 14:48:21 2009)

Contents
1 Measures of Center
  1.1 Introduction
      Notation
  1.2 Three key measures of center
      Mean
      Median
      Mode
  1.3 Other measures of center
  1.4 Summary
  1.5 Additional examples
  1.6 Defining Functions In R: Optional Knowledge

1 Measures of Center

R tip. Tab completion: type the first few letters of a variable or function's name, press Tab, and R will complete it.

1.1 Introduction

Measures of center

Robert Pershing Wadlow (February 22, 1918 - July 15, 1940) is the tallest person in medical history for whom there is irrefutable evidence. He is often known as the "Alton Giant" because of his Alton, Illinois hometown.

Wadlow reached an unprecedented 8 feet 11.09 inches (2.72 m) in height and weighed 440 pounds (199 kg) at his death. His great size and his continued growth in adulthood were due to hypertrophy of his pituitary gland, which results in an abnormally high level of human growth hormone. He showed no indication of an end to his growth even at the time of his death.

[Figure: Robert Wadlow compared to his father, Harold Franklin Wadlow.]

Make a new variable height.skewed where Wadlow (97 inches) is added to our class.

R: load("ClassData.RData")
R: height = class.data$height
R: height.skewed = c(height, 97)
R: par(mfrow = c(1, 2))
R: hist(height)
R: hist(height.skewed)

[Figure: two histograms, "Histogram of height" and "Histogram of height.skewed" (x-axis: height and height.skewed; y-axis: Frequency).]

Definition 1.1 (Skewed distribution). A distribution whose left and right tails are not symmetrical.
  positive: longer right tail.
  negative: longer left tail.

[Figure: negative (left) skewed, symmetrical, and positive (right) skewed distributions.]

NOTATION

Summation operator: the $\sum$ operator

A compact notation for summation:

\begin{align}
\text{total} &= \sum_{i=1}^{n} x_i \tag{1} \\
             &= x_1 + x_2 + \cdots + x_n \tag{2}
\end{align}

R Command. Summation: sum(x)
  Where x is a vector.

Mathematical Notation

                   population                          sample
data set           $x = \{x_1, x_2, \ldots, x_N\}$     $x = \{x_1, x_2, \ldots, x_n\}$
size               $N$                                 $n$
sum of data set    $\sum_{i=1}^{N} x_i$                $\sum_{i=1}^{n} x_i$

freq dist (k classes): $f_i = \{f_1, f_2, \ldots, f_k\}$
prop freq dist: $n = \sum_{i=1}^{k} f_i$

R Notation

                   population            sample
data set           x=c(x1, x2, ...)      x=c(x1, x2, ...)
size               N=length(x)           n=length(x)
sum of data set    sum(x)                sum(x)

freq dist (k classes): f=c(f1, f2, ...)
prop freq dist: n=sum(f)

Measures of center and effect of outliers

How is each of the measures of center affected by outliers?

1.2 Three key measures of center

MEAN

Definition 1.2 (Mean: $\mu$, $\bar{x}$). The arithmetic mean is the sum of the data set divided by the number of values.

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} \quad \text{(parameter: population mean)}$$

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{(statistic: sample mean)}$$

The balance point on a distribution.

How susceptible is the mean to outliers?

Question 1. Given that x = {9, 5, 6, 4} find the mean of x.

R Command. Length of a vector: length(x)
  Returns the number of elements in the vector x.

Example 1. Find the mean of the student heights in R "manually".

R: n = length(height)
R: n
[1] 18
R: sum(height)/n
[1] 67.611

We can write this more compactly:

R: sum(height)/length(height)
[1] 67.611

Making your own mean function in R

Create a new function called my.mean(x) that takes one vector as an argument:

R: my.mean = function(x) {
+     sum(x)/length(x)
+ }

Now use my.mean(x) to find the mean of the student heights.

R: my.mean(height)
[1] 67.611

R Command. Mean: mean(x, trim=0)
  Find the mean of the vector x. If you set the optional argument trim=0.1 it will trim the top and bottom 10% of the data points before finding the mean.

Example 2. Given that x = {9, 5, 6, 4} find the mean of x. This is easy in R!

R: x = c(9, 5, 6, 4)
R: mean(x)
[1] 6

Example 3. Effect of skewed data:

R: mean(height)
[1] 67.611
R: mean(height.skewed)
[1] 69.158

Example 4. Trimmed mean

R: mean(height.skewed, trim = 0.1)
[1] 67.941

Mean from a frequency distribution

If you don't have the original data but you have a frequency distribution table or histogram:

\begin{align}
\bar{x} &\approx \frac{\sum_{i=1}^{k} f_i \cdot \bar{x}_i}{n} \tag{3} \\
        &= \frac{f_1 \cdot \bar{x}_1 + f_2 \cdot \bar{x}_2 + \cdots + f_k \cdot \bar{x}_k}{n} \tag{4}
\end{align}

  $f_i$: class frequency (count), k classes
  $\bar{x}_i$: class midpoint

Why is this only an approximation?

Given a frequency distribution table of student height data (footnote 1):

   Class     Midpoints  Frequency
1  [62,64)   63.00      3
2  [64,66)   65.00      3
3  [66,68)   67.00      2
4  [68,70)   69.00      5
5  [70,72)   71.00      3
6  [72,74)   73.00      1
7  [74,76)   75.00      0

Footnote 1: Recall that $[a, b)$ means $a \le x < b$.

Example 5. Approximation of mean from frequency distribution:

R: f
[1] 3 3 2 5 3 1 0
R: midpoints
[1] 63 65 67 69 71 73 75
R: n
[1] 17
R: x.bar = sum(f * midpoints)/n
R: x.bar
[1] 67.588

Compare our approximate mean above with the true mean of 67.6111111111111.

MEDIAN

Definition 1.3 (Median: $\tilde{x}$). The middle value of a sorted data set:
1. Sort the values.
2. The median value is $x_i$ where $i = \frac{n+1}{2}$. If $i$ is not an integer, average the two neighboring values.

Breaks frequency distribution into two equal areas.

R Command. Median: median(x)
  Where x is a vector.

Example 6. Given that x = {9, 5, 6, 4} find the median of x. This is easy in R!

R: x = c(9, 5, 6, 4)
R: median(x)
[1] 5.5

Question 2. Given that x = {2, 5, 6, 10, 11} find the median of x.

Question 3. Given that x = {2, 5, 6, 10, 11, 14} find the median of x.

Example 7. Effect of skewed data on median:

R: median(height)
[1] 68
R: median(height.skewed)
[1] 68

Question 4. Is the median more or less resistant to outliers …
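One way to explore Question 4 and the frequency-distribution formula is the minimal sketch below, which is not part of the original notes. It reuses x = {9, 5, 6, 4} and the frequency table from Example 5. The name x.skewed and the choice to append the value 97 to x are assumptions made here only for illustration, since the class data file is not available.

```r
# Small data set from Examples 2 and 6.
x <- c(9, 5, 6, 4)

# Assumption for illustration: append one large outlier (97, the value
# used for Wadlow in the notes) to see how each measure responds.
x.skewed <- c(x, 97)

# Mean before and after adding the outlier.
mean.before <- mean(x)
mean.after  <- mean(x.skewed)

# Median before and after adding the outlier.
median.before <- median(x)
median.after  <- median(x.skewed)

print(c(mean.before, mean.after))
print(c(median.before, median.after))

# Approximate mean from the frequency table in Example 5:
# sum of (class frequency * class midpoint) divided by n.
f <- c(3, 3, 2, 5, 3, 1, 0)
midpoints <- c(63, 65, 67, 69, 71, 73, 75)
n <- sum(f)                      # n is the sum of the class frequencies
x.bar <- sum(f * midpoints) / n
print(x.bar)
```

With this data the mean shifts from 6 to 24.2 when the outlier is added, while the median moves only from 5.5 to 6. The last part recomputes n from the table as sum(f) rather than typing 17 by hand, so the approximation stays consistent if the frequencies change.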

