Univariate Descriptive Statistics
Chapter 2

Lecture Overview
– Tabular and Graphical Techniques
– Distributions
– Measures of Central Tendency
– Measures of Dispersion

Tabular and Graphical Techniques
– Frequency Tables
  – Ungrouped
  – Grouped
– Histograms
– Cumulative Frequency Histogram

Frequency Tables

Bin    Frequency
170    3
180    7
190    8
200    9
210    12
220    6
230    6
240    4
250    2
260    3

Histograms
Note: sometimes percent is on the Y axis rather than frequency.

Cumulative Frequency Histograms

Key Concepts
Choosing Intervals (i.e., choosing your “bins”)
– Rules from the textbook (pages 38 – 39)
– Commonly Used Examples from GIS
  – Equal Interval
  – Quantiles (e.g., quartiles and quintiles)
  – Natural Breaks
  – Standard Deviation

Rules For Bin Sizes
Note: This is very relevant for GIS.
Rule 1: Use intervals with simple bounds
Rule 2: Respect natural breakpoints
Rule 3: Intervals should not overlap
Rule 4: Intervals should be the same width
Rule 5: Select an appropriate number of classes

The Effect of Classification
Equal Interval
– Splits data into a user-specified number of classes of equal width
– Classes generally contain different numbers of observations

The Effect of Classification
Quantiles
– Data are divided so that there is an equal number of observations in each class
– Some classes can have quite narrow intervals

The Effect of Classification
Natural Breaks
– Splits data into classes based on natural breaks represented in the data histogram

The Effect of Classification
Standard Deviation
– Mean plus or minus standard deviation(s)

Key Concepts
Making sense of your histograms using distributions
– Rectangular
– Unimodal
– Bimodal
– Multimodal
– Skew (positive and negative)

Bimodal Distribution

Multimodal Distribution

Skew
An asymmetrical distribution

Measures of Central Tendency
Measures of central tendency
– Measures of the location of the middle or the center of a distribution
– Mean, median, mode, midrange

Definitions
– Midrange
– Mode
– Median
  – Quantiles
– Mean

Definitions
– Sample Mean
– Population Mean

Description of Mean
– Mean: the most commonly used measure of central tendency
– Average of all observations
– The sum of all the scores divided by the number of scores
– Note: this assumes that each observation is equally significant

Symbols
n : the number of observations
N : the number of elements in the whole population
Σ : this (capital sigma) is the symbol for sum
i : the index of summation; the value written below Σ (e.g., i = 1) is the starting point of a series of numbers
X : one element in our dataset, usually has a subscript (e.g., i, min, max)
$\bar{x}$ : the sample mean
$\mu$ : the population mean

Summation Notation: Components

$\sum_{i=1}^{n} x_i$

– $\Sigma$ indicates we are taking a sum
– $i = 1$ refers to where the sum of terms begins
– $n$ refers to where the sum of terms ends
– $x_i$ indicates what we are summing up

Mathematical Notation of Mean
The mathematical notation used most often in this course is the summation notation. The Greek letter capital sigma is used as a shorthand way of indicating that a sum is to be taken:

$\sum_{i=1}^{n} x_i$

The expression is equivalent to:

$x_1 + x_2 + \dots + x_n$

Summation Notation: Simplification
A summation will often be written leaving out the upper and/or lower limits of the summation, assuming that all of the terms available are to be summed:

$\sum_{i=1}^{n} x_i = \sum_{i} x_i = \sum x_i$

Equation for Mean

Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

Example Mean Calculations
Example I
– Data: 8, 4, 2, 6, 10

$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{8 + 4 + 2 + 6 + 10}{5} = 6$

Example II
– Sample: 10 trees randomly selected from Battle Park
– Diameter (inches): 9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5

$\bar{x} = \frac{\sum_{i=1}^{10} x_i}{10} = \frac{9.8 + 10.2 + \dots + 24.5}{10} = 14.38$

Example Mean Calculations
Example III
Monthly mean temperature (°F) at Chapel Hill, NC (2001).
Annual mean temperature (°F): $\bar{x} = 59.70$

Examples IV & V
Chapel Hill, NC (1972-2001)
– Mean annual temperature (°F): Mean = 58.51 °F
– Mean annual precipitation (mm): Mean = 1198.10 mm

Explanation of Mean
Advantage
– Sensitive to any change in the value of any observation
Disadvantage
– Very sensitive to outliers

#     Tree Height (m)
1     5.0
2     6.0
3     7.5
4     8.0
5     4.8
6     5.3
7     7.1
8     25.4
9     7.5
10    4.5

Mean = 6.19 m without #8
Mean = 8.11 m with #8

Measures of Dispersion
– Used to describe the data dispersion/spread/variation/deviation numerically
– Usually used in conjunction with measures of central tendency

Measures of variation
[Figure: two distributions, one with low variation and one with high variation; axes: score, # of obs]
Groups have equal means and equal n, but one varies more than the other.

Definitions
– Range
– Mean Deviation
– Variance
– Standard Deviation
– Coefficient of Variation
– Pearson’s

Symbols
$s^2$ : the sample variance
$\sigma^2$ : the population variance
$s$ : the sample standard deviation
$\sigma$ : the population standard deviation

Sample Variance and Standard Deviation

Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Standard Deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Note: as with the mean, there are both sample and population standard deviations and variances.

Next Class
– Read chapter 3
– Work on the homework
– Come with questions
– Bring your
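
The sample mean, sample variance, and sample standard deviation defined in these slides can be checked numerically. The following is a minimal Python sketch, not part of the original lecture: the function names (sample_mean, sample_variance, sample_std) are hypothetical helpers introduced here, and the data are the Example I values, the Battle Park tree diameters, and the ten tree heights quoted in the slides.

```python
import math

def sample_mean(xs):
    """Sum of all observations divided by the number of observations n."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

# Data quoted in the slides.
example_1 = [8, 4, 2, 6, 10]
diameters = [9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5]
heights = [5.0, 6.0, 7.5, 8.0, 4.8, 5.3, 7.1, 25.4, 7.5, 4.5]

# Drop tree #8 (index 7), the 25.4 m outlier, to see its effect on the mean.
heights_without_8 = heights[:7] + heights[8:]

if __name__ == "__main__":
    print("Example I mean:", sample_mean(example_1))
    print("Example II mean:", round(sample_mean(diameters), 2))
    print("Heights mean without #8:", round(sample_mean(heights_without_8), 2))
    print("Heights mean with #8:", round(sample_mean(heights), 2))
    print("Example I variance:", sample_variance(example_1))
    print("Example I std dev:", round(sample_std(example_1), 3))
```

Comparing the two tree-height means shows how a single outlier pulls the mean upward, which is the disadvantage noted in the slides.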

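The frequency table and two of the classification schemes named in the slides (equal interval and quantiles) can also be sketched in code. This is an illustrative Python example under stated assumptions: the helper name equal_interval_breaks is hypothetical, the choice of four classes is arbitrary, and statistics.quantiles uses the standard library's default "exclusive" method, which may differ from the rule a given GIS package applies.

```python
from itertools import accumulate
from statistics import quantiles

# Frequency table from the slides: bin label -> frequency.
bins = [170, 180, 190, 200, 210, 220, 230, 240, 250, 260]
freqs = [3, 7, 8, 9, 12, 6, 6, 4, 2, 3]

# Cumulative frequencies are the running total of the frequencies.
cumulative = list(accumulate(freqs))
total = cumulative[-1]
# Percentages, for a histogram with percent on the Y axis.
percent = [100 * f / total for f in freqs]

def equal_interval_breaks(xs, k):
    """Return k + 1 class boundaries that split [min, max] into k equal widths."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]

# Tree diameters (inches) from Example II, classified two ways.
diameters = [9.8, 10.2, 10.1, 14.5, 17.5, 13.9, 20.0, 15.5, 7.8, 24.5]
equal_breaks = equal_interval_breaks(diameters, 4)  # equal-width classes
quartile_cuts = quantiles(diameters, n=4)           # 3 interior cut points

if __name__ == "__main__":
    for b, f, c in zip(bins, freqs, cumulative):
        print(b, f, c)
    print("Equal-interval breaks:", equal_breaks)
    print("Quartile cut points:", quartile_cuts)
```

Equal-interval breaks are evenly spaced but leave unequal counts per class, while quartile cut points balance the counts at the cost of uneven class widths.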

UNC-Chapel Hill GEOG 391 - Univariate Descriptive Statistics
