UGA STAT 4210 - Chapter 2

Chapter 2: Exploring Data with Graphs and Numerical Summaries

Different types of observations yield different scales of measurement, leading to different types of variables. There are four scales of measurement, and the scale of measurement (or type of variable) determines which descriptive statistics we can use to summarize our sample.

1. Nominal scale data differ in name only; they are purely classification data.
2. Ordinal scale data are ordered groups, where the order has meaning but there is no measure of the difference between the groups.
3. Interval scale data are numeric measurements where the degree of difference between items is known, but the ratio is not.
4. Ratio scale data are numeric measurements that are estimates of the ratio between a magnitude and a unit magnitude.

Nominal and ordinal scale data are both categorical data, so they can be plotted with pie charts and bar charts. Appropriate descriptive statistics for categorical data are frequency, proportion, and mode.

Interval and ratio scale data are both quantitative data; they can be represented graphically with histograms, stem-and-leaf plots, dot plots, and box plots. Appropriate descriptive statistics for quantitative data are mean, median, standard deviation, range, and interquartile range.

Example: What kind of data are the following (categorical or quantitative)?
- time for pain reliever to kick in (minutes)
- sneaker preference
- college GPA [0, 4]
- area codes
- outcome of a coin toss
- the number of heads in 10 coin tosses (0, 1, 2, …, 10)
- outcome of the roll of a die

Measures of Center

The mean, \( \bar{x} = \frac{\sum x_i}{n} \), is the arithmetic average. It is a function of all observations in the dataset, so very large (or very small) values are included and influence its value. It is, therefore, a non-resistant statistic.

The median, \( \tilde{x} \), is the halfway point, or the middle value, in the set of all ordered values.
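
As a minimal sketch of the two measures of center defined above, the Python snippet below compares the mean and median of a small made-up set of scores before and after one extreme value is added. The data and variable names are illustrative assumptions, not part of the course notes.

```python
from statistics import mean, median

# Hypothetical scores (illustrative data only).
scores = [92, 95, 88, 90, 94]

# The same scores with one extreme low value added.
scores_with_outlier = scores + [20]

mean_before = mean(scores)
median_before = median(scores)
mean_after = mean(scores_with_outlier)
median_after = median(scores_with_outlier)

print("mean:  ", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
```

In this sample the single low score pulls the mean down by about 12 points, while the median moves by only 1.
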
The median's position is determined by the number of observations and their order, so the magnitudes of extreme values do not influence it. It is, therefore, a resistant statistic.

Example: Median and Mean Income (2013), Survey of Consumer Finances

All Families Surveyed:

Median ($)   Mean ($)
46,700       87,200

By Geographical Region:

Region      Median ($)   Mean ($)
Northeast   58,300       107,200
Midwest     44,200       75,200
South       42,600       77,900
West        50,700       98,300

The mean is the center of mass, like the fulcrum on a teeter-totter. If there are outliers, or extreme observations, the mean follows them to maintain equilibrium. The mean “chases the tail” of a distribution.

Aside: If you have ever had the unfortunate experience in a class of doing really well on all homework assignments (or tests, or labs) except one, you’ve experienced the non-resistance of the mean firsthand. That one poor grade pulled down your homework (test, lab) average, despite your good marks on the others, because the mean is not a resistant statistic. The bigger the outlier, the bigger its influence.

Measures of Spread (Dispersion)

Besides being interested in where the data are centered, as the mean and median tell us for quantitative data, we are also interested in how different (or varied) the data are. There are several ways of measuring this.

The range (Range = max - min) tells us simply the width of the data.

The standard deviation,

\[ s = \sqrt{s^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{\sum x_i^2 - n\bar{x}^2}{n-1}} = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n-1}}, \]

is the square root of the variance, and is the typical difference between an observation and the mean.

Standardized Values

When one combines observations with information about a dataset’s center (mean) and spread (standard deviation), one can get standardized values for those observations.
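
The spread measures above feed directly into standardized values, so it can help to verify them numerically first. The sketch below, on a made-up sample, computes the range and checks that the definitional and shortcut forms of the sample standard deviation agree with each other and with Python's `statistics.stdev`. The function names `sd_definition` and `sd_shortcut` and the data are assumptions for illustration.

```python
import math
from statistics import stdev

def sd_definition(xs):
    """Sample standard deviation from squared deviations about the mean."""
    n = len(xs)
    xbar = sum(xs) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))

def sd_shortcut(xs):
    """Sample standard deviation from the sum and the sum of squares."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return math.sqrt((total_sq - total ** 2 / n) / (n - 1))

# Hypothetical sample (illustrative data only).
data = [2, 4, 4, 5, 7, 8]

data_range = max(data) - min(data)
print("range:", data_range)
print("s (definition):", round(sd_definition(data), 4))
print("s (shortcut):  ", round(sd_shortcut(data), 4))
print("s (statistics):", round(stdev(data), 4))
```

For this sample the squared deviations sum to 24, so the variance is 24 / 5 = 4.8 and all three computations give a standard deviation of about 2.19. The shortcut form can lose precision when the values are large relative to their spread, so the definitional form is usually the safer choice in code.
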
Standardized values allow for comparisons across different populations and for identification of outliers.

Definition: a standardized value (z) for an observation is the number of standard deviations away from the mean the observation falls.

\[ z_i = \frac{x_i - \bar{x}}{s} \]

Standardized values, as the name implies, put all measures on a standard scale.

Example: Apples cost, on average, $1.50/lb, with a standard deviation of $0.10/lb. This week they are on sale for $1.25/lb. Oranges have an average price of $2.70/lb, with a standard deviation of $0.20/lb. This week they are on sale for $2.30/lb.

Which is a better deal?

Standardized values and the Empirical Rule

Observations with standardized values > 3 or < -3 are more than 3 standard deviations away from the mean. Under the Empirical Rule, only about 0.3% of observations in a bell-shaped distribution fall that far from the mean, so such values are very unlikely. Thus, they are considered outliers.
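
One way to work the apples and oranges example above is to standardize each sale price and compare the results. The sketch below assumes that the "better deal" is the price furthest below its own average in standard-deviation units; the helper names `z_score` and `is_unusual` are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def is_unusual(z, cutoff=3.0):
    """Flag values more than `cutoff` standard deviations from the mean."""
    return abs(z) > cutoff

# Prices in dollars per pound, taken from the example in the notes.
z_apples = z_score(1.25, mean=1.50, sd=0.10)
z_oranges = z_score(2.30, mean=2.70, sd=0.20)

# Assumption: the more negative z-score is the better deal.
better_deal = "apples" if z_apples < z_oranges else "oranges"

print("z (apples): ", round(z_apples, 2))
print("z (oranges):", round(z_oranges, 2))
print("better deal:", better_deal)
print("beyond 3 SDs:", is_unusual(z_apples), is_unusual(z_oranges))
```

The apple price is 2.5 standard deviations below its average and the orange price is 2 below, so under this reading the apples are the better deal even though the oranges have the larger discount in dollars. Neither sale price is more than 3 standard deviations from its mean.
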

