UF STA 6166 - SUMMARIZING DATA TABLES AND GRAPHICS

Topic (3) SUMMARIZING DATA - TABLES AND GRAPHICS

A) Frequency Distributions For Samples

Defn: A FREQUENCY DISTRIBUTION is a tabular or graphical display of the frequencies (numbers of occurrences) of the values of a variable in the data set.

EXAMPLE
A DDT contamination study was done on a portion of the Tennessee River in 1980. The experiment involved sampling five fish at each of several locations along the river. For each fish, the variables measured were: location (miles upstream of the mouth of the river), species, length (centimeters), weight (grams), and DDT concentration in the fillet (ppm). Part of the data set follows.

Obs  Location  Species      Length  Weight  Concentration
  1     275    catfish        48.0     986       8.40
  2     275    catfish        45.0    1023      15.00
  3     275    catfish        49.0    1266      25.00
  4     280    catfish        51.0    1398      11.00
  5     280    catfish        44.0     917       5.50
  6     280    buffalofish    49.0    1763       4.50
  7     280    buffalofish    46.0    1459       4.20
  8     285    bass           28.5     778       0.48
  9     285    bass           26.0     532       1.18
 10     285    bass           25.5     441       2.34
 11     285    bass           25.0     544       3.11
 12     285    catfish        44.0     897       5.80

1) Summarizing Categorical Data

One-way Frequency Table

Species       Absolute Frequency   Relative Frequency
Catfish               6            6/12 = 0.500
Buffalofish           2            2/12 = 0.167
Bass                  4            4/12 = 0.333
TOTAL                12            1.00

Bar Chart
[Figure: bar chart of Count (vertical axis, 1 to 7) for each SPECIES (catfish, buffalofish, bass)]

2) Summarizing Continuous Data

Dot Plot: each dot represents one observation in the dataset
Example: Fish Lengths
[Figure: dot plot of the fish lengths; horizontal axis Length (cm), marked from 25 to 50 in steps of 5]

Stem-and-Leaf Plots: each observation is shown by its own digits, so the data values are part of the graphic
e.g. DDT (ppm) in 40 fish collected in the Tenn. R.
(stem = ones digits, leaf = tenths digit; for example, 8 | 03 stands for 8.0 and 8.3 ppm)

Stem  Leaf             Count
 12   1                  1
 11
 10
  9
  8   03                 2
  7   358                3
  6   267                3
  5   1246               4
  4   335688             6
  3   0012233466689     13
  2   01899              5
  1   359                3
  0

To construct a stem-and-leaf plot:
1) Find the max and min values in the dataset.
2) Decide which digits of a value are significant ("stem"), which are less important ("leaves"), and which do not provide much information (this part of the value is ignored or truncated).
e.g. Fish lengths: min = 25, max = 51, and the digits are X X . X

Frequency Tables and Histograms

We cannot always list every value of a quantitative variable: there may be too many distinct values, or the dataset may be too large. We still wish to summarize the data in some way. So we create groupings (intervals, bins, classes) and assign each observation to a grouping based on the value of its quantitative variable.

1) How many groupings or intervals (classes)?
c = [formula illegible in this copy; it involves 5 and n, where n = number of observations] as a rule of thumb, but adjust as needed.
e.g. for n = 40, use 7 or 8 classes or bins

2) How big is each class (bin)?
a) Classes should be equal-sized.
b) Choose a starting value slightly below the min value in the dataset and an ending value slightly above the max value in the dataset.
c) Size of each class = (ending value - starting value) / c, adjusted as needed (see next step).
e.g. DDT as shown in the stem-and-leaf plot ranges from 1.3 to 12.1 ppm with n = 40. Use the range from 0 to 14 (instead of the actual min and max values), since it divides nicely with c = 7: each class is (14 - 0) / 7 = 2 ppm wide.

3) Construct each class or grouping:

FREQUENCY TABLE

Grouping   Absolute Frequency   Relative Frequency
0-2                 4           4/40
>2-4               17           17/40
>4-6               10           10/40
>6-8                7           7/40
>8-10               1           1/40
>10-12              0           0
>12-14              1           1/40
TOTAL              40           1.00

Histogram (a graphical display of the frequency table): displays either the absolute or the relative frequency

[Figure: histogram "Distribution of DDT"; horizontal axis from 0 to 14 in steps of 2]

Quantiles
100.0%  maximum   12.100
 75.0%  quartile   5.550
 50.0%  median     3.850
 25.0%  quartile   3.025
  0.0%  minimum    1.300

Moments
Mean             4.5125
Std Dev          2.2111822
Std Err Mean     0.3496186
upper 95% Mean   5.2196704
lower 95% Mean   3.8053296
N                40

The histogram (i.e. the frequency distribution) or the stem-and-leaf plot plays an important role in statistical analysis. As a consequence, we spend a lot of time and effort describing these distributions. The descriptions include:

1) Shape of the distribution (skew, modality, symmetry, gaps, and outlying or other unusual data points)

[Figure: histogram of CONCENTR; horizontal axis 0.0 to 25.0, counts 0 to 6; Std. Dev = 6.98, Mean = 7.2, N = 12]

[Figure: histogram of X; horizontal axis 32.5 to 72.5, counts 0 to 14; Std. Dev = 9.09, Mean = 52.1, N = 99]

Distributions that follow a symmetric, unimodal shape with equal-sized tails and a specific curve between the mode and the tails have a special name: NORMAL DISTRIBUTION

[Figure: histogram of TIME; horizontal axis 40.0 to 95.0, counts 0 to 50; Std. Dev = 12.80, Mean = 71.0, N = 222]

2) Center of the Distribution (topic 4)

3) Spread of the Distribution (topic 5)

B) Frequency Distributions for Populations

Imagine you have the data for an entire population (a census) rather than just a sample from that population. N = population size >>> n = sample size. If we applied the rule of thumb for the number of bars to the whole population, we would get an extremely large number of bars.

[Figure: histograms of the same variable for n = 25, n = 250, and n = 2500]

As the number of bars grows, the tops of the bars approach a smooth line. This line is called the density curve of the population.

Another Example
[Figure: histograms for N = 40, N = 400, and N = 40000]
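
The one-way frequency table in section A, part 1 can be reproduced with a few lines of Python. This is only a minimal sketch: the `species` list is typed in from the 12 observations of the data table, and the helper name `one_way_table` is hypothetical, not something used in the course.

```python
from collections import Counter

# Species column for observations 1-12 of the data table in the notes.
species = [
    "catfish", "catfish", "catfish", "catfish", "catfish",
    "buffalofish", "buffalofish",
    "bass", "bass", "bass", "bass",
    "catfish",
]


def one_way_table(values):
    """Return {category: (absolute frequency, relative frequency)}."""
    n = len(values)
    return {cat: (count, count / n) for cat, count in Counter(values).items()}


table = one_way_table(species)

# Print one row per category, then the TOTAL row.
for cat, (count, rel) in table.items():
    print(f"{cat:<12} {count:>2}   {count}/{len(species)} = {rel:.3f}")
total_rel = sum(rel for _, rel in table.values())
print(f"{'TOTAL':<12} {len(species):>2}   {total_rel:.2f}")
```

The relative frequency is just the absolute frequency divided by the number of observations, so the relative frequencies always sum to 1.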

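Steps 2 and 3 of the frequency-table construction (equal-sized classes from 0 to 14 with c = 7) can be sketched the same way. The 40 DDT values below are an assumption: they are read off the stem-and-leaf display (stem = ones, leaf = tenths), a reading that matches the n = 40, minimum 1.3, maximum 12.1 and mean 4.5125 reported in the notes. The function name `frequency_table` is hypothetical. Values are kept as integer tenths of a ppm so that boundary values such as 2.0 and 8.0 compare exactly.

```python
# Stem-and-leaf display from the notes: stem = ones digits, leaf = tenths digit.
stem_leaf = {
    12: "1", 8: "03", 7: "358", 6: "267", 5: "1246",
    4: "335688", 3: "0012233466689", 2: "01899", 1: "359",
}

# DDT values in tenths of a ppm (integers), e.g. 8 | 3 -> 83 -> 8.3 ppm.
tenths = sorted(
    10 * stem + int(leaf) for stem, leaves in stem_leaf.items() for leaf in leaves
)


def frequency_table(values_tenths, start, end, c):
    """Count values in c equal-sized classes between start and end (in ppm).

    The first class includes both endpoints; every later class excludes its
    lower endpoint and includes its upper one, as in '>2-4'.
    Assumes (end - start) * 10 is divisible by c.
    """
    width = (end - start) * 10 // c  # class width in tenths of a ppm
    rows = []
    for k in range(c):
        lo = start * 10 + k * width
        hi = lo + width
        count = sum(
            1 for v in values_tenths if lo < v <= hi or (k == 0 and v == lo)
        )
        label = f"{'' if k == 0 else '>'}{lo // 10}-{hi // 10}"
        rows.append((label, count))
    return rows


rows = frequency_table(tenths, start=0, end=14, c=7)
n = len(tenths)
for label, count in rows:
    print(f"{label:<8} {count:>3}   {count}/{n}")
print(f"{'TOTAL':<8} {n:>3}")

mean_ppm = sum(tenths) / 10 / n  # sample mean in ppm
```

Under this reading of the display, the class counts come out as 4, 17, 10, 7, 1, 0, 1, which agrees with the FREQUENCY TABLE in the notes.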
