GRINNELL MAT 209 - Chapter 1.1 Displaying Distributions with Graphs


DATA: numbers with a context
STATISTICS: the process of collecting, organizing, and drawing conclusions from data.

Chapter 1.1 Displaying Distributions with Graphs

INDIVIDUAL / ELEMENT / ITEM / OBSERVATIONAL UNIT: objects described by a set of data (person, place, thing, animal)
--should have a unique identifier
VARIABLE: any characteristic of an individual / item
--There can be many different variables for each individual

Each row represents an individual while each column lists a characteristic of that individual

Card #     Season     Price  Foil  Production Site  Sales Rating
129cadmds  Christmas  1.99   Yes   Topeka           2
329hrdfma  Birthday   2.99   Yes   Lawrence         1
199hcmsad  Everyday   2.45   No    Topeka           6

CATEGORICAL DATA places individuals into one of several groups (red, blue, white, yes, no, etc.)
QUANTITATIVE DATA takes numerical values for which arithmetic operations make sense
DISTRIBUTION: the pattern of variation of a variable.
It tells us what values the variable takes and how often it takes these values.
Exploratory Data Analysis: examining data in order to determine its main features

Graphs for Categorical Variables Univariate (single variable) Data
Bar Chart: create horizontal or vertical bars on an xy axis
Pie Chart: need to include all categories; Angle = percent*360 / 100
Beware of Pictograms

Graphs for Quantitative (Univariate Data)

Histograms
Divide the range into classes of equal width (5 to 10 classes)
Count the individuals in each class (or use percents): frequency (relative frequency)
Draw the bars: bars need to be vertical and connected on the horizontal axis, unless a class is empty

Stem and Leaf Plots {sort data first}
Separate data into a stem (all but the final digit) and a leaf (rightmost digit)
Write the stems from smallest to largest
Write the leaves in increasing order out from the stem, evenly spaced!

# of Yankees home runs for Babe Ruth from 1920 to 1934
54 59 35 41 46 25 47 60 54 46 49 46 41 34 22
Sorted: 22 25 34 35 41 41 46 46 46 47 49 54 54 59 60
Yankees' Roger Maris home runs
8 13 14 16 23 26 28 33 39 61

Time Plots: time is always on the horizontal axis

1) Plot your data
2) Overall pattern (trend) {unimodal, bimodal}: look for Shape, Center, and Spread {symmetry and skewness}
3) Deviations from the overall pattern
Outlier: an individual observation that falls outside the overall pattern on the graph

Chapter 1.2 – Describing Distributions with Numbers

Median and Quartiles
1) Arrange all observations in order
2) Location of the median = (n+1)/2
   If n is odd, the median is the center observation
   If n is even, average the 2 center observations
3) Q1 = median of all observations left of the median
4) Q3 = median of all observations right of the median

5 Number Summary: smallest, Q1, Median, Q3, Largest
Babe Ruth: 22 25 34 35 41 41 46 46 46 47 49 54 54 59 60
Roger Maris: 8 13 14 16 23 26 28 33 39 61
# of Home Runs McGwire: 9 9 22 32 33 39 39 42 49 52 58 65 70

Interquartile range {IQR = Q3 - Q1}

Boxplot
1) Use Q1 and Q3 to draw a box
2) Mark the median
3) Draw fences at Q1 - 1.5*IQR and Q3 + 1.5*IQR
4) Line extends to the smallest and largest number within the fences
5) Dots represent outliers

2000 LA Lakers salaries (in millions)
0.3 0.7 0.8 1.0 1.0 2.0 2.1 3.1 4.3 4.5 5.0 11.8 17.1
Find the 5 number summary
Draw a boxplot.
Find the mean and variance
Is the data skewed left or right?

Mean and standard deviation
xbar = sum of x / n
Var = sum((x - xbar)^2) / (n - 1) {sample vs population variance}
Std dev = sqrt of Var; measures spread (s = 0 means no spread)

          mean    Var      Std dev
McGwire   39.92   369.9    19.23
Maris     26.1    243.65   15.61
Ruth      43.93   126.495  11.25

Chapter 1.3 – The Normal Distribution

Can replace the frequency histogram with a smooth curve
--very large sample sizes
--smaller and smaller class sizes
--this finds the likelihood of finding any range of values

Density Curves: describe the overall pattern of a distribution
The {relative} frequency of every class is greater than or equal to 0
The area underneath the curve {bars} is exactly 1 (exactly 100%).
Median: cuts the data in half; Mean: balance point
Mode: the peak point
symmetric then mean = median
skewed right then mean > median
(if Shaquille and Kobe only made 5 million, then mean = 2.79)

μ and σ completely describe the Normal distribution (for any x value put into the following equation, you can calculate y)
y = (1 / (σ * sqrt(2π))) * e^(-(1/2) * ((x - μ)/σ)^2)

Cannot always assume the Normal distribution, even though
1) it describes many real data sets
2) it approximates certain chance outcomes
3) many statistical inference procedures are based on the Normal distribution

μ +/- σ: 68%
μ +/- 2σ: 95%
μ +/- 3σ: 99.7%

On the same graph plot:
N(3,2) {the Normal distribution with a mean of 3 and standard deviation of 2} and N(5,1) {the Normal distribution with a mean of 5 and standard deviation of 1}

Standard Score: Z = (X - μ) / σ
SAT ~ N(500, 100)
P(X > 600) = P(Z > 1) = 1 - .8413 = .1587
P(X < 600) = .8413
What is the 90th percentile?
P((X - μ)/σ < c) = .9, so c = 1.28
Then 1.28*100 + 500 = 628

ACT ~ N(18, 6)
1) Find P(X > 21)
2) Find P(X < 16)
3) Find the 90th Percentile
4) Find the 80th Percentile
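
The hand calculations in these notes can be checked with a short Python sketch. It is only an illustration, not part of the course materials: the helper names five_number_summary and fences are hypothetical, the quartiles use the median-of-halves rule from Chapter 1.2 (other software may use a different quartile rule), and the Normal calculations rely on NormalDist from Python's standard statistics module (Python 3.8 or later) in place of a Z table.

```python
from statistics import NormalDist, mean, median, stdev, variance


def five_number_summary(data):
    """Return (smallest, Q1, median, Q3, largest) via the median-of-halves rule.

    Hypothetical helper; assumes at least 2 observations.
    """
    xs = sorted(data)
    n = len(xs)
    lower = xs[: n // 2]           # observations left of the median
    upper = xs[n // 2 + n % 2:]    # observations right of the median
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


def fences(q1, q3):
    """Return the boxplot fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Roger Maris home runs, as listed in the notes
maris = [8, 13, 14, 16, 23, 26, 28, 33, 39, 61]
smallest, q1, med, q3, largest = five_number_summary(maris)
low_fence, high_fence = fences(q1, q3)
outliers = [x for x in maris if x < low_fence or x > high_fence]

print(five_number_summary(maris))   # (8, 14, 24.5, 33, 61)
print(low_fence, high_fence)        # -14.5 61.5
print(outliers)                     # []

# statistics.variance and statistics.stdev divide by n - 1 (sample versions)
print(round(mean(maris), 2), round(variance(maris), 2), round(stdev(maris), 2))

# SAT ~ N(500, 100): tail probability and 90th percentile
sat = NormalDist(mu=500, sigma=100)
p_above_600 = 1 - sat.cdf(600)
pct_90 = sat.inv_cdf(0.90)
print(round(p_above_600, 4), round(pct_90))   # 0.1587 628
```

Under this quartile rule, Maris's 61 falls just inside the upper fence of 61.5, so it is not flagged as an outlier. The same two objects, NormalDist(mu, sigma) with cdf and inv_cdf, can be reused to check Normal probabilities and percentiles for other means and standard deviations.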

