UVA STAT 2120 - Topic_01

This preview shows page 1-2-3-21-22-23-42-43-44 out of 44 pages.


Chapter 1
Examining Distributions - Introduction

Variables
- A variable records characteristics of individuals/cases (i.e., objects of interest) in its values.
- A variable's distribution describes the counts or relative proportions of its values.

Section 1.1
Examining Distributions - Describing Distributions with Graphs

Some graphical statistics
- Bar graphs and pie charts describe the distribution of a categorical variable.
- A Pareto chart is a bar graph with categories ordered by decreasing frequency.
- Histograms are essentially bar graphs of a quantitative variable.
- Stemplots are back-of-the-envelope histograms drawn with the digits of quantitative values.
- Time plots graph time series values by time.

Histograms
- Use equal bar-widths and "eyeball" for best picture.
[Figure: histogram of December 2004 state unemployment rates]

Interpreting histograms
- Look for shape, center, and spread.
- When a histogram shows too much detail, visualize a smooth curve highlighting the overall pattern.

Distribution shapes
- Symmetric distribution
- Right-skewed distribution
- Complex, multimodal distribution

Interpreting histograms
- Look for deviations, like outliers.
[Figure: histogram with Alaska and Florida labeled]

Stemplot
[Figure: stemplot (stems and leaves) and split-stem version of the December 2004 state unemployment rates]

Section 1.2
Examining Distributions - Describing Distributions with Numbers

Measure of center: the mean
Heights (in.) of 25 women:
  58.2  59.5  60.7  60.9  61.9  61.9  62.2  62.2  62.4  62.9
  63.9  63.1  63.9  64.0  64.5  64.1  64.8  65.2  65.7  66.2
  66.7  67.1  67.8  68.9  69.6

Measure of center: the median
Step 1: Sort x1, ..., xn.
Step 2.a: If n is odd, M = middle value.
Step 2.b: If n is even, M = avg. of two middle values.
Data (n = 25, sorted):
  0.6  1.2  1.5  1.6  1.9  2.1  2.3  2.3  2.5  2.8  2.9  3.3  3.4
  3.6  3.7  3.8  3.9  4.1  4.2  4.5  4.7  4.9  5.3  5.6  6.1
- All 25 values (n odd): M = 3.4, the 13th value.
- First 24 values (n even): M = (3.3 + 3.4)/2 = 3.35.

Comparisons
[Figure: mean and median under symmetry, left skew, and right skew]
Observe:
- The mean is "pulled" by outliers.
- The median is resistant to outliers.

Measure of spread: the quartiles
- The first quartile, Q1, is the median of values below M.
- The third quartile, Q3, is the median of values above M.
For the 25 values above: Q1 = 2.2, M = 3.4, Q3 = 4.35.

Five-number summary and boxplot
Min = 0.6, Q1 = 2.2, M = 3.4, Q3 = 4.35, Max = 6.1
[Figure: boxplot of years until death, Disease X]

Measure of spread: the standard deviation
Data: heights (in.) of 25 women, as listed under the mean.
Note: Calculate by computer.

Summarizing distributions
- Five-number summary (Min, Q1, M, Q3, Max): resistant
- Error bars: not resistant

Section 1.3
Examining Distributions - The Normal Distributions

Density curves
- A density curve is a mathematical idealization of a histogram.
- "Area under the curve" ≈ proportion of observations.
[Figure: actual histogram next to its idealization]

Other idealizations
- Median halves "area under the curve".
- The mean is the balance point.
[Figure: histogram and density curve]

Examples
[Figure: density curves that have easy mathematical formulas, and one with no easy formula]

Normal distributions
The normal curves are an "exponential" function of x.
Properties:
- Symmetric, single-peaked, and bell-shaped.
- Indexed by μ and σ, denoted N(μ, σ).
- μ ± σ mark inflection points.

Impact of μ and σ
[Figure: normal curves with same μ, different σ; and with different μ, same σ]

The 68-95-99.7 Rule
If x is N(μ, σ):
- 68% of obs. within μ ± σ
- 95% of obs. within μ ± 2σ
- 99.7% of obs. within μ ± 3σ

Standardization
A z-score measures the location of x from μ in units of σ: z = (x - μ) / σ.
Key property: If x is N(μ, σ) then z is N(0, 1), the "Standard Normal" distribution.
Benefit: To calculate an "area under the curve" for N(μ, σ), translate to a z-score and use N(0, 1).

Example calculation: heights
Problem: Height, x, is N(64.5, 2.5). For what proportion of individuals is x < 67?
Solution:
- Ask: How far is c = 67 from μ = 64.5 in units of σ = 2.5? (c - μ) / σ = (67 - 64.5) / 2.5 = 1
- Translate: z = (x - μ) / σ is N(0, 1). For what proportion of individuals is z < 1?
- Calculate: normsdist(1) = 0.84

Example calculation: heights (cont)
68-95-99.7 rule:
- Proportion with -1 < z < 1 is 0.68.
- Equally divide the remaining 0.32 between z < -1 and z > 1, giving 0.16 each.
- Proportion with z < 1 is 0.16 + 0.68 = 0.84.

Calculation of "area between"
Problem: Proportion with c1 < z < c2.
Solution: (prop. with z < c2) - (prop. with z < c1)
Example: Proportion with 1.4 < z < 2.2.
normsdist(2.2) - normsdist(1.4) = 0.9861 - 0.9192 = 0.0669

Backward calculations
Problem: For what c is p the proportion with z < c?
Solution: c = normsinv(p)
Examples:
- normsinv(0.84) ≈ 1
- normsinv(0.16) ≈ -1

Example calculation: mpg
Problem: MPG, x, of compact cars is N(24.7, 5.88). For what c do 10% of compact cars have x > c?
Solution:
- First, normsinv(0.90) ≈ 1.28.
- Translate: z = (x - μ) / σ is N(0, 1), so 10% of compact cars have z > 1.28 = (c - μ) / σ.
- Solve: 1.28 = (c - 24.7) / 5.88 ⇒ c = 24.7 + (1.28)(5.88) ≈ 32.2

Section 2.1
Examining Relationships - Scatterplots

Examining relationships
Often, individuals are measured in more than one variable.
Follow the same approach as before:
- Plot data and calculate numerical summaries.
- Look for overall patterns and deviations.
- Consider suitability of mathematical models (later).

Examining relationships
Additional considerations:
- Do some variables tend to vary together?
- Do some variables explain variability in
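
The Section 1.2 slides on the mean and the standard deviation say to calculate by computer. A minimal sketch of that step in Python follows, using the standard-library `statistics` module on the 25 heights from the slides. The variable names are illustrative, and the choice of the sample standard deviation (divisor n - 1) is an assumption, since the formula itself did not survive in this preview.

```python
import statistics

# Heights (in.) of 25 women, as listed in the Section 1.2 slides.
heights = [
    58.2, 59.5, 60.7, 60.9, 61.9, 61.9, 62.2, 62.2, 62.4, 62.9,
    63.9, 63.1, 63.9, 64.0, 64.5, 64.1, 64.8, 65.2, 65.7, 66.2,
    66.7, 67.1, 67.8, 68.9, 69.6,
]

# Mean: the sum of the values divided by the number of values.
mean_height = statistics.mean(heights)

# Assumption: the sample standard deviation (divisor n - 1) is intended.
sd_height = statistics.stdev(heights)

print(f"n = {len(heights)}")
print(f"mean = {mean_height:.2f} in.")
print(f"s    = {sd_height:.2f} in.")
```

If the population form (divisor n) is wanted instead, `statistics.pstdev` is the corresponding call.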
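
The median and quartile steps from Section 1.2 can be sketched in a few lines of Python. This is one possible reading of the slides' rule (Q1 and Q3 as the medians of the values below and above the median position), applied to the 25 "years until death" values. The function name `five_number_summary` is hypothetical, and the sketch assumes at least two values.

```python
import statistics

# Years until death for "Disease X" (n = 25), from the Section 1.2 slides.
years = [
    0.6, 1.2, 1.5, 1.6, 1.9, 2.1, 2.3, 2.3, 2.5, 2.8, 2.9, 3.3, 3.4,
    3.6, 3.7, 3.8, 3.9, 4.1, 4.2, 4.5, 4.7, 4.9, 5.3, 5.6, 6.1,
]


def five_number_summary(values):
    """Return (min, Q1, M, Q3, max) using the median-of-halves rule."""
    xs = sorted(values)                  # Step 1: sort the observations
    n = len(xs)
    half = n // 2
    lower = xs[:half]                    # values below the median position
    # For odd n the middle value itself is left out of both halves.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return (
        xs[0],
        statistics.median(lower),        # Q1: median of the lower half
        statistics.median(xs),           # M: middle value, or avg. of two
        statistics.median(upper),        # Q3: median of the upper half
        xs[-1],
    )


minimum, q1, m, q3, maximum = five_number_summary(years)
print(f"Min = {minimum}, Q1 = {q1:.2f}, M = {m:.2f}, Q3 = {q3:.2f}, Max = {maximum}")

# Even-n case from the slides: drop the largest value, leaving 24 values.
print(f"M for the first 24 values = {five_number_summary(years[:-1])[2]:.2f}")
```

Statistical software often uses other quartile conventions (for example, interpolation between order statistics), so library routines may return slightly different Q1 and Q3 values than this rule does.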
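
The Section 1.3 calculations use the spreadsheet functions normsdist and normsinv. As a rough equivalent, the sketch below reproduces the same steps with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), where `cdf` plays the role of normsdist and `inv_cdf` the role of normsinv. Variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution, N(0, 1)

# 68-95-99.7 rule: proportion of observations within k standard deviations.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

# Forward calculation: proportion with x < 67 when x is N(64.5, 2.5).
z = (67 - 64.5) / 2.5                  # z-score of c = 67
p_below_67 = Z.cdf(z)                  # analogous to normsdist(1)

# "Area between": proportion with 1.4 < z < 2.2.
p_between = Z.cdf(2.2) - Z.cdf(1.4)

# Backward calculation: the c that 10% of compact cars exceed,
# when MPG is N(24.7, 5.88).
z90 = Z.inv_cdf(0.90)                  # analogous to normsinv(0.90)
c = 24.7 + z90 * 5.88                  # solve z90 = (c - mu) / sigma for c

for k, p in within.items():
    print(f"within {k} sd: {p:.4f}")
print(f"P(x < 67)        = {p_below_67:.4f}")
print(f"P(1.4 < z < 2.2) = {p_between:.4f}")
print(f"z90 = {z90:.4f}, c = {c:.1f} mpg")
```

Working with the unrounded z-score from `inv_cdf` rather than a table value keeps rounding error out of the final step. `NormalDist(64.5, 2.5).cdf(67)` gives the forward result directly, without standardizing by hand.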

