5/11/10 Lecture 1

STOR 155 Introductory Statistics
Lecture 1: Overview; Displaying Distributions with Graphs
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Tip: Strategy for Success
• Stay active/involved in class.
• Ask questions during class (especially if you do not understand something).
• Answer questions to help other students if you can.
• Keep pace with the lectures, review daily, and do homework after each lecture to help you understand the material.
• Make effective use of office hours (Instructor), open tutorial sessions, and the UNC Learning Center.
– They help you answer questions about homework and lectures
– Private time vs. public time

What is Statistics?
Statistics: the science of collecting, organizing, analyzing and interpreting data (= information)
[Diagram: Population; Sample of data; Inference about population (using statistical tools)]

SAT Scores
• Some parents and teachers have been concerned about the trend of declining SAT scores …
• Question: what is the effect of classroom atmosphere (strict or liberal)?
• To answer the question, 50 students (24 males and 26 females) participated in a study of their performance, as measured by SAT scores at the end of the school year.
• The students were divided into two groups of 25 each (12 males and 13 females), with Group 1 studying under a strict atmosphere and Group 2 under a very permissive atmosphere.
• They were matched according to socio-economic background.

SAT Scores
• After 9 months, all students were given the same standardized tests: verbal and math.

Student  Group    Gender  SATMath  SATVer
A        Strict   F       670      700
B        Strict   M       700      680
C        Liberal  F       750      730
D        Liberal  M       690      750
…        …        …       …        …

SAT Scores
• This example involves data collection, data analysis, and statistical inference.
– How?
• Questions:
– Does a stricter classroom atmosphere increase the average score?
– Why “matched according to socio-economic background”?
– Why “12 males and 13 females per group”?
– Is the sample size of 50 large enough to draw a confident conclusion?

Fundamental Concepts
• Population: the entire group of individuals that we want information about.
– Students (who are about to take the SAT)
• Sample: a part of the population that we actually examine in order to gather information.
– Those students selected into the study
• Sample size: the number of observations/individuals in a sample.
– 50
• Statistical inference: making an inference about a population based on the information contained in a sample.
– Based on the data from the study, infer whether a stricter classroom atmosphere increases SAT scores in general.

Fundamental Concepts
• A parameter is a value that describes the population. It’s fixed but unknown in practice.
– the average SAT score of all the students who are about to take the SAT.
• A statistic is a value that describes a sample. It’s known (calculated) from the sample.
– the average SAT score of all the students who are selected into the study.
– a sample analogue of the parameter.

Practice Exercise
• Suppose you are interested in finding the average SAT score of UNC undergraduates:
– SAT scores of all UNC undergraduates in STOR 155 (sample)
– SAT scores of all UNC undergraduates (population)
• Suppose you are interested in finding the average SAT score of US undergraduates:
– SAT scores of all UNC undergraduates ( )
– SAT scores of all US undergraduates ( )

Summary
• Statistics is the science of data:
– Collecting
– Organizing and analyzing
– Decision making
= Information processing
• Fundamental concepts:
– Population, parameter, sample, statistic, sample size
• You can do a LOT with statistics … what?

Take home message
• We are interested in the population, but it’s too large to be known completely
• Statisticians work on a sample, which is a smaller and observable “proxy”
• There is uncertainty in this transition, hence errors are inevitable …
• That’s why statistical methods are needed …

Chapter 1: Looking at Data - Distributions
1.1 Displaying Distributions with Graphs
1.2 Displaying Distributions with Numbers
1.3 Density Curves and Normal Distributions

Data contain
– Individuals: the subjects described by the data;
– Variables: any characteristic of an individual. A variable can take different values for different individuals.

Categorical & Quantitative Variables
• A categorical variable places an individual into one of several groups or categories.
• A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense.

NBA Draft 2005
• Categorical variables
– Team, Nationality
• Quantitative variables
– Weight, Height

Name         Team         Nationality  Weight  Height
A. Bogut     Milwaukee    Australia    245     7-0
M. Williams  Atlanta      US           230     6-9
D. Williams  Utah         US           210     6-3
C. Paul      New Orleans  US           175     6-0
R. Felton    Charlotte    US           198     6-1
…

NBA Draft 2005
• Variables:
– Team & Nationality - Categorical
– Weight & Height - Quantitative
• How many teams in the draft? How many players were drafted by each team?
• How many players are taller than 6-9? How many players weigh between 200 and 250 pounds?
• Equivalently, what is the distribution of each variable?

Distributions of Variables
• The distribution of a variable indicates what values the variable takes and how often it takes these values.
– For a categorical variable, distribution: categories + count/percent for each category
– For a quantitative variable, distribution: pattern of variation of its values

Highest Level of Education for People Aged 25-34

Education              Count (millions)  Percent
Less than high school  4.6               11.8
High school graduate   11.6              30.6
Some college           7.4               19.5
Associate degree       3.3               8.8
Bachelor’s degree      8.6               22.7
Advanced degree        2.5               6.6

Exploratory Data Analysis (EDA)
• Use statistical tools and ideas to help us examine data
• Goal: to describe the main features of the data
• NEVER skip this
• EDA
– Displaying distributions with graphs
– Displaying distributions with numbers

Basic Strategies for EDA
Strategy I
1. One variable at a time
2. Relationships among the variables
Strategy II
1. Graphical visualizations
2. Numerical summaries

Graphic Techniques for Categorical Variables
• Bar Graph uses
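
Example (illustrative sketch, not part of the original slides): the questions on the NBA Draft 2005 slides can be answered in a few lines of Python. The sketch below uses only the five players listed in the table, so the counts describe those five rows and not the full draft. The helper names height_in_inches and categorical_distribution are made up for this illustration.

```python
from collections import Counter

# Only the five players shown on the NBA Draft 2005 slide;
# the slide's table continues beyond these rows.
players = [
    # (name, team, nationality, weight in pounds, height as feet-inches)
    ("A. Bogut", "Milwaukee", "Australia", 245, "7-0"),
    ("M. Williams", "Atlanta", "US", 230, "6-9"),
    ("D. Williams", "Utah", "US", 210, "6-3"),
    ("C. Paul", "New Orleans", "US", 175, "6-0"),
    ("R. Felton", "Charlotte", "US", 198, "6-1"),
]


def height_in_inches(height):
    """Convert a 'feet-inches' string such as '6-9' to total inches."""
    feet, inches = height.split("-")
    return 12 * int(feet) + int(inches)


def categorical_distribution(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}


# Distribution of a categorical variable: categories + count/percent.
nationality_dist = categorical_distribution(p[2] for p in players)

# Questions about the quantitative variables.
taller_than_6_9 = sum(
    1 for p in players if height_in_inches(p[4]) > height_in_inches("6-9")
)
between_200_and_250 = sum(1 for p in players if 200 <= p[3] <= 250)

for category, (count, percent) in nationality_dist.items():
    print(f"{category}: {count} players ({percent:.1f}%)")
print("Taller than 6-9:", taller_than_6_9)
print("Between 200 and 250 pounds:", between_200_and_250)
```

The same categorical_distribution helper would work for Team; with the full draft list in place of these five rows it would answer "How many players were drafted by each team?" directly.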