Question
What are data and what do they mean to a scientist?

Dinner at the Urquhart House
• Brought to you by the Briggs Multiracial Alliance
• Sunday night
• All food provided (probably Chinese)
• Contact Mimi Reddy, [email protected], for details

Data, Statistics, and Spreadsheets
• What are data?
• What are statistics?
• What are spreadsheets?
• How can you analyze data with spreadsheets?

Data
• Data are pieces of information
• Data can be numbers, words, descriptions
• Data have UNITS
• The word data is PLURAL; datum is singular
• Data about Willoughby:
  – Age: 5 (years)
  – Height: 47 (inches)
  – Weight: 66 (pounds)
  – Eyes: Blue
  – Favorite word: Wrestle
  – Favorite letter: W

Types of Data
• Numbers: two types
  – Real numbers: values that can have a fractional part, e.g. 28.75 lbs
  – Integers: whole numbers, e.g. 18 months
• Letters: called characters in programming
  – W is a character
• Words: called strings in programming
  – “No thanks” is a string; strings can be individual words or phrases

Statistics and Data
• Test scores:
  – Jeff: 88
  – Mollie: 92
  – Marcie: 88
  – Dave: 47
  – Karim: 99
  – Willoughby: 42
  – Benjamin: 0
• What statistics can you calculate to describe these data?
  – Try to think of four things to describe the data

Statistics
• Statistics are derived from the data
• Statistics are descriptions of data
• Statistics are meant to simplify the data
• Statistics can be misleading

Typical Statistics
• Sample size: number of individuals measured = n
• Sum = Σ
• Average or mean = Σ/n
• Median: value at the 50th percentile; half of the values fall above it, half below
• Maximum, minimum, range (max - min)
• Mode: most common value
• Standard deviation (SD)
• Variance (SD²)

Analyze these data...
Mean, max, min, range, median, mode
• 18
• 33
• 4
• 47
• 49
• 38
• 29
• 4
• 55
• sample size (n)
• sum (Σ)
• mean = average = Σ/n, denoted x̄
• median = halfway
• mode = most common

Spreadsheets
• Spreadsheets are tables
• Spreadsheets allow calculations and manipulations of data
  – Calculations: mean, standard deviation
  – Manipulations: sort, ...

              Costa Rica    Nicaragua
Rainforest       625,000    3,712,000
Dry Forest        50,000      300,000
Total            675,000    4,012,000

Make a data table:
• Fly 1, length 13.4 mm, velocity 27 kph, age 21 days
• Fly 2, length 9.4 mm, velocity 0 kph, age 220 days
• Fly 3, length 9.3 mm, velocity 44 kph, age 1 day
• Fly 4, length 13.4 mm, velocity 17 kph, age 32 days
• Fly 5, length 17.4 mm, velocity 33 kph, age 11 days
• How many columns? How many rows? Do the numbers go down or across?

Data Table
[Blank table: columns Fly #, Length, Velocity, Age; rows for flies 1 to 5]

Microsoft Excel
• Typical spreadsheet program
  – Lotus 1-2-3 was an early, widely used commercial spreadsheet (VisiCalc came first)
• Has controls similar to MS Word
• Now allows graphing (charts)
  – Very restricted formats; hard to get exactly what you want
• Excel tables and graphs can be copied into MS Word

Friday’s Assignment
• We will work with Microsoft Excel to analyze some data
• Groups of two will submit one finished spreadsheet for the assignment

Graphs
• Many different types of graphs
  – Points
  – Lines
  – Bars
  – Pies

Point Graphs
• Called X-Y Scatter in MS Excel
• Plot points based on X and Y values
• Can fit a “REGRESSION LINE” to the data
  – The line that best fits the data

X-Y Scatter

Bar Graphs
• Categorize data into counts or percents
• Categories can be descriptive (Windows 98, Windows 2000, …)
• Can also be numeric categories
  – Height: 60-63, 63-66, etc. or just 61, 62, 63…
  – Count up the number of people in each group
• Histograms are a particular type of bar graph

Bar Graph
[Figure: bar graph titled "Starting Salary"; y-axis $0 to $50,000; x-axis years 1988 to 1994]

Histogram
• X axis is categories
• Y axis is the number or proportion of observations in each category

Histogram Bar Graph
[Figure: histogram labeled "Number of Crashes"]

Regular Bar Graph vs.
Histogram Bar Graph
[Figure: bar graph titled "Starting Salary"; y-axis $0 to $50,000; x-axis years 1988 to 1994]

Distributions
• Special type of histogram with a continuous numeric scale at the bottom
• Normal distribution is a key concept in statistics
• A skewed distribution is one that is unbalanced

Sample distribution histograms
Danyoungyoo, Katanchalee, and Srichawla, www.s-t.au.ac.th/handout/st2204/week5-Univariate-Des.ppt
Robert D. Duval, PS 400 Lecture, www.polsci.wvu.edu/duval/ps400/Notes/400Notes.ppt

The NORMAL Distribution
• A NORMAL DISTRIBUTION is the theoretical distribution of values given natural variation around a MEAN
• It is a balanced, humped distribution

Distributions
• Skew is an imbalance in the distribution
Danyoungyoo, Katanchalee, and Srichawla, www.s-t.au.ac.th/handout/st2204/week5-Univariate-Des.ppt

Hypothesis Testing
• Statistical tests are how scientists decide whether data SUPPORT their hypothesis (they do NOT PROVE it)
• Four major statistical tests: t-test, χ² test, regression, ANOVA

Hypothesis
• Processor speed has an effect on the performance of the computer.
• Null hypothesis
  – H0: Processor speed has NO EFFECT on the performance of a computer.

Statistical Tests and Probability
• Statistical tests give a value
• That value can be related to a probability (P)
• P is the probability of getting data at least as extreme as yours if the NULL hypothesis were true
• If P < 0.05 (1/20), you reject the NULL hypothesis

T-Test
• Compares the difference between two means
• Formula: T = (x̄1 - x̄2)/SEM
  – SEM is the standard error. For a single mean it is SD/√N; for the difference between two means, the standard errors of the two samples are combined
• T values: the difference between the means in comparison to the amount of spread in your data

T-Values
• If T > 2.5 or 3.0, difference is usually significant (this depends on your sample
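
The typical statistics and the T-test formula from these slides can be tried out in a few lines of Python. The sketch below is only an illustration, not the method used in the course (the assignment itself uses Microsoft Excel). The function names describe and welch_t are made up for this example, and the way the two standard errors are combined (Welch's form, the square root of the sum of the squared standard errors) is an assumption, since the slides do not say which variant of the T-test is intended. The two data sets are the test scores and the nine practice numbers from the earlier slides.

```python
import math
import statistics

# Data copied from the slides: the test scores and the nine-number practice set.
test_scores = [88, 92, 88, 47, 99, 42, 0]
practice_set = [18, 33, 4, 47, 49, 38, 29, 4, 55]


def describe(values):
    """Return the 'typical statistics' for a list of numbers (hypothetical helper)."""
    n = len(values)
    total = sum(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return {
        "n": n,
        "sum": total,
        "mean": total / n,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "sd": sd,
        "variance": statistics.variance(values),  # sample variance, equal to SD squared
        "sem": sd / math.sqrt(n),  # standard error of the mean
    }


def welch_t(a, b):
    """T statistic for two independent samples.

    Assumption: the standard error of the difference is computed in
    Welch's form, sqrt(SEM_a**2 + SEM_b**2).
    """
    da, db = describe(a), describe(b)
    se_diff = math.sqrt(da["sem"] ** 2 + db["sem"] ** 2)
    return (da["mean"] - db["mean"]) / se_diff


if __name__ == "__main__":
    for name, data in [("test scores", test_scores), ("practice set", practice_set)]:
        print(name, describe(data))
    print("T =", welch_t(test_scores, practice_set))
```

The two lists are unrelated data sets, so the T value printed here shows only the arithmetic; it does not test a real hypothesis. Whether a given T is significant still depends on the sample sizes.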