Stat 13, UCLA, Ivo Dinov

Slide 1: UCLA STAT 13
Introduction to Statistical Methods for the Life and Health Sciences
• Instructor: Ivo Dinov, Asst. Prof. in Statistics and Neurology
• Teaching Assistants: Ming Zheng, Annie Che
UCLA Statistics
University of California, Los Angeles, Winter 2004
http://www.stat.ucla.edu/~dinov/courses_students.html

Slide 2: UCLA STAT 13
to just hear is to forget
to see is to remember
to do it yourself is to understand …
(… to NOT go to class is to … not pass …)

Slide 3: What is Statistics? A practical example
• Michael Benton & Francisco Ayala, Dating the Tree of Life, Science 2003 300: 1698-1700
• Molecular vs. paleontological dating of major branching points in the tree of life is debated.
• Molecular date estimates are up to twice as old (due to statistical bias) as paleontological dates (missing fossils).
• Goals: the same as those set out by Darwin: to understand where life came from, the shape of evolution, and the place of humans in nature, and to determine the extent of modern biodiversity and where it is threatened.

Slide 4: What is Statistics? A practical example
• Plants: The first vascular land plants are found as fossils in the Silurian, and earlier evidence from possible vascular plant spores may extend the range back to the Ordovician, 475 Ma, considerably younger than a molecular estimate of 700 Ma.
• Birds: Molecular estimates place the split of basal clades and modern orders at 70 to 120 Ma. The oldest uncontroversial fossils of modern bird orders date from the Paleocene (60 Ma), much younger.
• Mammals: Molecular dates place the split of modern placentals in the mid- to Late Cretaceous (80 to 100 Ma). The oldest fossil representatives of modern mammals date from the Paleocene and Eocene (50 to 65 Ma).

Slide 5: What is Statistics? Topics
• It is proposed that molecular dates are correct (with confidence intervals) and that methods exist to correct for error.
  However, critics have pointed out several pervasive biases that make molecular dates too old.
• First, if calibration dates are too old, then all other dates estimated from them will also be too old.
• A second biasing factor is that undetected fast-evolving genes could bias estimates of timing. Empirical and statistical studies of vertebrate sequences suggest that such non-clock-like genes may be detected and that they do not affect estimates of dating. However, statistical tests may have low power and could produce consistently older dates.

Slide 6: What is Statistics? A practical example
• A 3rd source of bias relates to polymorphism. Two species often become fixed for alternative alleles that existed as a polymorphism in their ancestral species.
• A 4th biasing factor is that molecular time estimates show asymmetric (skewed) distributions, with a constrained (large numbers) younger left end but an unconstrained (smaller numbers) older right end.

Slide 7: What is Statistics? Estimate Variation
Columns: Data Source | Metazoa (Animals) | Bilateria (metazoans except sponges, e.g., anemones) | Deuterostomia (backboned animals)
Gene (8 G): 1200 ± 100, 1001 ± 100
Protein (64 E): 930 ± 115, 790 ± 60, 590
Gene (4 G): 940 ± 80, 700 ± 80
Gene (18 G): 670 ± 60, 600 ± 60
Gene (22 G): 830 ± 55
Gene (50 G): 1350 ± 150 (est.), 993 ± 46
Gene (22 G): 659 ± 131
Protein (10 E): 627 ± 51
Gene (MtDNA; 18S rRNA): 588 min., 586/589 min.

Slide 17: Chapter 1: What is Statistics?
• Polls and surveys – we're all different; it's impossible or expensive to investigate every single person.
• Experimentation – sample vs. population
• Observational studies – selection and non-response bias
• Statistics – what is it and who uses it?
• Summary
Textbook: Chris Wild & George Seber

Slide 18: Newtonian science vs. chaotic science
• Article by Robert May, Nature, vol. 411, June 21, 2001
• The science we encounter at school deals with crisp certainties (e.g., prediction of planetary orbits, the periodic table as a descriptor of all elements, equations describing area, volume, velocity, position, etc.)
• As soon as uncertainty comes into the picture, it shakes the foundation of deterministic science, because only probabilistic statements can be made in describing a phenomenon (e.g., roulette wheels, chaotic dynamic weather predictions, Geiger counters, earthquakes, etc.)
• What, then, is science all about: describing absolutely certain events and laws alone, or describing more general phenomena in terms of their behavior and chance of occurring? Or maybe both!

Slide 19: Variation in sample percentages
Poll: Do you consider yourself overweight?
[Figure 1.1.1: Comparing percentages from 10 different surveys each of 20 people with those from 10 surveys each of 500 people (all surveys from same population). Horizontal axis: sample percentage (50 to 90). Target: true population percentage = 69%. From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.]
We are getting closer to the population mean as n → ∞. Is this a coincidence?

Slide 20: Errors in Samples …
• Selection bias: The sampled population is not a representative subgroup of the population really investigated.
• Non-response bias: If a particular subgroup of the population studied does not respond, the resulting responses may be skewed.
• Question effects: Survey questions may be slanted or loaded to influence the result of the sampling.
• Is quota sampling reliable? Each interviewer is assigned a fixed quota of subjects (the subjects' district, sex, age, and income are exactly specified, so the interviewer can select those people as they like).
• Target population – the entire group of individuals, objects, or units we study.
• Study population – a subset of the target population containing all “units” which could possibly be used in the study.
• Sampling protocol – the procedure used to select the sample.
• Sample – the subset of “units” about which we actually collect info.

Slide 21: More terminology …
• Census – an attempt to sample the entire population.
• Parameter – a numerical characteristic of the population, e.g., mean income, mean age, etc. Often we want to estimate population parameters.
• Statistic – a numerical characteristic of the sample. A (sample) statistic is used to estimate a corresponding population parameter.
• Why do we sample at random? We draw “units”
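The contrast described for Figure 1.1.1 (10 surveys of 20 people versus 10 surveys of 500 people, with a true population percentage of 69%) can be explored with a small simulation. The following is a minimal sketch, not part of the course materials: it assumes each respondent answers "yes" independently with probability 0.69, and the function name simulate_surveys and the random seed are illustrative choices.

```python
import random
import statistics

TRUE_PROPORTION = 0.69  # true population percentage quoted for Figure 1.1.1


def simulate_surveys(n_people, n_surveys, p=TRUE_PROPORTION, rng=None):
    """Return the sample percentage of 'yes' answers from each simulated survey.

    Assumption: respondents are drawn at random and answer independently,
    each saying 'yes' with probability p.
    """
    if rng is None:
        rng = random.Random()
    percentages = []
    for _ in range(n_surveys):
        # Count the 'yes' answers among n_people simulated respondents.
        yes_count = sum(rng.random() < p for _ in range(n_people))
        percentages.append(100.0 * yes_count / n_people)
    return percentages


if __name__ == "__main__":
    rng = random.Random(13)  # arbitrary fixed seed so reruns give the same numbers
    for n_people in (20, 500):
        results = simulate_surveys(n_people, n_surveys=10, rng=rng)
        spread = statistics.pstdev(results)
        print(f"n = {n_people:3d}: min {min(results):.1f}%, "
              f"max {max(results):.1f}%, std. dev. {spread:.1f} points")
```

Under this model the sample percentage is a statistic that estimates the population parameter (69%). Its standard deviation is 100 × sqrt(p(1 − p)/n), roughly 10.3 percentage points for n = 20 and roughly 2.1 for n = 500, so the larger surveys cluster more tightly around the target.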