Chapter 10

Populations: Getting Started

You have now completed Part 1 of these notes, consisting of nine chapters. What have you learned? On the one hand, you could say that you have learned many things about the discipline of Statistics. I am quite sure that you have expended a great deal of time and effort to learn, perhaps master, the material in the first nine chapters. On the other hand, however, you could say, "I have learned more than I ever wanted to know about the Skeptic's Argument and not much else." I hope that you feel differently, but I cannot say this comment is totally lacking in merit.

So, why have we spent so much time on the Skeptic's Argument? First, because the idea of Occam's Razor is very important in science. It is important to be skeptical and not just jump on the bandwagon of the newest idea. For data-based conclusions, we should give the benefit of the doubt to the notion that nothing is happening and only conclude that, indeed, something is happening if the data tell us that the nothing-is-happening hypothesis is inadequate. The Skeptic's Argument is, in my opinion, the purest way to introduce you to how to use Statistics in science.

The analyses you have learned in the first nine chapters require you to make decisions: the choice of the components of a CRD; the choice of the alternative for a test of hypotheses; for numerical data, the choice of test statistic; for a power study, the choice of an alternative of interest. The analyses require you to take an action: you must randomize. But, and this is the key point, the analyses make no assumptions.

The remainder of these notes will focus on population-based inference. Assumptions are always necessary in order to reach a conclusion on a population-based inference. The two most basic of these assumptions involve:

1. How do the units actually studied relate to the entire population of units?
2. What structure is assumed for the population?

By the way, if either or both of these questions makes no sense to you, that is fine. We will learn
about these questions, and more, later in these notes.

As we will see, in population-based inference we never (some might say rarely; I don't want to quibble about this) know with certainty whether our assumptions are true. Indeed, we usually know that they are not true; in this situation, we spend time investigating how much it matters that our assumptions are not true. In my experience, the reason why many (certainly not all, perhaps not even most) math teachers have so much trouble teaching Statistics is because they just don't get the idea that an assumption can be wrong. If a mathematician says, "Assume we have a triangle or a rectangle or a continuous function," and I say, "How do you know the assumption is true?" the mathematician will look at me and say, "Bob, you are hopeless."

The above discussion raises an obvious question: If population-based inference techniques rely on assumptions that are not true, why learn them? Why not limit ourselves to studies for which we can examine the Skeptic's Argument? Well, as much as I love the Skeptic's Argument, I must acknowledge its fundamental weakness: It is concerned only with the units in the study; it has no opinion on the units that are not in the study. Here is an example of what I mean.

Suppose that a balanced CRD is performed on n = 200 persons suffering from colon cancer. There are two competing treatments, 1 and 2, and the data give a P-value of 0.0100 for the alternative ≠, with the data supporting the notion that treatment 1 is better. The Skeptic's Argument is literally concerned only with the n = 200 persons in the study. The Skeptic's Argument makes no claim as to how the treatments would work on any of the thousands of people with colon cancer who are not in the study. If you are a physician caring for one of these thousands, you will need to decide which treatment you recommend. The Skeptic cannot tell you what to do.

By contrast, with population-based inference a P-value equal to 0.0100 allows one to conclude that, overall, treatment 1 is better than
treatment 2 for the entire population. By making more assumptions, population-based inference obtains a stronger conclusion. The difficulty, of course, is that the assumptions of the population-based inference might not be true and, if not true, might give a misleading conclusion.

Of course, there is another difficulty in my colon cancer example. As we saw in Case 3 in Table 5.3 on page 90 in Chapter 5, even if we conclude that treatment 1 is better than treatment 2 overall, this does not imply that treatment 1 is better than treatment 2 for every subject; this is true for the Skeptic's Argument and it's true for population-based inference.

There is, of course, a second weakness of the methods we covered in Part 1 of these notes: They require the assignment of units to study factor levels by randomization. For many studies in science, randomization is either impossible or, if possible, highly unethical. For an example of the former, consider any study that compares the responses given by men and women. For an example of the latter, imagine a study that assigns persons, by randomization, to the "smokes three packs of cigarettes per day" treatment.

As we will discuss often in the remainder of these notes, studies with randomization yield greater scientific validity (in a carefully explained way) than studies without randomization. This does not mean, however, that studies without randomization are inherently bad or are to be avoided.

One of the greatest strengths of population-based inference is that it allows a scientist to make predictions about future uncertain outcomes. The Skeptic's Argument cannot be made to do this. Predictions are important in real life, and they give us a real-world measure of whether the answers we get from a statistical analysis have any validity.

Anyways, I have gotten very far ahead of myself. Thus, don't worry if much of the above is confusing. By the end of these notes, these issues will make sense to you. In the next section we will begin a long and careful development of various
ideas and methods of population-based inference.

10.1 The Population Box

In Chapter 1, we learned that there are two types of units in a study: trials and subjects. When the units are subjects, often the subjects are different people. The subjects could be anything from different automobiles to different aardvarks, but, in my experience, my students are more comfortable with examples that have subjects that are people. Therefore, most of my examples of units as subjects will have the subjects be people.