Chapter 8

Detection, decisions, and hypothesis testing

Detection, decision making, and hypothesis testing are different names for the same procedure. The word detection refers to the effort to decide whether some phenomenon is present or not in a given situation. For example, a radar system attempts to detect whether or not a target is present; a quality control system attempts to detect whether a unit is defective; a medical test detects whether a given disease is present. The meaning has been extended in the communication field to detect which one, among a finite set of mutually exclusive possible transmitted signals, has been transmitted. Decision making is, again, the process of choosing between a number of mutually exclusive alternatives. Hypothesis testing is the same, except the mutually exclusive alternatives are called hypotheses. We usually use the word hypotheses for these alternatives in what follows, since the word conjures up the appropriate intuitive image.

These problems will usually be modeled by a generic type of probability model. Each such model is characterized by a discrete random variable (rv) X called the hypothesis rv and another rv or random vector Y called the observation rv. The sample values of X are called hypotheses; it makes no difference what these hypotheses are called, so we usually number them, 0, 1, . . . , m − 1. When the experiment is performed, the resulting sample point ω maps into a sample value x for X, x ∈ {0, 1, . . . , m − 1}, and into a sample value y for Y. The decision maker observes y (but not x) and maps y into a decision x̂(y). The decision is correct if x̂(y) = x.

The probability p_x = p_X(x) of hypothesis x is referred to as the a priori probability of hypothesis x. The probability model is completed by the conditional distribution of Y, conditional on each sample value of X. These conditional distributions are called likelihoods in the terminology of hypothesis testing.
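The generic model above can be simulated directly. The following sketch (all names and parameter values here are illustrative choices, not from the text) draws a binary hypothesis X with a priori probability p0, draws an observation Y from a Gaussian whose mean depends on X, and applies a simple threshold decision rule x̂(y):

```python
import random

def run_trials(p0=0.5, mean0=-1.0, mean1=1.0, n_trials=100_000, seed=1):
    """Estimate the probability of an incorrect decision by simulation."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_trials):
        x = 0 if rng.random() < p0 else 1            # sample the hypothesis rv X
        mean = mean0 if x == 0 else mean1
        y = rng.gauss(mean, 1.0)                     # sample the observation Y given X = x
        xhat = 0 if y < (mean0 + mean1) / 2 else 1   # midpoint threshold decision rule
        if xhat != x:                                # the decision is correct iff xhat == x
            errors += 1
    return errors / n_trials

print(run_trials())  # ≈ 0.159 for these parameters
```

For this symmetric case (equal a priori probabilities, unit-variance Gaussians at means ±1), the midpoint threshold coincides with the MAP rule developed in Section 8.1, and the error probability is Q(1) ≈ 0.159.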
In most of the situations we consider, these conditional distributions are represented either by a PMF or by joint probability densities over R^n for a given n ≥ 1. To avoid repeating everything, probability densities are usually assumed. Arguments that cannot be converted to discrete observations simply by replacing PDFs with PMFs and changing integrals to sums will be discussed as they arise. There are also occasional comments about observations that cannot be described by PDFs or PMFs.

As with any probability model representing a real-world phenomenon, the random variables might model quantities that are actually random, or quantities such as coin tosses that might be viewed as either random or deterministic, or quantities such as physical constants that are deterministic but unknown. In addition, the model might be chosen for its simplicity, or for its similarity to some better-understood phenomenon, or for its faithfulness to some aspect of the real-world phenomenon.

Classical statisticians[1] are uncomfortable with the use of completely probabilistic models to study hypothesis testing, particularly when the ‘correct’ hypothesis is not random in any very real sense, or not random with known probabilities. They have no objection to a separate probability model for the observation under each hypothesis, but are unwilling to choose a priori probabilities for the hypotheses. This is partly a practical matter, since statisticians design experiments to gather data and make decisions that often have considerable political and commercial importance. The use of a priori probabilities could be viewed as biasing these decisions and thus losing the appearance of impartiality.

The approach in this text, as pointed out frequently before, is to use a variety of probability models to gain insight and understanding about real-world phenomena.
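The PDF-to-PMF substitution mentioned above can be made concrete with a discrete observation. In this sketch the likelihoods are Poisson PMFs (the rates and the two-hypothesis setup are hypothetical, not from the text), and the marginal distribution of Y is a sum over hypotheses rather than an integral:

```python
from math import exp, factorial

def poisson_pmf(y, lam):
    """Likelihood p_{Y|X}(y|x) when the observation under hypothesis x is Poisson(lam)."""
    return exp(-lam) * lam**y / factorial(y)

def marginal_pmf(y, p0, lam0, lam1):
    """p_Y(y) = sum_x p_x p_{Y|X}(y|x): the discrete analogue of f_Y(y)."""
    return p0 * poisson_pmf(y, lam0) + (1 - p0) * poisson_pmf(y, lam1)

# The marginal PMF sums to 1 over y, just as the marginal PDF integrates to 1.
total = sum(marginal_pmf(y, 0.5, 2.0, 5.0) for y in range(100))
print(round(total, 6))  # → 1.0
```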
If we assume a variety of a priori probabilities and then see how the results depend on those choices, we often learn more than if we refuse to consider a priori probabilities at all.[2] This is illustrated in the development of the Neyman-Pearson criterion in Section 8.4.

Before discussing how to make decisions, it is important to understand when and why decisions must be made. As an example, suppose we conclude, on the basis of an observation, that hypothesis 0 is correct with probability 2/3 and hypothesis 1 with probability 1/3. Simply making a decision on hypothesis 0 and forgetting about the probabilities throws away much of the information that has been gathered. The issue, however, is that sometimes choices must be made. In a communication system, the recipient wants to receive the message (perhaps with an occasional error) rather than a set of probabilities. In a control system, the controls must occasionally take action. Similarly, managers must occasionally choose between courses of action, between products, and between people to hire. In a sense, it is by making decisions (and, in Chapter 10, by making estimates) that we return from the world of mathematical probability models to the world being modeled.

8.1 Decision criteria and the MAP criterion

There are a number of possible criteria for making decisions, and initially we concentrate on maximizing the probability of making correct decisions. For each hypothesis x, let p_x be the a priori probability that X = x and let f_{Y|X}(y | x) be the joint probability density (called a likelihood) that Y = y conditional on X = x. If f_Y(y) > 0, then the probability that X = x, conditional on Y = y, is given by Bayes' law as

    p_{X|Y}(x | y) = p_x f_{Y|X}(y | x) / f_Y(y),   where   f_Y(y) = \sum_{x=0}^{m-1} p_x f_{Y|X}(y | x).     (8.1)

Whether Y is discrete or continuous, the set of sample values where f_Y(y) = 0 or p_Y(y) = 0 is an event of zero probability. Thus we ignore this event in what follows and simply assume that f_Y(y) > 0 or p_Y(y) > 0 for all sample values.

The decision maker observes y and must choose x̂(y) from the set of hypotheses. The probability that hypothesis x is correct (i.e., the probability that X = x)

[1] Statisticians have argued since the time of Bayes about the ‘validity’ of choosing a priori probabilities for hypotheses to be tested. Bayesian statisticians are comfortable with this practice and non-Bayesian or classical statisticians are not.

[2] This should not be construed as a criticism of classical statistics, where the constraints of practice often dictate avoiding both a priori probabilities and the attendant theoretical understanding.
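A direct numerical rendering of (8.1) may help. The sketch below (the Gaussian likelihoods and all parameter values are hypothetical choices, not from the text) computes the posterior p_{X|Y}(x | y) and chooses the a posteriori most probable hypothesis; varying the a priori probability p0 shows how the decision can flip, the kind of sensitivity study suggested earlier in the chapter:

```python
from math import exp, sqrt, pi

def gauss_pdf(y, mean, var=1.0):
    """Unit-variance Gaussian likelihood f_{Y|X}(y|x) with mean depending on x."""
    return exp(-(y - mean) ** 2 / (2 * var)) / sqrt(2 * pi * var)

def posteriors(y, priors, means):
    """Evaluate (8.1): p_{X|Y}(x|y) = p_x f_{Y|X}(y|x) / f_Y(y)."""
    likelihoods = [gauss_pdf(y, m) for m in means]
    f_y = sum(p * l for p, l in zip(priors, likelihoods))  # f_Y(y) = sum_x p_x f_{Y|X}(y|x)
    return [p * l / f_y for p, l in zip(priors, likelihoods)]

# Observe y = 0.2 with means -1 and +1; vary the a priori probability of
# hypothesis 0 and watch the posterior (and hence the decision) shift.
for p0 in (0.5, 0.8):
    post = posteriors(0.2, [p0, 1 - p0], [-1.0, 1.0])
    xhat = post.index(max(post))  # choose the a posteriori most likely hypothesis
    print(p0, [round(q, 3) for q in post], xhat)
```

With equal priors the observation y = 0.2 favors hypothesis 1, but raising p0 to 0.8 is enough to flip the decision to hypothesis 0.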


MIT 6.262 - Detection, decisions, and hypothesis testing
