March 18, 2003

1.3 Bayes decision theory

The distinguishing feature of Bayesian statistics is that a probability distribution $\pi$, called a prior, is given on the parameter space $(\Theta,\mathcal{T})$. Sometimes "priors" are also considered which may be infinite, such as Lebesgue measure on the whole real line, but such priors will not be treated here, at least for the time being.

A Bayesian statistician chooses a prior based on whatever information on the unknown $\theta$ is available in advance of making any observations in the current experiment. In general, no definite rules are prescribed for choosing $\pi$. Priors are often useful as technical tools in reaching non-Bayesian conclusions, such as admissibility, as in Theorems 1.2.5 and 1.2.6.

Bayes decision rules were defined near the end of the last section as rules which minimize the Bayes risk and for which that risk is finite. Bayes tests of $P$ vs. $Q$, treated in Theorem 1.1.8, are a special case of Bayes decision rules. We saw in that case that Bayes rules need not be randomized (Remark 1.1.9). The same is true quite generally in Bayes decision theory: if in a given situation it is Bayes to choose at random among two or more possible decisions, then those decisions must have equal risks conditional on the observations, and we may as well just take one of them. Theorem 1.3.1 will give a more precise statement.

In game theory, randomization is needed to have a strategy that is optimal even if the opponent knows it and can choose a strategy accordingly. If one knows the opponent's strategy, then it is not necessary to randomize. Sometimes statistical decision theory is viewed as a game against an opponent called "Nature." Unlike an opponent in game theory, Nature is viewed as neutral, not trying to "win" the game. Assuming a prior, as in Bayes decision theory, is in effect to assume that Nature follows a certain strategy.

In showing that randomization isn't needed, it will be helpful to formulate randomization in a fuller way, where we not only choose a probability distribution over the possible actions
but then also choose an action according to that distribution, in a measurable way, as follows.

Definition. A randomized decision rule $d: X \to D(E)$ is realizable if there is a probability space $(\Omega,\mathcal{F},\mu)$ and a jointly measurable function $\delta: X\times\Omega \to A$ such that for each $x$ in $X$, $\delta(x,\cdot)$ has distribution $d(x)$; in other words, $d(x)$ is the image measure of $\mu$ by $\delta(x,\cdot)$, i.e. $d(x) = \mu\circ\delta(x,\cdot)^{-1}$.

For example, a randomized test as in Sec. 1.1 is always a realizable rule, where we can take $\Omega$ as the interval $[0,1]$ with Lebesgue measure and let $\delta(x,t) = d_Q$ if $t \le f(x)$, and $d_P$ otherwise. It is shown in the next section that decision rules are realizable under conditions wide enough to cover a great many cases, for example whenever the action space is a subset of a space $\mathbb{R}^k$ with Borel $\sigma$-algebra.

It will be shown next that randomization is unnecessary for realizable Bayes rules. The idea is that the Bayes risk of a realizable randomized Bayes rule $d$ is an average of the Bayes risks of non-randomized rules $\delta(\cdot,\omega)$. Since a Bayes rule has minimum Bayes risk, the risks of the $\delta(\cdot,\omega)$ are no smaller, so they must almost all be equal to that of $d$. Then such non-randomized rules $\delta(\cdot,\omega)$, for fixed $\omega$, are Bayes rules.

1.3.1 Theorem. For any decision problem for a measurable family $\{P_\theta,\ \theta\in\Theta\}$ and prior $\pi$, if there is a realizable Bayes randomized decision rule $d$, then there is a non-randomized Bayes decision rule.

Proof. First, here is a helpful technical fact.

1.3.2 Lemma. For any measurable family $\{P_\theta\}$ and nonnegative jointly measurable function $f: (\theta,x,\omega)\mapsto f(\theta,x,\omega)$, the function $g$ defined by $g(\theta,\omega) := \int f(\theta,x,\omega)\,dP_\theta(x)$ is jointly measurable.

Proof. If $f(\theta,x,\omega) = 1_T(\theta)1_B(x)1_F(\omega)$ for some $T\in\mathcal{T}$, $B\in\mathcal{B}$ and $F\in\mathcal{F}$, then $g(\theta,\omega) = P_\theta(B)1_T(\theta)1_F(\omega)$ is measurable in $(\theta,\omega)$, since $P_\theta(B)$ is measurable in $\theta$ by assumption. The rest of the proof of the Lemma is like that of Prop. 1.2.4. $\square$

Now to prove Theorem 1.3.1, take $(\Omega,\mathcal{F},\mu)$ and $\delta$ as in the definition that $d$ is realizable. For each fixed $\omega$, $\delta(\cdot,\omega)$ is a non-randomized decision rule, so $r_\pi(\delta(\cdot,\omega)) \ge r_\pi(d)$, since $d$ is Bayes for $\pi$. Also, writing $d(x)(da)$ for integration with respect to the measure $d(x)$,

$$r_\pi(d) = \int r(\theta,d)\,d\pi(\theta) = \iint r(\theta,d(x))\,dP_\theta(x)\,d\pi(\theta) = \iiint L(\theta,a)\,d(x)(da)\,dP_\theta(x)\,d\pi(\theta)$$
$$= \iiint L(\theta,\delta(x,\omega))\,d\mu(\omega)\,dP_\theta(x)\,d\pi(\theta)$$

by the image measure theorem (e.g. RAP, 4.1.11). So, by the Tonelli–Fubini theorem for nonnegative
measurable functions (applied twice) and the measurability shown in Lemma 1.3.2, we get

$$r_\pi(d) = \int\left[\iint L(\theta,\delta(x,\omega))\,dP_\theta(x)\,d\pi(\theta)\right]d\mu(\omega) = \int r_\pi(\delta(\cdot,\omega))\,d\mu(\omega).$$

Thus $r_\pi(\delta(\cdot,\omega)) = r_\pi(d)$ for $\mu$-almost all $\omega$, and so for some $\omega$, providing a Bayes non-randomized decision rule $\delta(\cdot,\omega)$. $\square$

If every randomized rule is realizable, as is shown in the next section under conditions given there, then Theorem 1.3.1 shows that the non-randomized rules form an essentially complete class, as defined in Sec. 1.2. It will also be shown in Sec. 2.2 below that non-randomized rules are essentially complete under some other conditions.

Definition. A family $\mathcal{P}$ of laws on a measurable space $(X,\mathcal{B})$ will be called dominated if for some $\sigma$-finite measure $v$, each law $P\in\mathcal{P}$ is absolutely continuous with respect to $v$; in other words, for any $A\in\mathcal{B}$, $v(A) = 0$ implies $P(A) = 0$ for all $P\in\mathcal{P}$.

Often $v$ would be Lebesgue measure on $\mathbb{R}^k$, or, if the measures were all concentrated on a countable set such as the integers, $v$ would be counting measure (the measure giving mass 1 to each point) on the set. If $P$ is absolutely continuous with respect to $v$, then by the Radon–Nikodym theorem (RAP, 5.5.4) it has a density, or Radon–Nikodym derivative, $f(x) = (dP/dv)(x)$.

A $\sigma$-algebra $\mathcal{B}$ is called countably generated if there is a countable subcollection $\mathcal{C}\subset\mathcal{B}$ such that $\mathcal{B}$ is the smallest $\sigma$-algebra including $\mathcal{C}$. In any separable metric space, the Borel $\sigma$-algebra is countably generated, taking $\mathcal{C}$ as the set of balls with rational radii and centers in a countable dense set. In the great majority of applications of statistics, sample spaces are separable metric spaces, in fact Euclidean spaces $\mathbb{R}^k$. At any rate, from here on it will be assumed that $\mathcal{B}$ is countably generated, unless something to the contrary is stated.

1.3.3 Theorem. If $\{P_\theta,\ \theta\in\Theta\}$ is a dominated measurable family on a sample space $(X,\mathcal{B})$, for a parameter space $\Theta$ and a $\sigma$-finite measure $v$, then the density function $f(\theta,x) := (dP_\theta/dv)(x)$ can be taken to be jointly measurable in $\theta$ and $x$.

Proof. Let $\mathcal{B}_r$, $r = 1, 2, \ldots$, be an increasing sequence of finite Boolean algebras of subsets of $X$ whose union generates $\mathcal{B}$. Such algebras exist by the blanket assumption that $\mathcal{B}$ is countably generated.
There is a probability measure Q equivalent …
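The averaging identity behind Theorem 1.3.1 can be checked numerically. The sketch below uses a hypothetical toy problem (all numbers invented for illustration, not from the text): a two-point sample space $X = \{0,1\}$, two states labeled "P" and "Q", 0-1 loss, a uniform prior, and a randomized rule realized on $\Omega = [0,1)$ with Lebesgue measure, as in the randomized-test example above. It verifies that the Bayes risk of the randomized rule equals the $\mu$-average of the Bayes risks of its non-randomized sections $\delta(\cdot,\omega)$.

```python
# Toy illustration (hypothetical numbers): the Bayes risk of a realizable
# randomized rule is the mu-average of the Bayes risks of its sections.
P = {0: 0.8, 1: 0.2}            # law P_theta for theta = "P"
Q = {0: 0.3, 1: 0.7}            # law P_theta for theta = "Q"
laws = {"P": P, "Q": Q}
prior = {"P": 0.5, "Q": 0.5}    # uniform prior pi

d = {0: 0.1, 1: 0.9}            # randomized rule: d[x] = prob. of action "Q" at x

def loss(theta, a):
    # 0-1 loss: lose 1 unless the action names the true state
    return 0.0 if theta == a else 1.0

def bayes_risk(rule):
    # r(rule) = sum_theta pi(theta) * sum_x P_theta(x) * L(theta, rule(x))
    return sum(prior[th] * sum(law[x] * loss(th, rule(x)) for x in (0, 1))
               for th, law in laws.items())

# Bayes risk of the randomized rule d, integrating the loss over d(x) directly.
r_d = sum(prior[th] * sum(law[x] * (d[x] * loss(th, "Q")
                                    + (1 - d[x]) * loss(th, "P"))
                          for x in (0, 1))
          for th, law in laws.items())

# Realization: Omega = [0,1) with Lebesgue measure mu, delta(x, t) = "Q" iff t < d[x].
# The section delta(., t) is constant between the breakpoints d[0] and d[1],
# so the mu-average of section risks reduces to a finite weighted sum.
cuts = sorted({0.0, d[0], d[1], 1.0})
avg_section_risk = sum(
    (hi - lo) * bayes_risk(lambda x, t=(lo + hi) / 2: "Q" if t < d[x] else "P")
    for lo, hi in zip(cuts, cuts[1:]))

assert abs(r_d - avg_section_risk) < 1e-12   # r(d) = integral of r(delta(.,w)) dmu(w)
print(round(r_d, 6), round(avg_section_risk, 6))
```

Since neither risk can fall below the Bayes risk, the equality forces almost every section to be exactly Bayes, which is the content of the theorem in this discrete setting.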