Artificial Intelligence 15-381, Mar 22, 2007
Probability and Uncertainty 2: Probabilistic Reasoning
Michael S. Lewicki, Carnegie Mellon
AI: Probabilistic Inference 2

Review of concepts from last lecture
Making rational decisions when faced with uncertainty:
•Probability: the precise representation of knowledge and uncertainty
•Probability theory: how to optimally update your knowledge based on new information
•Decision theory (probability theory + utility theory): how to use this information to achieve maximum expected utility
Basic concepts:
•random variables
•probability distributions (discrete) and probability densities (continuous)
•rules of probability
•expectation and the computation of 1st and 2nd moments
•joint and multivariate probability distributions and densities
•covariance and principal components

Simple example: medical test results
•A test report for a rare disease comes back positive; the test is 90% accurate.
•What's the probability that you have the disease?
•What if the test is repeated?
•This is the simplest example of reasoning by combining sources of information.

How do we model the problem?
•Which of these is the correct description of "the test is 90% accurate"?

  P(T = true) = 0.9
  P(T = true | D = true) = 0.9
  P(D = true | T = true) = 0.9

•What do we want to know: P(T = true), P(T = true | D = true), or P(D = true | T = true)?
•More compact notation:

  P(T = true | D = true) → P(T|D)
  P(T = false | D = false) → P(¬T|¬D)

Evaluating the posterior probability through Bayesian inference
•We want P(D|T): the probability of having the disease given a positive test.
•Use Bayes' rule to relate it to what we know, P(T|D):

  P(D|T) = P(T|D) P(D) / P(T)

  (posterior = likelihood × prior / normalizing constant)

•What's the prior P(D)? The disease is rare, so let's assume P(D) = 0.001.
•What about P(T)? What's the interpretation of that?
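The Bayes' rule update above can be sketched numerically. This is a minimal sketch, not part of the slides: the function name is mine, and the 0.1 false-positive rate P(T|¬D) used to compute P(T) is an assumption (the slides introduce it shortly).

```python
def bayes(likelihood, prior, evidence):
    # Bayes' rule: posterior = likelihood * prior / evidence
    return likelihood * prior / evidence

# P(T|D) = 0.9 and P(D) = 0.001 from the slides;
# P(T) assumes a 10% false-positive rate P(T|not D) = 0.1.
p_t = 0.9 * 0.001 + 0.1 * 0.999
p_d_given_t = bayes(0.9, 0.001, p_t)  # ≈ 0.0089
```

Despite the "90% accurate" test, the posterior is under 1%, because the 0.001 prior dominates.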
Evaluating the normalizing constant
•P(T) is the marginal probability of P(T, D) = P(T|D) P(D).
•So, compute it with a summation:

  P(T) = Σ_{all values of D} P(T|D) P(D)

•For true-or-false propositions:

  P(T) = P(T|D) P(D) + P(T|¬D) P(¬D)

•What are these terms?

Refining our model of the test
•We also have to consider the negative case to incorporate all the information:

  P(T|D) = 0.9
  P(T|¬D) = ?

•What should it be? What about P(¬D)?

Plugging in the numbers
•Our complete expression is

  P(D|T) = P(T|D) P(D) / [P(T|D) P(D) + P(T|¬D) P(¬D)]

•Plugging in the numbers we get:

  P(D|T) = (0.9 × 0.001) / (0.9 × 0.001 + 0.1 × 0.999) = 0.0089

•Does this make intuitive sense?

Same problem, different situation
•Suppose we have a test to determine if you won the lottery.
•It's 90% accurate.
•What is P($ = true | T = true) then?

Playing around with the numbers
•What if the test were 100% reliable?

  P(D|T) = (1.0 × 0.001) / (1.0 × 0.001 + 0.0 × 0.999) = 1.0

•What if the test were the same, but the disease weren't so rare, say P(D) = 0.1?

  P(D|T) = (0.9 × 0.1) / (0.9 × 0.1 + 0.1 × 0.9) = 0.5

Repeating the test
•We can relax, P(D|T) = 0.0089, right?
•Just to be sure, the doctor recommends repeating the test.
•How do we represent this? P(D | T1, T2)
•Again, we apply Bayes' rule:

  P(D | T1, T2) = P(T1, T2 | D) P(D) / P(T1, T2)

•How do we model P(T1, T2 | D)?
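The posterior P(D|T) ≈ 0.0089 can also be checked by simulation, which makes the intuition concrete: among people who test positive, very few actually have the disease. This is an illustrative sketch (function and parameter names are mine), using the slides' numbers and a 10% false-positive rate.

```python
import random

def simulate_posterior(n=1_000_000, p_d=0.001, hit=0.9, false_pos=0.1, seed=0):
    # Monte Carlo estimate of P(D | T): sample disease status,
    # then the test result, and count diseased among the positives.
    rng = random.Random(seed)
    positives = diseased_positives = 0
    for _ in range(n):
        d = rng.random() < p_d
        t = rng.random() < (hit if d else false_pos)
        if t:
            positives += 1
            diseased_positives += d
    return diseased_positives / positives

estimate = simulate_posterior()  # close to the analytic 0.0089
```

Most positive tests come from the 99.9% of healthy people hitting the 10% false-positive rate, which is why the posterior stays small.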
Modeling repeated tests
•Easiest is to assume the tests are independent:

  P(T1, T2 | D) = P(T1|D) P(T2|D)

•This also implies P(T1, T2) = P(T1) P(T2).
•Plugging these into

  P(D | T1, T2) = P(T1, T2 | D) P(D) / P(T1, T2)

  we have

  P(D | T1, T2) = P(T1|D) P(T2|D) P(D) / [P(T1) P(T2)]

Evaluating the normalizing constant again
•Expanding as before, we have

  P(D | T1, T2) = P(T1|D) P(T2|D) P(D) / Σ_{D ∈ {t, f}} P(T1|D) P(T2|D) P(D)

•Plugging in the numbers gives us

  P(D | T1, T2) = (0.9 × 0.9 × 0.001) / (0.9 × 0.9 × 0.001 + 0.1 × 0.1 × 0.999) = 0.075

•Another way to think about this:
  - What's the chance of 1 false positive from the test?
  - What's the chance of 2 false positives?
•The chance of 2 false positives is still 10× more likely than the prior probability of having the disease.

Simpler: combining information the Bayesian way
•Let's look at the equation again:

  P(D | T1, T2) = P(T1|D) P(T2|D) P(D) / [P(T1) P(T2)]

•If we rearrange slightly:

  P(D | T1, T2) = [P(T2|D) / P(T2)] × [P(T1|D) P(D) / P(T1)]

•We've seen this before! The second factor is the posterior for the first test, which we just computed:

  P(D | T1) = P(T1|D) P(D) / P(T1)

The old posterior is the new prior
•We can just plug in the value of the old posterior:

  P(D | T1, T2) = [P(T2|D) / P(T2)] × 0.0089

•It plays exactly the same role as our old prior P′(D):

  P(D|T) = P(T|D) P′(D) / [P(T|D) P′(D) + P(T|¬D) P′(¬D)]

•Plugging in the numbers gives the same answer:

  P(D|T) = (0.9 × 0.0089) / (0.9 × 0.0089 + 0.1 × 0.9911) = 0.075

•This is how Bayesian reasoning combines old information with new information to update our belief states.
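The "old posterior is the new prior" idea above can be written as a single update function applied repeatedly, one positive test at a time. A minimal sketch (names are mine), assuming independent tests with the slides' 0.9 hit rate and 0.1 false-positive rate:

```python
def update(prior, p_t_given_d=0.9, p_t_given_not_d=0.1):
    # One positive test result: Bayes' rule with the current belief as prior.
    num = p_t_given_d * prior
    return num / (num + p_t_given_not_d * (1 - prior))

p1 = update(0.001)  # posterior after one positive test, ≈ 0.0089
p2 = update(p1)     # posterior after a second positive test, ≈ 0.075
```

Applying `update` twice starting from 0.001 reproduces the joint two-test calculation exactly, which is the point of the slide: sequential updating and batch updating agree when the tests are independent.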
Bayesian inference for distributions
•The simplest case is true-or-false propositions.
•The basic computations are the same for distributions.

An example with distributions: coin flipping
•In Bernoulli trials, each sample is either 1 (e.g. heads) with probability θ, or 0 (tails) with probability 1 − θ.
•The binomial distribution specifies the probability of the total number of heads, y, out of n trials:

  p(y | θ, n) = (n choose y) θ^y (1 − θ)^(n−y)

•[Figure: plot of p(y | θ = 0.5, n = 10) against y, peaked at y = 5 with probability about 0.25]

The binomial distribution
•In
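The binomial formula above can be evaluated directly; a short sketch (function name is mine) reproducing the peak of the slide's plot at y = 5 for θ = 0.5, n = 10:

```python
from math import comb

def binom_pmf(y, n, theta):
    # p(y | theta, n) = C(n, y) * theta^y * (1 - theta)^(n - y)
    return comb(n, y) * theta**y * (1 - theta)**(n - y)

peak = binom_pmf(5, 10, 0.5)  # 252/1024 ≈ 0.246, the mode in the slide's plot
```

The probabilities over y = 0..n sum to 1, as a distribution must.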