CS70, Fall 2003, UC Berkeley
Discussion 11
Amir Kamil
11/21/03

Topics: Expectation, Variance

1 Life Insurance¹

As an extended example of probability, we analyze a simple life insurance system. A real system would be too cumbersome to look at, so we make many simplifications here.

¹ This section is so blatantly ripped off of Felix Wu's notes that I have to give him credit here.

Here are the basic rules for our system:

1. You pay $b$ dollars to the insurance company when you are born. You never have to pay again.
2. If you die before age $c$, the company pays your beneficiaries $d$ dollars.
3. The insurance company is non-profit, so it just wants to break even.

Given these rules, what should the insurance company set as the values of $b$ and $d$ in terms of $c$?

Let $X$ be the age at which a person dies. The fraction of its customers the insurance company pays is then the fraction of those that die before age $c$, or $\Pr[X < c]$. Then $b$ and $d$ are related by

\[ b = d \cdot \Pr[X < c]. \]

Let's do a detailed example, where $c = 60$ and $d = 1{,}000{,}000$. We need to compute $\Pr[X < 60]$.

1.1 Distribution of Death

Before we can calculate $\Pr[X < 60]$, we need to know what the distribution of $X$ looks like. First, let's assume that nobody lives past 100. Now we can't just take the distribution to be uniform in the range $[1, 100]$, since a person is more likely to die as they get older. So let's assume a linear distribution:

\[ \Pr[X = k] = \frac{k}{N} \quad \text{for } k \in [1, 100]. \]

We calculate the constant $N$ in order to ensure the probabilities sum to 1:

\[
\begin{aligned}
\sum_{i=1}^{100} \Pr[X = i] &= \sum_{i=1}^{100} \frac{i}{N} \\
&= \frac{1}{N} \sum_{i=1}^{100} i \\
&= \frac{1}{N} \cdot 5050 \\
&= 1,
\end{aligned}
\]

so $N = 5050$.

1.2 Life Expectancy

The first thing we should calculate is the expected age at which a person dies. We have

\[
\begin{aligned}
E(X) &= \sum_{i=1}^{100} i \cdot \Pr[X = i] \\
&= \sum_{i=1}^{100} i \cdot \frac{i}{N} \\
&= \frac{1}{N} \sum_{i=1}^{100} i^2 \\
&= \frac{1}{N} \cdot \frac{100(100 + 1)(2 \cdot 100 + 1)}{6} \\
&= 67,
\end{aligned}
\]

where we used the identity

\[ \sum_{i=1}^{n} i^2 = \frac{n(n + 1)(2n + 1)}{6} \]

in the fourth line.

Knowing just the expectation is not enough to calculate $\Pr[X < 60]$. Consider the two distributions A, where $\Pr[X = 67] = 1$, and B, where $\Pr[X = 55] = \Pr[X = 79] = 0.5$. In A, $\Pr[X < 60] = 0$, whereas in B, $\Pr[X < 60] = 0.5$. The variance is what makes the difference in the above
distributions. It is variance that makes insurance useful. If there were no variance, everyone would know when they would die, and thus no one would need life insurance.

1.3 Variance and Chebyshev's Inequality

We proceed by calculating the variance of the age at which a person dies. We have $\mathrm{Var}(X) = E(X^2) - E(X)^2$, so we need to first calculate $E(X^2)$:

\[
\begin{aligned}
E(X^2) &= \sum_{i=1}^{100} i^2 \cdot \Pr[X^2 = i^2] \\
&= \sum_{i=1}^{100} i^2 \cdot \Pr[X = i] \\
&= \sum_{i=1}^{100} i^2 \cdot \frac{i}{N} \\
&= \frac{1}{N} \sum_{i=1}^{100} i^3 \\
&= \frac{1}{N} \cdot N^2 \\
&= 5050,
\end{aligned}
\]

where in the fifth line we used the identity

\[ \sum_{i=1}^{n} i^3 = \left( \sum_{i=1}^{n} i \right)^2 \]

together with $\sum_{i=1}^{100} i = 5050 = N$. Then

\[ \mathrm{Var}(X) = 5050 - 67^2 = 561. \]

Now recall Chebyshev's inequality:

\[ \Pr[|X - E(X)| \ge r] \le \frac{\mathrm{Var}(X)}{r^2}. \]

We want to calculate $\Pr[X < 60] = \Pr[E(X) - X > 7]$. So in order to use Chebyshev's, we need to plug in $r = 7$. Note that this also allows $X > 74$, but we can't do any better with Chebyshev's. So we have

\[
\begin{aligned}
\Pr[X < 60] &= \Pr[67 - X > 7] \\
&\le \Pr[|X - 67| \ge 7] \\
&\le \frac{\mathrm{Var}(X)}{7^2} \\
&= \frac{561}{49} \\
&\approx 11.45.
\end{aligned}
\]

But notice a problem here. Probabilities are always at most 1. Chebyshev's tells us that $\Pr[X < 60] \le 11.45$, which is less than we already knew. Notice that in order for Chebyshev's to give us a bound less than 1, we must have $r > \sigma_X$, so that $\frac{\mathrm{Var}(X)}{r^2} = \frac{\sigma_X^2}{r^2} < 1$. Thus the inequality gives us no information when we are looking within a standard deviation from the mean. Even in general, Chebyshev's still gives us a weak bound. Its usefulness is due to the fact that it is easy to compute and only requires knowledge of the expectation and variance of a random variable.

1.4 Exact Solution

In this case, since the distribution is so simple, we can compute $\Pr[X < 60]$ directly. We have

\[
\begin{aligned}
\Pr[X < 60] &= \sum_{i=1}^{59} \Pr[X = i] \\
&= \sum_{i=1}^{59} \frac{i}{N} \\
&= \frac{1}{N} \cdot 1770 \\
&\approx 0.35.
\end{aligned}
\]

Thus the insurance company should set $b = 0.35 \cdot 1{,}000{,}000 = 350{,}000$, quite a large sum of money.

2 The Florida Debacle

Recall the 2000 presidential election. At the center of the scandal were the infamous "butterfly ballots" of Palm Beach County. Many people claimed that the format of these ballots resulted in many votes intended for Al Gore to go to Pat Buchanan. Here we will analyze the statistical significance of the number of votes Buchanan received in that county. The
percentages of votes cast for each of the candidates in the entire state of Florida were as follows:

    Gore     Bush     Buchanan   Nader    Browne   Others
    48.8%    48.9%    0.3%       1.6%     0.3%     0.1%

In Palm Beach County, the actual votes cast (before the recounts began) were as follows:

    Gore     Bush     Buchanan   Nader    Browne   Others   Total
    268945   152846   3407       5564     743      781      432286

To model this situation probabilistically, we need to make some assumptions. Let's model the vote cast by each voter in Palm Beach County as a random variable $X_i$, where $X_i$ takes on each of the six possible values (five candidates or "Others") with probabilities corresponding to the Florida percentages. (Thus, e.g., $\Pr[X_i = \text{Gore}] = 0.488$.) There are a total of $n = 432286$ voters, and their votes are assumed to be mutually independent. Let the r.v. $B$ denote the total votes cast for Buchanan in Palm Beach County (i.e., the number of voters $i$ for which $X_i = \text{Buchanan}$).

We first compute the expectation and variance of $B$. Let $B_i$ be a random variable representing whether the $i$th person voted for Buchanan, i.e., $B_i = 1$ if and only if $X_i = \text{Buchanan}$. Note that the $B_i$'s are independently and identically distributed, with $E(B_i) = 0.003$ and $\mathrm{Var}(B_i) = 0.003 \, (1 - 0.003) = 0.002991$. Moreover, by linearity of expectation and independence, we find that

\[ E(B) = \sum_{i=1}^{n} E(B_i) = 432286 \times 0.003 \approx 1297 \]

and

\[ \mathrm{Var}(B) = \sum_{i=1}^{n} \mathrm{Var}(B_i) = 432286 \times 0.002991 \approx 1293. \]

Now we use Chebyshev's inequality to compute an upper bound $b$ on the probability that Buchanan receives at least 3407 votes, so that $\Pr[B \ge 3407] \le b$. Chebyshev's inequality promises …
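The expectation and variance of $B$ are easy to check numerically. The sketch below recomputes them from $n$ and the statewide Buchanan share, then evaluates a Chebyshev bound using the form of the inequality stated in Section 1.3 with $r = 3407 - E(B)$. That choice of $r$ and the helper names `binomial_mean_var` and `chebyshev_bound` are assumptions made for illustration; the derivation in the notes is cut off, so this is not necessarily the computation they go on to perform.

```python
# Sketch only: numerical check of E(B) and Var(B) for the Palm Beach model,
# plus one possible way to evaluate a Chebyshev bound (an assumption, since
# the derivation in the notes is truncated).

n = 432286        # voters in Palm Beach County (from the table)
p = 0.003         # statewide Buchanan share, 0.3%
observed = 3407   # Buchanan votes actually recorded in the county


def binomial_mean_var(n, p):
    """Mean and variance of a sum of n independent 0/1 variables with Pr[1] = p."""
    return n * p, n * p * (1 - p)


def chebyshev_bound(variance, r):
    """Upper bound Var/r^2 on Pr[|B - E(B)| >= r], capped at 1."""
    return min(1.0, variance / r ** 2)


mean_b, var_b = binomial_mean_var(n, p)

# Pr[B >= observed] <= Pr[|B - E(B)| >= r] with r = observed - E(B).
r = observed - mean_b
bound = chebyshev_bound(var_b, r)

print(f"E(B)   = {mean_b:.1f}")
print(f"Var(B) = {var_b:.1f}")
print(f"Pr[B >= {observed}] <= {bound:.6f}")
```

Capping the bound at 1 reflects the observation from Section 1.3 that Chebyshev's inequality says nothing useful when $r$ is within a standard deviation of the mean; here $r$ is far larger than that, so the cap does not come into play.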
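The closed-form results for the life insurance example in Section 1 can also be cross-checked by brute force. The following is a minimal sketch, assuming the linear distribution $\Pr[X = k] = k/N$ on ages 1 to 100 and the values $c = 60$, $d = 1{,}000{,}000$ from the worked example; the variable names are illustrative and exact rational arithmetic is used so that no rounding enters the check.

```python
from fractions import Fraction

# Sketch only: brute-force check of the Section 1 results under the
# linear distribution Pr[X = k] = k / N for k = 1, ..., 100.

AGES = range(1, 101)                       # nobody lives past 100
N = sum(AGES)                              # normalizing constant
pmf = {k: Fraction(k, N) for k in AGES}    # Pr[X = k] = k / N

mean = sum(k * p for k, p in pmf.items())               # E(X)
second_moment = sum(k * k * p for k, p in pmf.items())  # E(X^2)
variance = second_moment - mean ** 2                    # Var(X)

c, d = 60, 1_000_000
p_before_c = sum(p for k, p in pmf.items() if k < c)    # Pr[X < c]
chebyshev = variance / Fraction(7) ** 2                 # r = 7, as in Section 1.3
premium = d * p_before_c                                # break-even b

print("N =", N)
print("E(X) =", mean)
print("E(X^2) =", second_moment)
print("Var(X) =", variance)
print("Chebyshev bound =", float(chebyshev))
print("Pr[X < 60] =", float(p_before_c))
print("break-even b =", float(premium))
```

Without rounding, $\Pr[X < 60] = 1770/5050$ is slightly above 0.35, so the unrounded break-even premium comes out a little above the 350,000 quoted in Section 1.4.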