Berkeley COMPSCI 70 - n16
EECS 70 Discrete Mathematics and Probability Theory
Spring 2014    Anant Sahai    Lecture 16

Variance

Question: Let us return once again to the question of how many heads appear in a typical sequence of n coin flips. Recall that we used the Galton board to visualize this as follows: consider a token falling through the Galton board, deflected randomly to the right or left at each step. We think of these random deflections as taking place according to whether a fair coin comes up heads or tails. The expected position of the token when it reaches the bottom is right in the middle, but how far from the middle should we typically expect it to land?

Denoting a right-move by +1 and a left-move by −1, we can describe the probability space here as the set of all words of length n over the alphabet {±1}, each having equal probability 1/2^n. Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus

    X = X_1 + X_2 + ··· + X_n,    where X_i = +1 if the ith toss is Heads, and X_i = −1 otherwise.

Now obviously we have E(X) = 0. The easiest way to see this is to note that E(X_i) = (1/2 × 1) + (1/2 × (−1)) = 0, so by linearity of expectation E(X) = ∑_{i=1}^n E(X_i) = 0. But of course this is not very informative, and is due to the fact that positive and negative deviations from 0 cancel out.

What the above question is really asking is: What is the expected value of |X|, the distance from 0? Rather than consider the r.v. |X|, which is a little awkward due to the absolute value operator, we will instead look at the r.v. X^2. Notice that this also has the effect of making all deviations from 0 positive, so it should also give a good measure of the distance traveled. However, because it is the squared distance, we will need to take a square root at the end to make the units make sense.

Let's calculate E(X^2):

    E(X^2) = E((X_1 + X_2 + ··· + X_n)^2)
           = E(∑_{i=1}^n X_i^2 + ∑_{i≠j} X_i X_j)
           = ∑_{i=1}^n E(X_i^2) + ∑_{i≠j} E(X_i X_j).

In the last line here, we used linearity of expectation. To proceed, we need to compute E(X_i^2) and E(X_i X_j) (for i ≠ j).
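The claims E(X) = 0 and (as computed below) E(X^2) = n can be checked mechanically for small n. The following short Python sketch (not part of the original notes) enumerates all 2^n equally likely words over {±1} and computes both moments exactly with rational arithmetic:

```python
from itertools import product
from fractions import Fraction

def walk_moments(n):
    """Exactly compute E(X) and E(X^2) for the n-step +/-1 random walk
    by enumerating all 2^n equally likely words over {+1, -1}."""
    p = Fraction(1, 2 ** n)          # each word has probability 1/2^n
    ex = Fraction(0)
    ex2 = Fraction(0)
    for word in product((+1, -1), repeat=n):
        x = sum(word)                # final position X = X_1 + ... + X_n
        ex += p * x
        ex2 += p * x * x
    return ex, ex2

# E(X) = 0 and E(X^2) = n for every n, matching the derivation.
for n in range(1, 9):
    ex, ex2 = walk_moments(n)
    assert ex == 0 and ex2 == n
```

Brute-force enumeration is exponential in n, of course; its only purpose here is to confirm the algebra on small cases.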
Let's consider first X_i^2. Since X_i can take on only the values ±1, clearly X_i^2 = 1 always, so E(X_i^2) = 1.

What about E(X_i X_j)? Well, X_i X_j = +1 when X_i = X_j = +1 or X_i = X_j = −1, and otherwise X_i X_j = −1. Also,

    Pr[(X_i = X_j = +1) ∨ (X_i = X_j = −1)] = Pr[X_i = X_j = +1] + Pr[X_i = X_j = −1] = 1/4 + 1/4 = 1/2,

so X_i X_j = 1 with probability 1/2. In the above calculation we used the fact that the events X_i = +1 and X_j = +1 are independent, so Pr[X_i = X_j = +1] = Pr[X_i = +1] × Pr[X_j = +1] = 1/2 × 1/2 = 1/4 (and similarly for Pr[X_i = X_j = −1]). Therefore X_i X_j = −1 with probability 1/2 also. Hence E(X_i X_j) = 0. Alternatively, since X_i and X_j are independent, we saw in the last lecture note that E(X_i X_j) = E(X_i)E(X_j) = 0.

Plugging these values into the above equation gives

    E(X^2) = (n × 1) + 0 = n.

So we see that our expected squared distance from 0 is n. One interpretation of this is that we might expect to be a distance of about √n away from 0 after n steps. However, we have to be careful here: we cannot simply argue that E(|X|) = √(E(X^2)) = √n. (Why not?) We will see later in the lecture how to make precise deductions about |X| from knowledge of E(X^2).

For the moment, however, let's agree to view E(X^2) as an intuitive measure of "spread" of the r.v. X. In fact, for a more general r.v. with expectation E(X) = µ, what we are really interested in is E((X − µ)^2), the expected squared distance from the mean. In our random walk example, we had µ = 0, so E((X − µ)^2) just reduces to E(X^2).

Definition 16.1 (variance): For a r.v. X with expectation E(X) = µ, the variance of X is defined to be

    Var(X) = E((X − µ)^2).

The square root σ(X) := √Var(X) is called the standard deviation of X.

The point of the standard deviation is merely to "undo" the squaring in the variance. Thus the standard deviation is "on the same scale as" the r.v. itself. Since the variance and standard deviation differ just by a square, it really doesn't matter which one we choose to work with, as we can always compute one from the other immediately.
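Definition 16.1 translates directly into code. Here is a minimal sketch (not from the notes; the helper names `mean` and `variance` are ours) that computes Var(X) straight from the definition, for any finite distribution given as a dictionary mapping values to probabilities:

```python
from fractions import Fraction

def mean(pmf):
    """E(X) for a finite distribution {value: probability}."""
    return sum(p * x for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E((X - mu)^2), straight from Definition 16.1."""
    mu = mean(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

# A single +/-1 step of the random walk: mu = 0, Var = 1, sigma = 1.
step = {+1: Fraction(1, 2), -1: Fraction(1, 2)}
assert mean(step) == 0 and variance(step) == 1
```

Using `Fraction` keeps the arithmetic exact, so the checks are equalities rather than floating-point approximations.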
We shall usually use the variance. For the random walk example above, we have that Var(X) = n, and the standard deviation of X, σ(X), is √n.

The following easy observation gives us a slightly different way to compute the variance that is simpler in many cases.

Theorem 16.1: For a r.v. X with expectation E(X) = µ, we have Var(X) = E(X^2) − µ^2.

Proof: From the definition of variance, we have

    Var(X) = E((X − µ)^2) = E(X^2 − 2µX + µ^2) = E(X^2) − 2µE(X) + µ^2 = E(X^2) − µ^2.

In the third step here, we used linearity of expectation. □

Examples

Let's see some examples of variance calculations.

1. Fair die. Let X be the score on the roll of a single fair die. Recall from an earlier lecture that E(X) = 7/2. So we just need to compute E(X^2), which is a routine calculation:

    E(X^2) = (1/6)(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2) = 91/6.

Thus from Theorem 16.1,

    Var(X) = E(X^2) − (E(X))^2 = 91/6 − 49/4 = 35/12.

More generally, if X is a random variable that takes on values 1, ..., n with equal probability 1/n (i.e., X has a uniform distribution), the mean, variance and standard deviation of X are:

    E(X) = (n + 1)/2,    Var(X) = (n^2 − 1)/12,    σ(X) = √((n^2 − 1)/12).

(You should verify these.)

2. Number of fixed points. Let X be the number of fixed points in a random permutation of n items (i.e., the number of students in a class of size n who receive their own homework after shuffling). We saw in an earlier lecture that E(X) = 1 (regardless of n). To compute E(X^2), write X = X_1 + X_2 + ··· + X_n, where X_i = 1 if i is a fixed point, and X_i = 0 otherwise. Then as usual we have

    E(X^2) = ∑_{i=1}^n E(X_i^2) + ∑_{i≠j} E(X_i X_j).    (1)

Since X_i is an indicator r.v., we have that E(X_i^2) = Pr[X_i = 1] = 1/n. Since both X_i and X_j are indicators, we can compute E(X_i X_j) as follows:

    E(X_i X_j) = Pr[X_i = 1 ∧ X_j = 1] = Pr[both i and j are fixed points] = 1/(n(n − 1)).

[Check that you understand the last step here.] Plugging this into equation (1) we get

    E(X^2) = (n × 1/n) + (n(n − 1) × 1/(n(n − 1))) = 1 + 1 = 2.

Thus Var(X) = E(X^2) − (E(X))^2 = 2 − 1 = 1.
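The die calculation, and the general closed forms for the uniform distribution on {1, ..., n}, can be verified exactly with a few lines of Python (a sketch, not part of the notes; `moments_uniform` is our own helper name), using Theorem 16.1 to get the variance from the first two moments:

```python
from fractions import Fraction

def moments_uniform(n):
    """Mean, E(X^2), and variance for X uniform on {1, ..., n},
    with Var(X) = E(X^2) - mu^2 as in Theorem 16.1."""
    p = Fraction(1, n)
    mu = sum(p * k for k in range(1, n + 1))
    ex2 = sum(p * k * k for k in range(1, n + 1))
    return mu, ex2, ex2 - mu ** 2

# Fair die: E(X) = 7/2, E(X^2) = 91/6, Var(X) = 35/12.
assert moments_uniform(6) == (Fraction(7, 2), Fraction(91, 6), Fraction(35, 12))

# General closed forms: E(X) = (n+1)/2 and Var(X) = (n^2 - 1)/12.
for n in range(1, 30):
    mu, _, var = moments_uniform(n)
    assert mu == Fraction(n + 1, 2) and var == Fraction(n * n - 1, 12)
```

This is exactly the verification the notes ask for in "(You should verify these.)", done by machine for the first 29 values of n rather than by algebra.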
I.e., the variance and the mean are both equal to 1. Like the mean, the variance is also independent of n. Intuitively at least, this means that it is unlikely that there will be more than a small number of fixed points even when the number of items, n, is very large.

Variance for sums of independent random variables

One of the most important and useful facts about variance is that if a random variable X is the sum of independent random variables X = X_1 + ··· + X_n, then its variance is the sum of the variances: Var(X) = Var(X_1) + ··· + Var(X_n).
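This additivity can be checked exactly on small examples. The sketch below (again not from the notes; `sum_of_independent` is our own helper) builds the distribution of X + Y for independent X and Y by convolving their pmfs, then compares Var(X + Y) against Var(X) + Var(Y):

```python
from fractions import Fraction
from itertools import product

def variance(pmf):
    """Var(X) = E((X - mu)^2) for a finite distribution {value: prob}."""
    mu = sum(p * x for x, p in pmf.items())
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

def sum_of_independent(pmf_a, pmf_b):
    """Exact distribution of X + Y for independent X ~ pmf_a, Y ~ pmf_b."""
    out = {}
    for (x, px), (y, py) in product(pmf_a.items(), pmf_b.items()):
        out[x + y] = out.get(x + y, Fraction(0)) + px * py
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}     # Var = 35/12
step = {+1: Fraction(1, 2), -1: Fraction(1, 2)}    # Var = 1

# Var(X + Y) = Var(X) + Var(Y) for independent X and Y.
assert variance(sum_of_independent(die, step)) == variance(die) + variance(step)
assert variance(sum_of_independent(die, die)) == 2 * variance(die)
```

Note that the independence assumption is what makes the convolution the right joint construction here; for dependent variables the cross terms E(X_i X_j) − E(X_i)E(X_j) need not vanish, and additivity can fail.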

