4 SCALAR, STOCHASTIC, DISCRETE DYNAMIC SYSTEMS

Discrete Probabilities

Consider a set Ω of the possible individual outcomes of an experiment. In the roll of a die, Ω could be the set of six possible values. In the roll of three dice, Ω could be the set of 6³ = 216 combinations of values. In the sand-hill crane experiment, Ω could be the set of possible population sizes in year 7, or even the set of all possible sequences y(n) of populations over 20 years, one sequence forming a single element ω in Ω. The set Ω is called the universe, because it considers all conceivable outcomes of interest.

The set ℰ of events built on Ω is the set of all possible subsets of elements taken from Ω. For the roll of a die, the event set ℰ is the following set of 2⁶ = 64 subsets:

ℰ = {∅,
{1}, {2}, {3}, {4}, {5}, {6},
{1, 2}, {1, 3}, {1, 4}, {1, 5}, {1, 6}, {2, 3}, {2, 4}, {2, 5}, {2, 6}, {3, 4}, {3, 5}, {3, 6}, {4, 5}, {4, 6}, {5, 6},
{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 2, 6}, {1, 3, 4}, {1, 3, 5}, {1, 3, 6}, {1, 4, 5}, {1, 4, 6}, {1, 5, 6},
{2, 3, 4}, {2, 3, 5}, {2, 3, 6}, {2, 4, 5}, {2, 4, 6}, {2, 5, 6}, {3, 4, 5}, {3, 4, 6}, {3, 5, 6}, {4, 5, 6},
{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 6}, {1, 2, 4, 5}, {1, 2, 4, 6}, {1, 2, 5, 6}, {1, 3, 4, 5}, {1, 3, 4, 6}, {1, 3, 5, 6}, {1, 4, 5, 6},
{2, 3, 4, 5}, {2, 3, 4, 6}, {2, 3, 5, 6}, {2, 4, 5, 6}, {3, 4, 5, 6},
{1, 2, 3, 4, 5}, {1, 2, 3, 4, 6}, {1, 2, 3, 5, 6}, {1, 2, 4, 5, 6}, {1, 3, 4, 5, 6}, {2, 3, 4, 5, 6},
{1, 2, 3, 4, 5, 6}}

Conceptually, it is important to note that, say, {1, 4} above is really a shorthand for

{“the die has produced value 1”, “the die has produced value 4”} .

The first event, ∅, is the empty set (which is a subset of any set), and the next six events are called singletons because they comprise a single outcome each. The largest event is Ω itself (last in the list above). Note that events are not repetitions of outcomes.
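As a quick check on the count above, the event set for a single die roll can be enumerated programmatically. The following is a minimal Python sketch, not part of the original notes; the names `omega` and `event_set` are chosen here for illustration, and each event is represented (by assumption) as a `frozenset` of outcomes.

```python
from itertools import chain, combinations

# Universe of a single die roll (assumed representation: a set of integers).
omega = frozenset({1, 2, 3, 4, 5, 6})

def event_set(universe):
    """Return all subsets of `universe`, i.e. the event set built on it."""
    items = sorted(universe)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

events = event_set(omega)
singletons = [e for e in events if len(e) == 1]

print(len(events))       # 2**6 = 64 events
print(events[0])         # the empty set comes first
print(events[-1])        # the universe itself comes last
print(len(singletons))   # six singleton events
```

Because each event is stored as a set, the order in which its outcomes are written carries no meaning: `frozenset({1, 4, 6}) == frozenset({6, 4, 1})` evaluates to `True`.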
For instance, the event {1, 4, 6} does not mean “first 1, then 4, then 6,” but rather “either 1 or 4 or 6.”

The Probability Function

A (discrete) probability function is a function P from the event set ℰ to the real numbers that satisfies the following properties:

P1: P(E) ≥ 0 for every E ∈ ℰ
P2: P(Ω) = 1
P3: E ∩ F = ∅ → P(E ∪ F) = P(E) + P(F) for all E, F ∈ ℰ .

A probability function can be viewed as a measure for sets in the event set ℰ, normalized so that the universe Ω ∈ ℰ has unit measure (property P2). Property P1 states that measures are nonnegative, and property P3 reflects additivity: if we measure two separate (disjoint) sets, their sizes add up. For instance, the event E = {2, 4, 6} has measure 1/2 with the probabilities defined for a die roll. The event E has greater measure than the event F = {1, 3}, whose measure is only 1/3. Since E and F are disjoint, their union E ∪ F has measure 1/2 + 1/3 = 5/6.

The event set is large, and the properties above imply that probabilities cannot be assigned arbitrarily to events. For instance, the empty set must be given probability zero. Since

∅ ∩ E = ∅ for any E ∈ ℰ ,

property P3 requires that

P(∅ ∪ E) = P(∅) + P(E) .

However,

∅ ∪ E = E ,

so

P(E) = P(∅) + P(E) ,

that is,

P(∅) = 0 .

Independence and Conditional Probability

Two events E, F in ℰ are said to be mutually independent if

P(E ∩ F) = P(E) P(F) .

For instance, the events E = {1, 2, 3} and F = {3, 5} in the die roll are mutually independent:

P(E) = 1/2 , P(F) = 1/3 and P(E ∩ F) = P({3}) = 1/6

so that

P(E ∩ F) = P(E) P(F) = 1/6 .

By itself, this definition of independence offers little intuition. To understand its importance, we introduce the notion of conditional probability:

Let F be an event of nonzero probability. Then, the conditional probability of an event E given F is defined as follows:

P(E | F) = P(E ∩ F) / P(F) . (15)

Note first that independence of E and F (assuming P(F) > 0) is equivalent to

P(E | F) = P(E ∩ F) / P(F) = P(E) P(F) / P(F) = P(E) ,

that is,

Any two events E, F in the event set ℰ are mutually independent if

P(E ∩ F) = P(E) P(F) .

If P(F) > 0, the two events E and F are mutually independent if and only if

P(E | F) = P(E) . (16)

Both the notion of conditional probability and that of independence can be given a useful, intuitive meaning. Because of normalization (P(Ω) = 1), the probability of an event E is the fraction of universe Ω covered by E, as measured by P(·) (see the Venn diagram in Figure 8 (a)). From the definition (15) we see that the conditional probability P(E | F) measures the fraction of the area of F covered by the intersection E ∩ F (Figure 8 (b)). Thus, conditioning by F redefines the universe to be F, and excludes from consideration the part of event E (or of any other event) that is outside the new universe. In other words, P(E | F) is the probability of the part of event E that is consistent with F, re-normalized to the measure of F: Given that we know that F has occurred, what is the new probability of E?

Figure 8: (a) The probability of an event E can be visualized as the fraction of the unit area of universe Ω that is covered by E. (b) The conditional probability of E given F redefines the new universe as F (both shaded areas), and only considers the area of the part of E in the new universe, that is, of E ∩ F (darker shading).

For example, suppose that we are interested in the probability of the event E = {4, 5, 6} (either a 4, a 5, or a 6 is produced) in a single roll of a die. The (unconditional) probability of E is 1/2. We are subsequently told that the last roll has produced an odd number, so event F = {1, 3, 5} has occurred (we have not seen the roll, so we do not know which odd number has come out).
The conditional probability P(E | F) measures the probability of E given that we know that event F has occurred. The two outcomes 4 and 6 in E are now inconsistent with F, and E ∩ F = {5} lists the only remaining possibility in favor of E. The new …
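The definitions in this section can be tried out on the die-roll examples. The sketch below is an illustration rather than part of the original notes: it assumes a fair die (each outcome has probability 1/6, consistent with the measures 1/2 and 1/3 quoted earlier), and the helper names `prob`, `independent`, and `cond_prob` are hypothetical. Exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction

# Assumption: a fair die, so every outcome carries probability 1/6.
OMEGA = frozenset({1, 2, 3, 4, 5, 6})

def prob(event):
    """P(E) for a fair die: the fraction of the universe covered by E."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def independent(e, f):
    """Test the definition P(E ∩ F) = P(E) P(F)."""
    return prob(e & f) == prob(e) * prob(f)

def cond_prob(e, f):
    """Definition (15): P(E | F) = P(E ∩ F) / P(F), requires P(F) > 0."""
    if prob(f) == 0:
        raise ValueError("conditioning event must have nonzero probability")
    return prob(e & f) / prob(f)

# Property P2, the consequence P(∅) = 0, and additivity (P3) on disjoint events.
assert prob(OMEGA) == 1
assert prob(frozenset()) == 0
E, F = frozenset({2, 4, 6}), frozenset({1, 3})
assert E & F == frozenset()
assert prob(E | F) == prob(E) + prob(F) == Fraction(5, 6)

# Independence example from the text: E = {1, 2, 3}, F = {3, 5}.
E, F = frozenset({1, 2, 3}), frozenset({3, 5})
print(independent(E, F))            # True: 1/6 == 1/2 * 1/3
print(cond_prob(E, F) == prob(E))   # True, as in equation (16)

# Conditioning example: E = {4, 5, 6} given the odd roll F = {1, 3, 5}.
E, F = frozenset({4, 5, 6}), frozenset({1, 3, 5})
print(cond_prob(E, F))              # (1/6) / (1/2)
```

Under the fair-die assumption, the last line evaluates definition (15) as (1/6) / (1/2) = 1/3, which is smaller than the unconditional probability 1/2 of E.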