UF STA 6166 - Probability Models - D75852


School name: University of Florida-Gainesville

Course: STA 6166 - Statistical Methods in Research I

Pages: 9

This preview shows page 1-2-3 out of 9 pages.


Chapter 4 Probability Models

Models are not exact representations of reality. A good model is a useful approximation. Why use a model? Why not use the actual distribution?

1. A model is compact. Saying that SAT scores are approximately N(500,100) is much more compact than giving the entire set of SAT scores.
2. A model is easier to work with than the raw data and can sometimes lead to simple conclusions.
3. Occasionally, there are theoretical reasons why a model should be the correct one. However, you have to be very careful of what theory says should be and what actually is. For now, we’ll use the normal model only when it appears to be justified from the data.

There are several models for random phenomena that are so common that they have names. These will be discussed here.

Bernoulli Model: A Bernoulli trial is a random phenomenon where there are only two possible outcomes (generically called “success” and “failure”). Independent Bernoulli trials are independent repetitions of the random phenomenon where the probability of success (called p) stays the same over all the trials.

This is the model used for coin tosses. It can be used as a model for the outcome of many betting games where the two outcomes are “win” and “lose.” It can be used for many random phenomena where there are more than two possible outcomes, but where we are only concerned with whether some particular event happens or not. For example, roll a die and see whether or not you get an even number. Randomly choose an adult and determine if they are currently married or not, or intend to vote for Bush or not, or whether their annual income is above $30,000 or not.

Random sampling from a population and observing a binary response variable is not precisely like Bernoulli trials.
Even though each draw is random, the probability of a “success” on a given draw depends on the earlier draws, because random sampling is drawing without replacement. If I draw 10 cards without replacement from a deck of cards and observe whether or not each card is an Ace, these are not independent Bernoulli trials. However, if the population is large (like all adults in the U.S.) and the sample relatively small (less than 10% of the population), then the sample can be treated like independent Bernoulli trials without much loss. That is, the Bernoulli trials model will still be an acceptably good model (remember, models are never perfect representations of reality, anyway).

The Geometric Model

Suppose we have independent Bernoulli trials with probability of success p. One random variable we might be interested in is X = the number of the trial on which we first observe a success. Suppose we are collecting cards of famous athletes from cereal boxes. A “success” might be getting a Tiger Woods card, and X is the number of boxes we buy until we get our first Tiger card. Since 20% of the boxes have Tiger cards, p = .2. We let q = 1 – p = .8 denote the probability of a “failure.”

What are the possible values of X?

For the Tiger model, what’s P(X = 1)? P(X = 2)? P(X = 10)?

What’s a general expression for P(X = x) in terms of p and q?

X follows a geometric model. The geometric model has one parameter: p. We denote the geometric model by “Geom(p).” So the number of boxes needed to get a Tiger card is modeled by a Geom(.2). The probabilities look like this (they keep going beyond x = 20, but the probabilities keep getting smaller and smaller):

[Figure: bar chart of the Geom(.2) probabilities for x = 1 to 20; horizontal axis x, vertical axis Probability (0.0 to 0.15)]

The expected value of a random variable X with model Geom(p) is

$$E(X) = \sum_{x} x\,P(X = x) = p + 2qp + 3q^2p + 4q^3p + \cdots = \frac{1}{p}$$

(See the text for a proof of this using geometric series.) The standard deviation is $\sigma = \sqrt{q/p^2}$.

Hence, the expected number of boxes we need to buy to get a Tiger card is 1/.2 = 5 boxes.
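The geometric quantities above can be checked numerically. The following is a minimal sketch, assuming the usual geometric probability P(X = x) = q^(x-1) p (x - 1 failures followed by one success), which is consistent with the terms of the series for E(X). The function names are illustrative and do not come from the course text.

```python
import math


def geom_pmf(x: int, p: float) -> float:
    """P(X = x) for Geom(p): x - 1 failures followed by one success."""
    if x < 1:
        return 0.0
    return (1 - p) ** (x - 1) * p


def geom_mean(p: float) -> float:
    """Expected trial number of the first success, 1/p."""
    return 1 / p


def geom_sd(p: float) -> float:
    """Standard deviation sqrt(q / p^2), with q = 1 - p."""
    return math.sqrt((1 - p) / p ** 2)


def geom_tail(k: int, p: float) -> float:
    """P(X >= k): the first k - 1 trials are all failures."""
    return (1 - p) ** (k - 1)


if __name__ == "__main__":
    p = 0.2  # Tiger card example: 20% of boxes contain the card
    for x in (1, 2, 10):
        print(f"P(X = {x}) = {geom_pmf(x, p):.4f}")
    print(f"mean = {geom_mean(p):.2f}, sd = {geom_sd(p):.2f}")
    print(f"P(X >= 10) = {geom_tail(10, p):.4f}")
```

Summing `x * geom_pmf(x, p)` over a long range of x gives a value very close to `geom_mean(p)`, which is one way to convince yourself of the 1/p result without the geometric series proof.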
The standard deviation is $\sqrt{.8/(.2)^2} = 4.47$.

What’s the probability we will have to buy at least 10 boxes to get a Tiger card?

Binomial Model

Suppose we have independent Bernoulli trials with constant probability of success p. However, suppose now that the number of trials n is fixed and that the random variable X is the number of successes in the n trials. Then X follows a Binomial model. The binomial model has two parameters: n and p. It’s denoted by Binom(n,p).

What are the possible values of X?

The binomial probability model is

$$P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n,$$

where $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$ and q = 1 – p.

The mean is: $\mu = E(X) = np$

The standard deviation is: $\sigma = SD(X) = \sqrt{npq}$.

Example: Tiger again. Suppose we buy 5 boxes of cereal.

What’s the expected number of Tigers?

What’s the standard deviation?

What’s the probability that we get exactly one Tiger?

What’s the probability that we get two or fewer Tigers?

[Figure: bar chart of the Binom(5,.2) probabilities for x = 0 to 5; horizontal axis x, vertical axis Probability (0.0 to 0.4)]

Deriving the mean and standard deviation of a binomial random variable

To derive the mean and standard deviation for the binomial model, start with a single Bernoulli trial with probability of success p. Let X be the number of successes on this single trial so X is either 1 or 0 and it’s 1 with probability p and 0 with probability q = 1 – p. X is sometimes said to be a Bernoulli random variable. What are E(X) and Var(X)?

Now, suppose we have n independent Bernoulli trials. The number of successes on the n trials is Y = X1 + X2 + … + Xn where each of the X’s is a Bernoulli trial. So,

E(Y) =

Var(Y) =

SD(Y) =

The Normal model

The normal distribution is an idealized model that is often used for distributions that are unimodal and roughly symmetric: “mound-shaped”. There are often advantages to using a model for a distribution rather than the distribution itself. One is that a normal model is completely characterized by two parameters, µ and σ, which are the mean and standard deviation of the normal model.
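To make the two-parameter idea concrete, here is a minimal sketch of a normal model as a pair of functions of µ and σ. It uses the standard normal density and cumulative distribution formulas, which are not shown in this part of the notes, and it reads N(500,100) for SAT scores as mean 500 and standard deviation 100; both are assumptions, and the function names are illustrative.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of the normal model with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) under the same model, computed with the error function."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


if __name__ == "__main__":
    mu, sigma = 500.0, 100.0  # assumed reading of N(500,100) for SAT scores
    print(f"density at the mean: {normal_pdf(mu, mu, sigma):.6f}")
    print(f"P(score <= 600) = {normal_cdf(600, mu, sigma):.4f}")
    print(f"P(400 <= score <= 600) = "
          f"{normal_cdf(600, mu, sigma) - normal_cdf(400, mu, sigma):.4f}")
```

Nothing else about the model needs to be stored: changing `mu` shifts the curve and changing `sigma` stretches it, so the same two functions cover every normal model.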
That is, there is a normal model for every possible value of µ and for every value of σ > 0. A model is not useful unless it is flexible enough to be used in a variety of situations. By choosing the values of µ and σ, we can use the normal distribution to model SAT scores and heights of U.S. adult women in inches, as long as the
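Returning to the binomial Tiger example from earlier in this chapter (5 boxes, p = .2), the questions posed there can be explored with a short sketch based on the binomial probability formula and the mean np and standard deviation √(npq) stated in the notes. The function names are illustrative, not from the course text.

```python
import math


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for Binom(n, p): C(n, x) * p^x * q^(n - x)."""
    if x < 0 or x > n:
        return 0.0
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x), summing the probabilities for 0, 1, ..., x."""
    return sum(binom_pmf(k, n, p) for k in range(0, min(x, n) + 1))


def binom_mean(n: int, p: float) -> float:
    """Mean np."""
    return n * p


def binom_sd(n: int, p: float) -> float:
    """Standard deviation sqrt(npq), with q = 1 - p."""
    return math.sqrt(n * p * (1 - p))


if __name__ == "__main__":
    n, p = 5, 0.2  # five cereal boxes, 20% chance of a Tiger card in each
    print(f"expected Tigers = {binom_mean(n, p):.2f}")
    print(f"standard deviation = {binom_sd(n, p):.4f}")
    print(f"P(exactly one Tiger) = {binom_pmf(1, n, p):.4f}")
    print(f"P(two or fewer Tigers) = {binom_cdf(2, n, p):.4f}")
```

Computing the mean and variance directly from `binom_pmf` (summing `x * P(X = x)` and `(x - mean)**2 * P(X = x)`) reproduces np and npq, which matches the result of adding up n independent Bernoulli trials.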
