UMass Amherst PSYCH 240 - Foundation of Statistical Inference

Foundation of Statistical Inference, Unit 3 (Prof. Staub)

Inferential statistics: we have data from a sample and we want to draw inferences about a population. Probability is the branch of mathematics that provides the basis for statistical inference.

What kinds of things have probability? Processes that can have various outcomes, where we cannot know in advance which outcome the process will have. We can't know this because which specific outcome the process has is due to chance. Such a process is called a random process, and the outcomes of random processes have probabilities.

Examples of random processes: rolling two dice; playing the lottery. Possible outcomes: getting a sum of exactly 10; winning $1,000,000 or more.

The probability of an outcome is the long-run proportion of times it occurs: the proportion of times the outcome would occur if we were to run the random process an infinite number of times (# of times the outcome occurs / # of times we run the process).

The law of large numbers: let p-hat be the number of times the outcome occurs divided by the total number of times the process was run, i.e., the proportion of times that the outcome occurs in n trials of our random process. The law of large numbers says that as n increases, p-hat converges to (gets closer and closer to) the true probability of the outcome, p. More concretely: if we run a random process only a few times, the proportion of times we get an outcome may be much higher than the true probability of that outcome, or it may be much lower. But as we run the process more and more times, the proportion of times we get that outcome will get close to the probability p. (A short simulation illustrating this appears below, just before the discussion of continuous distributions.)

We abbreviate the probability of an outcome as p(outcome). Example: the probability of getting a six on a single die is p(6) = 1/6, or about .167.

Probabilities, like proportions, are restricted to the range between 0 and 1, inclusive: 0 ≤ p ≤ 1. There are no negative probabilities, and there are no probabilities greater than 1. The sum of the probabilities of the various possible outcomes of a random process must equal 1, and the probability of an outcome plus the probability of not getting that outcome is always 1.

CQ: Which of these is a valid probability?
A. p(outcome) = 7.1 (no: bigger than 1)
B. p(outcome) = -0.2 (no: a probability cannot be negative)
C. p(outcome) = eleventeen (no: not even a real number)
D. p(outcome) = 0.7 (valid)

CQ: A random process has four possible outcomes, and p(outcome1) = 0.2. Which of the following must be true?
A. p(outcome2) = 0.2
B. p(outcome2) + p(outcome3) + p(outcome4) = 0.8 (correct: the probabilities of all the outcomes must add up to 1)
C. p(outcome2) = 0
D. p(outcome2) + p(outcome3) + p(outcome4) = 0.8

The next two rules apply when two outcomes are independent, meaning that one occurring does not affect the probability that the other will occur.

Multiplication rule: if two outcomes are independent, the probability of A and B is the probability of A times the probability of B: p(A and B) = p(A) × p(B).

Addition rule: the probability of A or B is the probability of A plus the probability of B, minus the probability of both: p(A or B) = p(A) + p(B) − p(A and B). (Subtracting p(A and B) avoids double-counting the cases where both occur; under independence, p(A and B) is just p(A) × p(B).)

Probability distribution: instead of keeping track of how many times each value actually occurs, we represent the probability of each value (i.e., each outcome) occurring. Since these are discrete variables, the bar graphs have spaces between the bars, and the probabilities must always add up to one.

So far we've been talking about probability distributions where it is possible to identify a specific set of possible outcomes, like the sum of dots on two dice, or the number of heads you get when you flip two coins. We call this a discrete probability distribution.
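Before moving on to continuous distributions, a quick way to see the law of large numbers in action is to simulate die rolls in R and watch the running proportion of sixes approach 1/6; the multiplication rule can be checked the same way. This is a minimal illustrative sketch, assuming a fair die and base R (it is not taken from the lecture):

    # Law of large numbers: the proportion of sixes in n rolls of a fair die
    # should get closer and closer to the true probability p(6) = 1/6 (about .167).
    set.seed(1)                                         # make the simulation reproducible
    rolls <- sample(1:6, size = 10000, replace = TRUE)  # 10,000 rolls of a fair die
    p_hat <- cumsum(rolls == 6) / seq_along(rolls)      # running proportion of sixes

    p_hat[10]      # after 10 rolls: may be far from .167
    p_hat[100]     # after 100 rolls: usually closer
    p_hat[10000]   # after 10,000 rolls: very close to 1/6

    # Multiplication rule for independent outcomes:
    # p(six on die 1 AND six on die 2)
    (1/6) * (1/6)  # = 1/36, about .028

The pattern is the point: with few trials, p-hat bounces around, but as the number of trials grows it settles near the true probability p.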
But what if a random process could result in a potentially infinite number of outcomes? For a continuous variable we assign probability to a range of outcomes rather than to individual outcomes. We use a graph called a probability density function (a pdf, or density curve) to show the probability distribution when the outcomes of a random process are continuous rather than discrete. The area under the curve is always 1, and not all pdfs have a single peak. When we put a curve on a histogram, we've gone from thinking about the actual number of people in each height bin to thinking about the probability that, if we were to generate an outcome by selecting a person at random, that person would fall in that bin.

Two named distributions are especially important in statistics: a discrete probability distribution called the binomial distribution, and a continuous probability distribution called the Normal distribution.

The number-of-heads example above is a particular case of the binomial distribution. More generally, the binomial distribution is the probability distribution of the number of times that an outcome happens in n trials of a random process, if the outcome has a fixed probability on every trial. For the binomial distribution, the outcome we're interested in is often called a "success," even if it isn't an especially good thing. Remember: binomial distributions do not need to be symmetrical.

The dbinom function (which stands for "density of the binomial") takes three arguments: the number of successes we're interested in (5), the total number of trials (20), and the probability of a success on each trial (.1). The pbinom function gives what's called the cumulative probability: the probability of having as many successes as you specify, or fewer.

Normal distribution (the "bell curve"): a continuous probability distribution with a specific shape. It has one hump in the middle (it's a unimodal distribution, and the hump is in the middle of the range) and it is symmetrical (outcomes on either side of the middle are equally likely). Specifically, the probability of an outcome between 1 standard deviation below and 1 standard deviation above the mean is .68; between 2 standard deviations below and 2 above, .95; and between 3 standard deviations below and 3 above, .997. This is sometimes called the 68-95-99.7 rule, if you think about it in terms of percent rather than probability.

Example: if the mean is 18 and the sd is 2, the probability of an outcome between 16 (1 sd below the mean) and 20 (1 sd above the mean) is .68.

A Normal distribution not only has one hump and is symmetrical; it also has a very specific shape. When a distribution is flatter than a true Normal (the dashed curve in the lecture figure), we say it has negative kurtosis; when it is more peaked than a true Normal (the solid curve), we say it has positive kurtosis.

Examples of (approximately) Normal distributions: body temperature or blood pressure of healthy people, and scores on most standardized tests.

The pnorm function tells you the probability to the left of a given value, i.e., the cumulative probability.
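To make these R functions concrete, here is a minimal sketch of the dbinom, pbinom, and pnorm calls described above. It uses the values from the notes (5 successes, 20 trials, and a Normal distribution with mean 18 and sd 2); the per-trial success probability is read here as .1, since the decimal point appears to have been lost in the notes:

    # Binomial distribution: 20 trials, success probability .1 on each trial (assumed).
    dbinom(5, size = 20, prob = 0.1)   # probability of exactly 5 successes
    pbinom(5, size = 20, prob = 0.1)   # cumulative probability: 5 or fewer successes

    # Normal distribution with mean 18 and sd 2.
    # pnorm() gives the probability to the LEFT of a value, so the probability of an
    # outcome between 16 and 20 (1 sd on either side of the mean) is the difference
    # of two cumulative probabilities:
    pnorm(20, mean = 18, sd = 2) - pnorm(16, mean = 18, sd = 2)   # about .68

The same subtraction pattern works for any interval: take the cumulative probability at the upper bound and subtract the cumulative probability at the lower bound.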

