Random variables
9.07
2/19/2004

A few notes on the homework
• If you work together, tell us who you’re working with.
  – You should still be generating your own homework solutions. Don’t just copy from your partner. We want to see your own words.
• Turn in your MATLAB code (this helps us give you partial credit).
• Label your graphs:
  – xlabel('text')
  – ylabel('text')
  – title('text')

More homework notes
• Population vs. sample
  – The population to which the researcher wants to generalize can be considerably broader than the narrow sample might imply.
    • High school students who take the SAT
    • High school students
    • Anyone who wants to succeed
    • Anyone

More homework notes
• MATLAB:
  – If you can’t figure something out in MATLAB, find or email a TA, or track down one of the zillions of fine web tutorials.
  – Some specifics…

MATLAB
• Hint: MATLAB works best if you can think of your problem as an operation on a matrix. Do this instead of “for” loops when possible.
  – E.g., the coin-flip example without for loops:
      x = rand(5,10000);        % each column is one run of 5 flips
      coinflip = x>0.5;         % 1 = heads, 0 = tails
      numheads = sum(coinflip); % num H in 5 flips

MATLAB
• randn(N) -> NxN matrix!
• randn(1,N) -> 1xN matrix
• sum(x) vs. sum(x,2): for a matrix x, sum(x) adds down each column; sum(x,2) adds across each row.
• hist(data, 1:10) vs. hist(data, 10): the first uses bins centered at 1, 2, …, 10; the second uses 10 equally spaced bins.
• plot(hist(data)) vs. [n,x]=hist(data); plot(x,n): the first plots the counts against bin number; the second plots them against the bin centers.

A few more comments
• Expected value can tell you whether or not you want to play a game even once.
  – It tells you if the “game” is in your favor.
• In our example of testing positive for a disease, $P(D)$ is the prior probability that you have the disease: what was the probability of you having the disease before you got tested? If you are from a risky population, $P(D)$ may be higher than 0.001.
  Before you took the test you had a higher probability of having the disease, so after you test positive, your probability of having the disease, $P(D \mid +)$, will be higher than 1/20.

Random Variables
• Variables that take numerical values associated with events in an experiment
  – Either discrete or continuous
    • For a continuous r.v., use an integral (not a sum) in the equations below.
  – The mean, $\mu$, of a random variable is the sum of each possible value multiplied by its probability:
      $\mu = \sum_i x_i P(x_i) \equiv E(x)$
    • Note the relation to “expected value” from last time.
  – The variance is the sum of the squared deviations from the mean, each multiplied by the probability of its value:
      $\sigma^2 = \sum_i (x_i - \mu)^2 P(x_i) \equiv E((x - \mu)^2)$

We’ve already talked about a few special cases
• Normal r.v.’s (with normal distributions)
• Uniform r.v.’s (with distributions like this:)
  [Figure: uniform distribution, p plotted against x]
• Etc.

Random variables
• Can be made out of functions of other random variables.
• If X and Y are r.v.’s, then $Z = X + Y$ is a r.v., and so is $Z = \sqrt{X} + 5Y + 2$.

Linear combinations of random variables
• We talked about this in lecture 2. Here’s a review, with new E() notation.
• Assume:
  – $E(x) = \mu$
  – $E((x - \mu)^2) = E(x^2 - 2\mu x + \mu^2) = \sigma^2$
• $E(x + 5) = E(x) + E(5) = E(x) + 5 = \mu + 5 = \mu'$
• $E((x + 5 - \mu')^2) = E(x^2 + 2(5 - \mu')x + (5 - \mu')^2) = E(x^2 - 2\mu x + \mu^2) = \sigma^2 = (\sigma')^2$
Adding a constant to x adds that constant to $\mu$, but leaves $\sigma$ unchanged.

Linear combinations of random variables
• $E(2x) = 2E(x) = 2\mu = \mu'$
• $E((2x - \mu')^2) = E(4x^2 - 8x\mu + 4\mu^2) = 4\sigma^2 = (\sigma')^2$, so $\sigma' = 2\sigma$
Scaling x by a constant scales both $\mu$ and $\sigma$ by that constant. But…

Multiplying by a negative constant
• $E(-2x) = -2E(x) = -2\mu = \mu'$
• $E((-2x - \mu')^2) = E(4x^2 + 2(-2x)(2\mu) + (2\mu)^2) = E(4x^2 - 8x\mu + 4\mu^2) = 4\sigma^2 = (\sigma')^2$, so $\sigma' = 2\sigma$
Scaling by a negative number multiplies the mean by that number, but multiplies the standard deviation by -(that number), i.e. by its absolute value. (Standard deviation is never negative.)

What happens to z-scores when you apply a transformation?
• Changes in scale or shift do not change “standard units,” i.e. z-scores.
  – When you transform to z-scores, you’re already subtracting off any mean and dividing by any standard deviation. If you change the mean or standard deviation by a shift or scaling, the new mean (std. dev.) just gets subtracted off (divided out).

Special case: Normal random variables
• Can use z-tables to figure out the area under part of a normal curve.

An example of using the table
[Figure: normal curve with z = -0.75 and z = 0.75 marked, asking “What % here and here?” of the two tails]
• $P(-0.75 < z < 0.75) = 0.5467$
• $P(z < -0.75 \text{ or } z > 0.75) = 1 - 0.5467 \approx 0.45$
• That’s our answer.

    z      Height   Area
    …      …        …
    0.70   31.23    51.61
    0.75   30.11    54.67
    0.80   28.97    57.63
    …      …        …

  (Height and Area are in percent; Area is the area between -z and +z.)

Another way to use the z-tables
• Mean SAT score = 500, std. deviation = 100
• Assuming that the distribution of scores is normal, what is the score such that 95% of the scores are below that value?
[Figure: normal curve with 95% of the area below z = ? and 5% above]

Using z-tables to find the 95th percentile point
[Figure: normal curve with 90% of the area in the middle and 5% in each tail]
• From the tables:

    z      Height   Area
    1.65   10.23    90.11

• z = 1.65 -> x = ? Mean = 500, s.d. = 100
• $1.65 = (x - 500)/100$; $x = 165 + 500 = 665$

Normal distributions
• A lot of data is normally distributed because of the central limit theorem from last time.
  – Data that are influenced by (i.e. the “sum” of) many small and unrelated random effects tend to be approximately normally distributed.
  – E.g. weight (I’m making up these numbers)
    • Overall average = 120 lbs for adult women
    • Women add about 1 lb/year after age 29
    • Illness subtracts an average of 5 lbs
    • Genetics can make you heavier or thinner
    • A given “sample” of weight is influenced by being an adult woman, age, health, genetics, …

Non-normal distributions
• For data that is approximately normally distributed, we can use the normal approximation to get useful information about the percentage of the area under some part of the distribution.
• For non-normal data, what do we do?

Non-normal distributions
• E.g. income distributions tend to be very skewed
• Can use percentiles, much like in the last z-table example (except without the tables)
  – What’s the 10th percentile point?
    The 25th percentile point?

Percentiles & interquartile range
• Divide data into 4 groups, see how far apart the extreme groups are.
[Figure labels: Q1 = 25th percentile; median = 50th percentile; Q3 = 75th percentile]
• Q3 - Q1 = IQR = 75th percentile - 25th percentile

What do you do for other percentiles?
• Median = point such that 50% of the data lies below that point
• Similarly, 10th percentile = point such that 10% of the data lies below that point.

What do you do for other percentiles?
• If you have a theory for the
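
The percentile and IQR definitions above can be tried out on a small skewed data set. What follows is only a sketch, written in Python rather than the course’s MATLAB. The function name `percentile` and the income figures are made up for illustration, and the nearest-rank rule used here is an assumption: it is one of several common conventions for picking a percentile point, so other tools may give slightly different values.

```python
import math


def percentile(data, p):
    """Nearest-rank percentile: the smallest data value such that
    at least p% of the data is at or below it (0 < p <= 100)."""
    if not data:
        raise ValueError("data must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(data)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank in sorted data
    return ordered[rank - 1]


# Hypothetical, made-up incomes (in thousands), skewed to the right.
incomes = [12, 15, 18, 20, 22, 25, 27, 30, 34, 38, 45, 60, 85, 120, 400]

q1 = percentile(incomes, 25)      # 25th percentile
median = percentile(incomes, 50)  # 50th percentile
q3 = percentile(incomes, 75)      # 75th percentile
iqr = q3 - q1                     # interquartile range

print("Q1 =", q1, "median =", median, "Q3 =", q3, "IQR =", iqr)
print("10th percentile =", percentile(incomes, 10))
```

Because the rule only sorts and counts, it makes no assumption that the data are normal, which is why percentiles remain usable for skewed data such as incomes.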


MIT 9 07 - Random variables
