Duke STA 101 - Bayesian Slides

Additional Slides on Bayesian Statistics for STA 101
Can we use this method to learn about means and percentages?
Combining the prior beliefs and the data using Bayes Rule
Estimation of unknown parameters in statistical models (Bayesian and non-Bayesian)
Estimating percentage of Dukies who plan to get advanced degree
Estimating the average IQ of Duke professors
Maximum likelihood estimation: A principled approach to estimation
Maximum likelihood estimation
Slide 9
Advanced degree example
MLE for degree example
Maximum likelihood
Finding the MLE for degree example
Slide 14
Model for Professors’ IQs
Model for all 25 IQs
Slide 17
Likelihood function and maximum likelihood estimates
The Bayesian approach to estimation of means
Slide 20
Formalizing a model for prior information
Mathematical equation for normal curve
Model for the data (25 IQs)
Slide 24
Slide 25
Posterior distribution
Slide 27
Using the posterior distribution to summarize beliefs about µ
Bayesian statistics in general
Differences between frequentist and Bayesian

Additional Slides on Bayesian Statistics for STA 101
Prof. Jerry Reiter
Fall 2008

Can we use this method to learn about means and percentages?
• To learn about population averages and percentages, we’ve used data (like the DNA test results), but not prior information (like the list of suspects).
• We show how to combine data and prior information in class.

Combining the prior beliefs and the data using Bayes Rule
• In the Bayes rule problem before break, we combined the prior beliefs and the data using Bayes rule.
• Pr(p | X = 1) represents our posterior beliefs about p.

$$\Pr(p \mid X = 1) = \frac{\Pr(X = 1 \mid p)\,\Pr(p)}{\Pr(X = 1)}$$

Estimation of unknown parameters in statistical models (Bayesian and non-Bayesian)
• Suppose we posit a probability distribution to model data. How do we estimate its unknown parameters?
• Example: assume the data follow a regression model.
Where do the estimates of the regression coefficients come from?
• Classical statistics: maximum likelihood estimation.
• Bayesian statistics: Bayes rule.

Estimating percentage of Dukies who plan to get advanced degree
• Suppose we want to estimate the percentage of Duke students who plan to get an advanced degree (MBA, JD, MD, PhD, etc.). Call this percentage p.
• We sample 20 people at random, and 8 of them say they plan to get an advanced degree.
• What should be our estimate of p?

Estimating the average IQ of Duke professors
• Let µ be the population average IQ of Duke profs.
• Suppose we randomly sample 25 Duke profs and record their IQs.
• What should be our estimate of µ?

[Figure: Distributions, Prof IQs (hypothetical data), with normal quantile plot; IQ axis from 100 to 170]

Moments
Mean              132.16
Std Dev           11.710679
Std Err Mean      2.3421358
upper 95% Mean    136.99393
lower 95% Mean    127.32607
N                 25

Maximum likelihood estimation: A principled approach to estimation
• Usually we can use subject-matter knowledge to specify a distribution for the data. But we don’t know the parameters of that distribution.
1) Number out of 20 who want advanced degree: binomial distribution.
2) Profs’ IQs: normal distribution.

Maximum likelihood estimation
• We need to estimate the parameters of the distribution. Why do we care?
A) So we can make probability statements about future events.
B) The parameters themselves may be important.

Maximum likelihood estimation
• The maximum likelihood estimate of the unknown parameter is the value for which the data were most likely to have occurred.
• Let’s see how this works in the examples.

Advanced degree example
• Let Y be the random variable for the number of people out of 20 that plan to get an advanced degree.
• Y has a binomial distribution with n = 20 and unknown probability p.
• In the data, Y = 8. If we knew p, the value of the probability distribution function at Y = 8 would be:

$$\Pr(Y = 8) = \frac{20!}{(8!)(12!)}\, p^{8} (1 - p)^{20 - 8}$$

MLE for degree example
• Let’s graph Pr(Y = 8) as a function of the unknown p.
• Label the function L(p). L(p) is called the likelihood function for p.

Maximum likelihood
• The maximum likelihood estimate of p is the value of p that maximizes L(p).
• This is a reasonable estimate because it is the value of p for which the observed data (y = 8) had the greatest chance of occurring.

Finding the MLE for degree example
• To maximize the likelihood function, we take the derivative of L(p) with respect to p, set it equal to zero, and solve for p. You get the sample percentage, 8/20 = 0.40!

$$L(p) = \frac{20!}{(8!)(12!)}\, p^{8} (1 - p)^{20 - 8}$$

Estimating the average IQ of Duke professors
• Let µ be the population average IQ of Duke profs.
• Suppose we randomly sample 25 Duke profs and record their IQs.
• What should be our estimate of µ?

[Figure: Distributions, Prof IQs (hypothetical data), with normal quantile plot; IQ axis from 100 to 170]

Moments
Mean              132.16
Std Dev           11.710679
Std Err Mean      2.3421358
upper 95% Mean    136.99393
lower 95% Mean    127.32607
N                 25

Model for Professors’ IQs
• The mathematical function for a normal curve for any prof’s IQ, which we label Y, is:

$$f(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(y - \mu)^2 / (2\sigma^2)}$$

• All normal curves have this form, with different means and SDs. Here, we’ll assume that σ = 15. We don’t know µ, which is what we’re after.

Model for all 25 IQs
• We need the function for all 25 IQs.
• Assuming each prof’s IQ is independent of other profs’ IQs, we have

$$f(y_1, y_2, \ldots, y_{25}) = f(y_1)\, f(y_2) \cdots f(y_{25}) = \prod_{i=1}^{25} \frac{1}{\sqrt{2\pi}\,15}\, e^{-(y_i - \mu)^2 / (2(15^2))}$$

Model for all 25 IQs
• With some algebra and simplifications, the likelihood function is:

$$L(\mu) = \left(\frac{1}{\sqrt{2\pi}\,15}\right)^{25} e^{-\sum_{i=1}^{25} (y_i - \mu)^2 / (2(15^2))}$$

Likelihood function and maximum likelihood estimates
• A graph of the likelihood function looks something like this:

[Figure: graph of the likelihood function L(µ)]

• The function is maximized when µ is the sample average.
So, we use 132.16 as our estimate of the average Duke prof’s IQ.
• This sample average is the MLE for µ in any normal curve.

The Bayesian approach to estimation of means
• Let’s show how to combine data and prior information to address the following motivating question: What is a likely range for the average IQ of Duke professors?

Combining the prior beliefs and the data using Bayes Rule
• We combine our prior beliefs and the data using Bayes rule.
• f(µ | data) represents our posterior beliefs about µ.

$$f(\mu \mid \text{data}) = \frac{f(\text{data} \mid \mu)\, f(\mu)}{f(\text{data})}$$

Formalizing a model for prior information
• Let’s assign a distribution for µ that reflects our a priori beliefs about its
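
The two maximum likelihood results stated in the slides (the sample percentage 8/20 for p, and the sample average 132.16 for µ) can be checked numerically. The sketch below is only an illustration, not part of the original slides: the function names and the grid search are assumptions made for this example. Because the 25 raw IQs are not listed, it rebuilds the normal likelihood from the reported mean, standard deviation, and N, using the identity Σ(y_i − µ)² = (n − 1)s² + n(ȳ − µ)². It maximizes the log of L(µ), which peaks at the same µ as L(µ) itself.

```python
import math


def binomial_likelihood(p, n=20, y=8):
    """L(p) = n! / (y! (n - y)!) * p^y * (1 - p)^(n - y), the slides' likelihood for p."""
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)


def normal_log_likelihood(mu, ybar=132.16, s=11.710679, n=25, sigma=15.0):
    """log L(mu) for n independent normal observations with known sigma.

    Assumption: the raw data are unavailable, so the sum of squares is
    rebuilt from summary statistics via
        sum (y_i - mu)^2 = (n - 1) * s^2 + n * (ybar - mu)^2.
    """
    sum_sq = (n - 1) * s**2 + n * (ybar - mu) ** 2
    return -n * math.log(math.sqrt(2 * math.pi) * sigma) - sum_sq / (2 * sigma**2)


def argmax_on_grid(func, lo, hi, steps):
    """Return the grid point in [lo, hi] at which func is largest."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=func)


# Grid search over p in [0, 1] (step 0.001) and mu in [100, 170] (step 0.01).
p_hat = argmax_on_grid(binomial_likelihood, 0.0, 1.0, 1000)
mu_hat = argmax_on_grid(normal_log_likelihood, 100.0, 170.0, 7000)

print("MLE of p  (grid search):", p_hat)
print("MLE of mu (grid search):", mu_hat)
```

The grid search should land on the sample percentage for p and, up to the grid resolution, on the sample average for µ, which agrees with the calculus argument in the slides. A grid is used here only because it mirrors "graph the likelihood and find the peak"; a numerical optimizer would serve equally well.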

