Purdue CS 59000 - Lecture Notes

This preview shows page 1-2-3-4-29-30-31-32-59-60-61-62 out of 62 pages.



CS 59000 Statistical Machine Learning Lecture 5
• Outline
• ML Parameter Estimation for Bernoulli (1)-(2)
• Beta Distribution
• Bayesian Bernoulli
• Prior ∙ Likelihood = Posterior
• Properties of the Posterior
• Prediction under the Posterior
• Multinomial Variables
• ML Parameter Estimation
• The Multinomial Distribution
• The Dirichlet Distribution
• Bayesian Multinomial
• Question
• Prediction under the Posterior
• The Gaussian Distribution
• Central Limit Theorem
• Geometry of the Multivariate Gaussian
• Moments of the Multivariate Gaussian (1)-(2)
• Partitioned Gaussian Distributions
• Partitioned Conditionals and Marginals
• Bayes' Theorem for Gaussian Variables
• Maximum Likelihood for the Gaussian (1)-(3)
• Sequential Estimation
• Bayesian Inference for the Gaussian (1)-(12)
• Student's t-Distribution (1)-(4)
• Mixtures of Gaussians (1)-(4)
• The Exponential Family (1), (2.1), (2.2), (3.1), (3.2), (3.3), (4)
• ML for the Exponential Family (1)-(2)
• Conjugate priors
• Posterior of Gaussian mean parameter

CS 59000 Statistical Machine Learning
Lecture 5
Alan Qi

Outline
• Review of ML and Bayesian estimation of Bernoulli distributions
• ML and Bayesian estimation of multinomial and Gaussian distributions
• t-distributions and mixtures of Gaussians
• Exponential family

ML Parameter Estimation for Bernoulli (1)
Given:

ML Parameter Estimation for Bernoulli (2)
Example:
Prediction: all future tosses will land heads up.
Any concern about this prediction? Overfitting to D.

Beta Distribution
Distribution over the Bernoulli parameter.

Bayesian Bernoulli
The Beta distribution provides the conjugate prior for the Bernoulli distribution.

Prior ∙ Likelihood = Posterior

Properties of the Posterior
As the size of the data set, N, increases

Prediction under the Posterior
What is the probability that the next coin toss will land heads up?
Predictive posterior distribution

Multinomial Variables
1-of-K coding scheme:

ML Parameter Estimation
Given:
To ensure that the parameters sum to one, use a Lagrange multiplier.

The Multinomial Distribution

The Dirichlet Distribution
Conjugate prior for the multinomial distribution.

Bayesian Multinomial

Question
Suppose we toss a coin and observe a head. Based on Bayesian prediction with a uniform prior, what is the probability of seeing another head if we toss the coin again?

Prediction under the Posterior
What is the probability that the next coin toss will land heads up?
Predictive posterior distribution

The Gaussian Distribution

Central Limit Theorem
The distribution of the sum of N i.i.d. random variables becomes increasingly Gaussian as N grows.
Example: N uniform [0,1] random variables.

Geometry of the Multivariate Gaussian

Moments of the Multivariate Gaussian (1)
thanks to anti-symmetry of z

Moments of the Multivariate Gaussian (2)

Partitioned Gaussian Distributions

Partitioned Conditionals and Marginals

Bayes' Theorem for Gaussian Variables
Given
we have
where

Maximum Likelihood for the Gaussian (1)
Given i.i.d. data, the log likelihood function is given by
Sufficient statistics

Maximum Likelihood for the Gaussian (2)
Set the derivative of the log likelihood function to zero, and solve to obtain
Similarly

Maximum Likelihood for the Gaussian (3)
Under the true distribution
Hence define
Is it biased?

Sequential Estimation
Contribution of the Nth data point, x_N
• correction given x_N
• correction weight
• old estimate

Bayesian Inference for the Gaussian (1)
Assume σ² is known. Given i.i.d. data, the likelihood function for μ is given by
This has a Gaussian shape as a function of μ (but it is not a distribution over μ).

Bayesian Inference for the Gaussian (2)
Combined with a Gaussian prior over μ, this gives the posterior
Completing the square over μ, we see that

Bayesian Inference for the Gaussian (3)
… where
Note:

Bayesian Inference for the Gaussian (4)
Example: for N = 0, 1, 2 and 10.
Data points are sampled from a Gaussian of mean 0.8 and variance 0.1.

Bayesian Inference for the Gaussian (5)
Sequential Estimation
The posterior obtained after observing N − 1 data points becomes the prior when we observe the Nth data point.

Bayesian Inference for the Gaussian (6)
Now assume μ is known. The likelihood function for λ = 1/σ² is given by
This has a Gamma shape as a function of λ.

Bayesian Inference for the Gaussian (7)
The Gamma distribution

Bayesian Inference for the Gaussian (8)
Now we combine a Gamma prior with the likelihood function for λ to obtain
which we recognize as
with

Bayesian Inference for the Gaussian (9)
If both μ and λ are unknown, the joint likelihood function is given by
We need a prior with the same functional dependence on μ and λ.

Bayesian Inference for the Gaussian (10)
The Gaussian-gamma distribution
• Quadratic in μ.
• Linear in λ.
• Gamma distribution over λ.
• Independent of μ.

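The known-variance case from Bayesian Inference for the Gaussian (1)-(5) can be illustrated with a short Python sketch. The preview does not preserve the slide formulas, so the update below uses the standard conjugate result (posterior precision equals prior precision plus N/σ², and the posterior mean is the precision-weighted combination of prior mean and data); treat it as an assumption about what the slides show. The function names, the prior settings `mu0 = 0.0` and `sigma0_2 = 0.1`, and the random seed are hypothetical choices made for illustration; only the data-generating mean 0.8 and variance 0.1 come from the slide example.

```python
import random


def posterior_mean_known_variance(data, sigma2, mu0, sigma0_2):
    """Posterior N(mu | mu_N, sigma_N^2) for a Gaussian mean with known variance.

    data      -- observed values
    sigma2    -- known variance of the likelihood
    mu0       -- prior mean (hypothetical choice by the caller)
    sigma0_2  -- prior variance (hypothetical choice by the caller)
    """
    n = len(data)
    # Precisions add: prior precision plus one 1/sigma2 per observation.
    precision = 1.0 / sigma0_2 + n / sigma2
    # Posterior mean is the precision-weighted average of prior mean and data.
    mean = (mu0 / sigma0_2 + sum(data) / sigma2) / precision
    return mean, 1.0 / precision


def sequential_posterior(data, sigma2, mu0, sigma0_2):
    """Process one point at a time: yesterday's posterior is today's prior."""
    mean, var = mu0, sigma0_2
    for x in data:
        mean, var = posterior_mean_known_variance([x], sigma2, mean, var)
    return mean, var


random.seed(0)  # arbitrary seed, only for repeatability
sigma2 = 0.1
data = [random.gauss(0.8, sigma2 ** 0.5) for _ in range(10)]

for n in (0, 1, 2, 10):
    mean, var = posterior_mean_known_variance(data[:n], sigma2, 0.0, 0.1)
    print(f"N={n:2d}  posterior mean={mean:.3f}  posterior variance={var:.4f}")
```

Under these assumptions the batch and sequential routes give the same posterior, which is the point of slide (5): the order and grouping of the observations does not matter.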
Bayesian Inference for the Gaussian (11)
The Gaussian-gamma distribution

Bayesian Inference for the Gaussian (12)
Multivariate conjugate priors
• μ unknown, Λ known: p(μ) Gaussian.
• Λ unknown, μ known: p(Λ) Wishart.
• Λ and μ unknown: p(μ, Λ) Gaussian-Wishart.

Student's t-Distribution (1)
If we integrate out the precision of a Gaussian with a Gamma prior, we obtain
Setting and , we have

Student's t-Distribution (2)

Student's t-Distribution (3)
Robustness to outliers: Gaussian vs t-distribution.

Student's t-Distribution (4)
The D
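
The statement on the Student's t-Distribution (1) slide, that integrating out the precision of a Gaussian under a Gamma prior yields a t-distribution, can be checked numerically. The sketch below is only an illustration: the preview dropped the slide's parameter settings, so it assumes the standard textbook parameterization (degrees of freedom ν = 2a and precision λ = a/b for a Gamma prior with shape a and rate b). All function names, the integration limit, and the step count are hypothetical choices.

```python
import math


def gaussian_pdf(x, mu, precision):
    """Density of N(x | mu, 1/precision)."""
    return math.sqrt(precision / (2.0 * math.pi)) * math.exp(
        -0.5 * precision * (x - mu) ** 2
    )


def gamma_pdf(tau, a, b):
    """Density of Gam(tau | a, b) with shape a and rate b (assumed convention)."""
    return math.exp(
        a * math.log(b) - math.lgamma(a) + (a - 1.0) * math.log(tau) - b * tau
    )


def student_t_pdf(x, mu, lam, nu):
    """Density of St(x | mu, lam, nu) in the precision parameterization."""
    log_norm = (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        + 0.5 * math.log(lam / (math.pi * nu))
    )
    return math.exp(log_norm - (nu + 1.0) / 2.0 * math.log1p(lam * (x - mu) ** 2 / nu))


def marginal_by_quadrature(x, mu, a, b, upper=60.0, steps=50_000):
    """Midpoint-rule approximation of the integral over tau of
    N(x | mu, 1/tau) * Gam(tau | a, b), truncated at `upper`."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += gaussian_pdf(x, mu, tau) * gamma_pdf(tau, a, b)
    return total * h


a, b, mu = 2.0, 1.0, 0.0  # hypothetical prior settings
for x in (-2.0, 0.0, 1.5):
    mixed = marginal_by_quadrature(x, mu, a, b)
    closed = student_t_pdf(x, mu, lam=a / b, nu=2.0 * a)
    print(f"x={x:+.1f}  mixture={mixed:.6f}  student-t={closed:.6f}")
```

If the assumed parameterization matches the slides, the two columns agree to within the quadrature error. The heavier tails relative to a single Gaussian are what make the t-distribution more robust to outliers, as slide (3) notes.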

