UCSD CSE 190 - Non-Parametric Density Estimation Techniques

CSE 190a, Fall 06
Non-Parametric Density Estimation Techniques
Biometrics, CSE 190a, Lecture 7

Announcements
• HW1 assigned.
• Most of this lecture was given on the blackboard; these slides cover the same material as presented in DHS.

All materials in these slides were taken from Pattern Classification (2nd ed.) by R. O. Duda, P. E. Hart, and D. G. Stork, John Wiley & Sons, 2000, with the permission of the authors and the publisher.

Chapter 4 (Part 1): Non-Parametric Classification (Sections 4.1-4.3)
• Introduction
• Density Estimation
• Parzen Windows

Introduction
• All parametric densities are unimodal (they have a single local maximum), whereas many practical problems involve multi-modal densities.
• Nonparametric procedures can be used with arbitrary distributions and without the assumption that the forms of the underlying densities are known.
• There are two types of nonparametric methods:
  • estimating the class-conditional densities p(x | ω_j);
  • bypassing the densities and estimating the a-posteriori probabilities directly.

Density Estimation
• Basic idea: the probability that a vector x will fall in a region R is

      P = ∫_R p(x') dx'                                    (1)
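The probability P = ∫_R p(x') dx' in equation (1) can be checked with a quick simulation. A minimal sketch, assuming a hypothetical one-dimensional case where p is the standard normal density and R is the interval [-1, 1] (neither choice comes from the slides):

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

# Hypothetical 1-D example: p(x) is the standard normal density and
# the region R is the interval [-1, 1].
n = 100_000
samples = rng.standard_normal(n)

# Empirical probability that a sample falls in R: the fraction k/n.
in_region = (samples >= -1.0) & (samples <= 1.0)
P_hat = in_region.mean()

# Closed-form value of P = integral of p(x') over R for the standard
# normal, via the error function.
P_true = erf(1.0 / sqrt(2.0))
print(P_hat, P_true)
```

The empirical fraction of samples landing in R approximates the integral of p over R, which is the smoothed probability P the slides go on to estimate by k/n.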
• P is a smoothed (or averaged) version of the density function p(x). If we have a sample of size n, the probability that exactly k of the n points fall in R follows the binomial law

      P_k = (n choose k) P^k (1 - P)^(n-k)               (2)

  and the expected value of k is

      E(k) = nP                                          (3)

• The maximum-likelihood estimate of P = θ, i.e. the value of θ that maximizes P(k | θ), is

      θ̂ = k/n ≅ P

  Therefore the ratio k/n is a good estimate of the probability P, and hence of the density function p.
• If p(x) is continuous and the region R is so small that p does not vary significantly within it, we can write

      ∫_R p(x') dx' ≅ p(x) V                             (4)

  where x is a point within R and V is the volume enclosed by R.
• Combining equations (1), (3), and (4) yields

      p(x) ≅ (k/n) / V

Density Estimation (cont.)
• Justification of equation (4): we assume that p(x) is continuous and that the region R is so small that p does not vary significantly within R.
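The estimate p(x) ≅ (k/n)/V can be illustrated directly. A minimal sketch, again assuming a standard normal density and a small interval of half-width h around the query point (all numbers here are illustrative choices, not from the slides):

```python
import numpy as np
from math import exp, pi, sqrt

rng = np.random.default_rng(1)

# Hypothetical setup: n standard-normal samples; the region R is a
# small interval of half-width h centered at x, so its "volume" is
# V = 2h.
n = 200_000
samples = rng.standard_normal(n)

x, h = 0.0, 0.05
V = 2.0 * h

# k = number of samples falling in R; the estimate is (k/n)/V.
k = int(np.sum(np.abs(samples - x) <= h))
p_hat = (k / n) / V

# True density at x for the standard normal, 1/sqrt(2*pi) at x = 0.
p_true = exp(-x * x / 2.0) / sqrt(2.0 * pi)
print(p_hat, p_true)
```

With n large and h small, the two printed values should agree closely, since the region is small enough for p to be nearly constant inside it (the assumption behind equation (4)).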
  Since p(x) ≅ p(x') = constant over R, it is not part of the integrand and can be factored out:

      ∫_R p(x') dx' = p(x') ∫_R 1 dx' = p(x') μ(R)

  where μ(R) is:
  • an area (surface) in the Euclidean space R²,
  • a volume in the Euclidean space R³,
  • a hypervolume in the Euclidean space Rⁿ.

  Hence, writing V = μ(R):

      ∫_R p(x') dx' ≅ p(x) · V   and   p(x) ≅ k/(nV)

• Condition for convergence
  The fraction k/(nV) is a space-averaged value of p(x); the true p(x) is obtained only if V approaches zero.

      lim_{V→0, k=0} p(x) = 0   (if n is fixed)

  This is the case where no samples are included in R: an uninteresting case!

      lim_{V→0, k≠0} p(x) = ∞

  In this case, the estimate diverges: also an uninteresting case!
• The volume V needs to approach 0 anyway if we want to use this estimation.
• Practically, V cannot be allowed to become too small, since the number of samples is always limited.
• One will then have to accept a certain amount of variance in the ratio k/n.
• Theoretically, if an unlimited number of samples is available, we can circumvent this difficulty. To estimate the density at x, we form a sequence of regions
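The practical point that V cannot shrink too far on a limited sample shows up numerically. A hedged illustration, assuming a standard normal density, n = 500 samples, and a few arbitrary interval half-widths (none of these numbers are from the slides):

```python
import numpy as np
from math import pi, sqrt

rng = np.random.default_rng(2)

# Illustrative numbers: a limited sample of n = 500 standard-normal
# points, density estimated at x = 0.
n = 500
samples = rng.standard_normal(n)
p_true = 1.0 / sqrt(2.0 * pi)   # true standard-normal density at 0

def estimate(h):
    """Estimate p(0) as (k/n)/V using the interval R = [-h, h], V = 2h."""
    k = int(np.sum(np.abs(samples) <= h))
    return (k / n) / (2.0 * h)

for h in (0.5, 0.1, 0.01, 0.001):
    print(f"h = {h:>6}: estimate = {estimate(h):.4f} (true {p_true:.4f})")
```

For moderate h the estimate sits near the true value; for very small h the region usually contains zero or one sample, so the estimate tends to collapse to 0 or blow up, illustrating the variance one must accept when n is fixed.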
R_1, R_2, … containing x: the first region contains one sample, the second two samples, and so on.

Let V_n be the volume of R_n, k_n the number of samples falling in R_n, and p_n(x) the n-th estimate of p(x):

      p_n(x) = (k_n / n) / V_n                           (7)

Three necessary conditions should apply if we want p_n(x) to converge to p(x). There are two different …
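The sequence p_n(x) = (k_n/n)/V_n can be sketched with a shrinking window. The schedule V_n = V_1/√n used below is one standard Parzen-style choice; the slides' outline mentions Parzen windows, but this particular schedule and the standard-normal test density are assumptions of the sketch:

```python
import numpy as np
from math import pi, sqrt

rng = np.random.default_rng(3)

p_true = 1.0 / sqrt(2.0 * pi)   # standard-normal density at x = 0

def p_n(n, V1=1.0):
    """n-th estimate of p(0): window volume V_n = V1 / sqrt(n) (an
    assumed shrinking schedule), region R_n = [-V_n/2, V_n/2], with
    k_n samples falling inside."""
    Vn = V1 / sqrt(n)
    samples = rng.standard_normal(n)
    kn = int(np.sum(np.abs(samples) <= Vn / 2.0))
    return (kn / n) / Vn

for n in (100, 10_000, 1_000_000):
    print(n, p_n(n))
```

Because V_n shrinks slowly enough that the expected count in R_n still grows with n, the successive estimates settle toward the true density rather than collapsing to 0 or diverging.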


