CMU CS 10701 - NONPARAMETRIC DENSITY ESTIMATION

Chapter 6
NONPARAMETRIC DENSITY ESTIMATION

So far we have been discussing the estimation of parameters. Thus, if we can assume a density function that can be characterized by a set of parameters, we can design a classifier using estimates of those parameters. Unfortunately, we often cannot assume a parametric form for the density function, and in order to apply the likelihood ratio test we must somehow estimate the density functions using an unstructured approach. This type of approach is called nonparametric estimation, while the former is called parametric estimation. Since, in nonparametric approaches, the density function is estimated locally from a small number of neighboring samples, the estimate is far less reliable, with larger bias and variance, than its parametric counterpart.

There are two kinds of nonparametric estimation techniques available: one is called the Parzen density estimate and the other the k-nearest neighbor density estimate. They are fundamentally very similar, but exhibit somewhat different statistical properties. Both are discussed in this chapter.

It is extremely difficult to obtain an accurate density estimate nonparametrically, particularly in high-dimensional spaces. However, our goal here is not to get an accurate estimate. Our goal is, by using these estimates, to design a classifier and evaluate its performance. For this reason, the accuracy of the estimate is not necessarily a crucial issue. Classification and performance evaluation will be discussed in Chapter 7. The intention of this chapter is to make the reader familiar with the fundamental mathematical properties related to nonparametric density estimation, in preparation for the material presented in Chapter 7.

6.1 Parzen Density Estimate

In order to estimate the value of a density function at a point X, we may set up a small local region around X, L(X).
Then the probability coverage (or probability mass) of L(X) may be approximated by p(X)v, where v is the volume of L(X). This probability may be estimated by drawing a large number of samples, N, from p(X), counting the number of samples, k, falling in L(X), and computing k/N. Equating these two probabilities, p(X)v = k(X)/N, we obtain an estimate of the density function as

    p̂(X) = k(X) / (N v) .    (6.1)

Note that, with a fixed v, k is a random variable and depends on X. A fixed v does not imply the same v throughout the entire space, and v could still vary with X. However, v is a preset value and is not a random variable.

Kernel expression: The estimate of (6.1) has another interpretation. Suppose that 3 samples, X₃, X₄, and X₅, are found in L(X) as shown in Fig. 6-1. With v and N given, p̂(X) becomes 3/(Nv). On the other hand, if we set up a uniform kernel function, κ(·), with volume v and height 1/v around each existing sample, the average of the values of these kernel functions at X is also 3/(Nv). That is [1-4],

    p̂(X) = (1/N) Σ_{i=1}^{N} κ(X − X_i) .    (6.2)

As seen in Fig. 6-1, only the kernel functions around the 3 samples, X₃, X₄, and X₅, contribute to the summation of (6.2).

[Fig. 6-1: Parzen kernel density estimate.]

Once (6.2) is adopted, the shape of the kernel function may be selected more freely, under the condition ∫κ(X)dX = 1. For one-dimensional cases, we may seek optimality and select a complex shape. However, in a high-dimensional space, because of its complexity, the practical selection of the kernel function is limited to either a normal or a uniform kernel. In this book, we will use a kernel which includes both normal and uniform kernels as special cases, of the general form

    κ(X) = c₁ |A|^{-1/2} r^{-n} exp{ −(1/2) [ c₂ XᵀA⁻¹X / r² ]^m } ,    (6.3)

where the constants c₁ and c₂, which involve the gamma function Γ(·), are chosen to normalize the kernel, and m is a parameter determining the shape of the kernel. It may be verified that, for any value of m, the covariance matrix of the kernel density (6.3) is r²A.
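The counting form (6.1) and the kernel-average form (6.2) give identical numbers when κ(·) is the uniform kernel of height 1/v. A minimal one-dimensional sketch (the function names, the N(0,1) test density, and the window size are my own illustrative choices, not from the text):

```python
import numpy as np

def parzen_count(x, samples, v):
    """Estimate (6.1): p_hat(x) = k/(N v), counting samples in a window of volume v."""
    k = np.sum(np.abs(samples - x) <= v / 2.0)   # 1-D region L(x) = [x - v/2, x + v/2]
    return k / (len(samples) * v)

def parzen_kernel(x, samples, v):
    """Estimate (6.2): average of uniform kernels of width v and height 1/v."""
    kappa = (np.abs(x - samples) <= v / 2.0) / v  # kernel value at x for each sample
    return kappa.mean()

rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, size=1000)         # N = 1000 draws from N(0, 1)
for x in (-1.0, 0.0, 0.5):
    # both forms count the same samples, so they agree exactly
    assert np.isclose(parzen_count(x, samples, 0.4), parzen_kernel(x, samples, 0.4))
```

At x = 0 the estimate should land near the true density 1/√(2π) ≈ 0.399, up to the sampling noise and window bias discussed below.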
The parameter m determines the rate at which the kernel function drops off. For m = 1, (6.3) reduces to a simple normal kernel. As m becomes large, (6.3) approaches a uniform (hyperelliptical) kernel, always with a smooth roll-off. The matrix A determines the shape of the hyperellipsoid, and r controls the size or volume of the kernel. The remaining coefficients are selected to satisfy the two conditions mentioned previously:

    ∫κ(X)dX = 1  and  Σ_κ = r²A ,

where Σ_κ is the covariance matrix of κ(X).

Convolution expression: Equation (6.2) can be rewritten in convolution form as

    p̂(X) = p_s(X) * κ(X) ,    (6.4)

where p_s is an impulsive density function with impulses at the locations of the N existing samples:

    p_s(X) = (1/N) Σ_{i=1}^{N} δ(X − X_i) .    (6.5)

That is, the estimated density p̂(X) is obtained by feeding p_s(X) through a linear (noncausal) filter whose impulse response is given by κ(X). Therefore, p̂(X) is a smoothed version of p_s(X).

Moments of p̂(X): The first- and second-order moments of (6.4) can be easily computed. First, let us compute the expected value of p_s(X) as

    E{p_s(X)} = (1/N) Σ_{i=1}^{N} ∫ δ(X − Z) p(Z) dZ = (1/N) N p(X) = p(X) .    (6.6)

That is, p_s(X) is an unbiased estimate of p(X). Then, the expected value of p̂(X) of (6.4) may be computed as

    E{p̂(X)} = E{p_s(X)} * κ(X) = p(X) * κ(X) .    (6.7)

Also,

    E{p̂²(X)} = (1/N) p(X) * κ²(X) + (1 − 1/N) [p(X) * κ(X)]² .    (6.8)

Therefore, the variance of p̂(X) is

    Var{p̂(X)} = (1/N) [ p(X) * κ²(X) − (p(X) * κ(X))² ] .    (6.9)

Approximations of moments: In order to approximate the moments of p̂(X), let us expand p(Y) around X by a Taylor series up to the second-order terms as

    p(Y) ≅ p(X) + ∇pᵀ(X)(Y − X) + (1/2) tr{ ∇²p(X) (Y − X)(Y − X)ᵀ } .    (6.10)

Then, p(X)*κ(X) may be approximated by

    p(X) * κ(X) = ∫ p(Y) κ(Y − X) dY
               ≅ p(X) ∫ κ(Y − X) dY + (1/2) tr{ ∇²p(X) ∫ (Y − X)(Y − X)ᵀ κ(Y − X) dY } ,    (6.11)

where the first-order term disappears because κ(·) is a symmetric function. Since ∫κ(Y − X)dY = 1 and ∫(Y − X)(Y − X)ᵀκ(Y − X)dY = r²A for the κ(·) of (6.3), (6.11) can be expressed as

    p(X) * κ(X) ≅ p(X) [ 1 + (r²/2) α(X) ] ,    (6.12)

where

    α(X) = tr{ A ∇²p(X) } / p(X) .    (6.13)

Similarly,

    p(X) * κ²(X) ≅ p(X) ∫ κ²(Y − X) dY + (1/2) tr{ ∇²p(X) ∫ (Y − X)(Y − X)ᵀ κ²(Y − X) dY } .    (6.14)

Although κ(·) is a density function, κ²(·) is not. Therefore, ∫κ²(Y)dY has a value not equal to 1.
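The moment formulas (6.7) and (6.9) can be checked numerically: repeatedly draw a data set, form the Parzen estimate at a fixed point, and compare the empirical mean and variance with the convolutions. A sketch using a one-dimensional normal kernel (the m = 1 case of (6.3)); the densities, sample sizes, and grid are illustrative choices of mine:

```python
import numpy as np

rng = np.random.default_rng(1)
N, r, x0 = 200, 0.3, 0.5          # sample size, kernel width, evaluation point

def p(x):                          # true density: standard normal
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def kappa(u):                      # normal kernel with standard deviation r
    return np.exp(-u**2 / (2 * r**2)) / (r * np.sqrt(2 * np.pi))

# Monte Carlo: many independent data sets, each giving one value of p_hat(x0)
trials = np.array([kappa(x0 - rng.normal(size=N)).mean() for _ in range(4000)])

# Theory: E = (p * kappa)(x0), Var = (1/N)[(p * kappa^2)(x0) - E^2], per (6.7), (6.9)
grid = np.linspace(-8, 8, 4001)
dz = grid[1] - grid[0]
conv1 = np.sum(p(grid) * kappa(x0 - grid)) * dz        # p * kappa at x0
conv2 = np.sum(p(grid) * kappa(x0 - grid) ** 2) * dz   # p * kappa^2 at x0
var_theory = (conv2 - conv1**2) / N
```

The empirical mean of `trials` should match `conv1` closely, and their empirical variance should match `var_theory` up to Monte Carlo fluctuation.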
Let

    w = ∫ κ²(Y) dY .    (6.15)

Then κ²(·)/w becomes a density function. Therefore, (6.14) becomes

    p(X) * κ²(X) ≅ w p(X) [ 1 + (r²/2) β(X) ] ,    (6.16)

where

    β(X) = tr{ B ∇²p(X) } / p(X) ,    (6.17)

and r²B is the covariance matrix of κ²(X)/w. Substituting (6.12) and (6.16) into (6.7) and (6.9), the moments of p̂(X) are

    E{p̂(X)} ≅ p(X) [ 1 + (r²/2) α(X) ] ,

    Var{p̂(X)} ≅ (1/N) { w p(X) [ 1 + (r²/2) β(X) ] − p²(X) [ 1 + (r²/2) α(X) ]² } .
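The bias approximation (6.12)-(6.13) is easy to verify in one dimension, where the trace reduces to α(x) = p''(x)/p(x) (taking A = 1). For p = N(0, 1) and a normal kernel of standard deviation r, the convolution p * κ is exactly N(0, 1 + r²), so the approximation can be compared against a closed form. The choice of test point and kernel width below is mine:

```python
import numpy as np

def p(x):                          # true density: standard normal
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

r, x = 0.2, 0.5
# For the standard normal, p''(x) = (x^2 - 1) p(x), so alpha(x) = x^2 - 1
alpha = x**2 - 1
approx = p(x) * (1 + (r**2 / 2) * alpha)               # second-order approximation (6.12)
# Exact convolution of N(0,1) with a normal kernel of std r: density of N(0, 1 + r^2)
exact = np.exp(-x**2 / (2 * (1 + r**2))) / np.sqrt(2 * np.pi * (1 + r**2))
```

For this small r the two values agree to about three decimal places, and the gap grows as O(r⁴), consistent with truncating the Taylor expansion (6.10) at second order.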

