Whitman MATH 350 - Radial Basis Functions

Chapter 11
Radial Basis Functions

In this chapter, we begin the modeling of nonlinear relationships in data. In particular, we build up the machinery that will model arbitrary continuous maps from our domain set in IR^n to the desired target set in IR^m. The fact that our theory can handle any relationship is both satisfying (we are approximating something that theoretically exists) and unsatisfying (we might end up modeling the data and the noise).

We've brought up these conflicting goals before: we can build more and more accurate models by increasing the complexity and the number of parameters used, but we do so at the expense of generalization. We will see this tension in the sections below, but before going further into the nonlinear functions, we first want to look at the general process of modeling data with functions.

11.1 The Process of Function Approximation

There are infinitely many functions that can model a finite set of (domain, range) pairings. With that in mind, it will come as no surprise that we will need to make some decisions about what kind of data we're working with, and what kinds of functions make sense to us. Although many of the algorithms in this chapter can be thought of as "black boxes", we still need to be sure that we assume only as much as necessary, and that the outcome makes sense.

To begin, consider the data. There are two distinct problems in modeling functions from data: interpolation and regression. Interpolation is the requirement that the function we're building, f, models each data point exactly:

    f(x_i) = y_i

In regression, we minimize the error between the desired values, y_i, and the function output, f(x_i). If we use p data points, then we minimize an error function, for example the sum of squared error:

    \min_{\alpha} \sum_{i=1}^{p} \| f(x_i) - y_i \|^2

where the function f depends on the parameters in α.

You may be asking yourself why more accuracy is not always better. The answer is: it depends. For some problems, the data that you are given is extremely accurate; in those cases, accuracy may be of primary importance. But most often the data has noise, and in that case too much accuracy means that you're modeling the noise.

In regression problems, we will always break our data into two groups: a training set and a testing set. It is up to personal preference, but people tend to use about 20% of the data to build the model (get the model parameters), and the remaining 80% of the data to test how well the model performs (or generalizes). Occasionally a third set of data is required: a validation set that is also used to choose the model parameters (for example, to prune or add nodes to the network), but we won't use this set here. By using distinct data sets, we hope to get as much accuracy as we can while still maintaining a good error on unseen data. We will see how this works later.
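As a concrete illustration of the split just described, here is a minimal MATLAB sketch. The 20%/80% proportions come from the paragraph above; the synthetic data, the variable names, and the use of a random permutation to do the splitting are assumptions made only for this example, not something the text prescribes.

% Split a data set into a training set (about 20%) and a testing set (about 80%).
% Synthetic data, only so the sketch runs on its own:
X = linspace(0, 1, 100)';               % domain points, one per row
Y = sin(2*pi*X) + 0.1*randn(100, 1);    % noisy target values

p      = size(X, 1);
idx    = randperm(p);                   % random reordering of 1, ..., p
ntrain = round(0.2 * p);                % about 20% used to build the model

Xtrain = X(idx(1:ntrain), :);       Ytrain = Y(idx(1:ntrain), :);
Xtest  = X(idx(ntrain+1:end), :);   Ytest  = Y(idx(ntrain+1:end), :);

% Fit the model parameters using only (Xtrain, Ytrain); then measure the
% error on (Xtest, Ytest) to see how well the model generalizes.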
11.2 Using Polynomials to Build Functions

We may use a fixed set of basis vectors to build up a function. For example, when we construct a Taylor series for a function, we are using polynomial approximations to get better and better approximations, and if the error between the series and the function goes to zero as the degree increases, we say that the function is analytic.

If we have two data points, we can construct a line that interpolates those points. Given three points, we can construct an interpolating parabola. In fact, given p + 1 points, we can construct a degree p polynomial that interpolates that data.

We begin with a model function, a polynomial of degree p:

    f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_p x^p

and using the data, we construct a set of p + 1 equations (there are p + 1 coefficients in a degree p polynomial):

    y_1     = a_p x_1^p     + a_{p-1} x_1^{p-1}     + \dots + a_1 x_1     + a_0
    y_2     = a_p x_2^p     + a_{p-1} x_2^{p-1}     + \dots + a_1 x_2     + a_0
    y_3     = a_p x_3^p     + a_{p-1} x_3^{p-1}     + \dots + a_1 x_3     + a_0
    \vdots
    y_{p+1} = a_p x_{p+1}^p + a_{p-1} x_{p+1}^{p-1} + \dots + a_1 x_{p+1} + a_0

And writing this as a matrix-vector equation:

    \begin{bmatrix}
    x_1^p     & x_1^{p-1}     & x_1^{p-2}     & \dots & x_1     & 1 \\
    x_2^p     & x_2^{p-1}     & x_2^{p-2}     & \dots & x_2     & 1 \\
    \vdots    & \vdots        & \vdots        &       & \vdots  & \vdots \\
    x_{p+1}^p & x_{p+1}^{p-1} & x_{p+1}^{p-2} & \dots & x_{p+1} & 1
    \end{bmatrix}
    \begin{bmatrix} a_p \\ a_{p-1} \\ \vdots \\ a_1 \\ a_0 \end{bmatrix}
    =
    \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_p \\ y_{p+1} \end{bmatrix}
    \quad \Rightarrow \quad V a = y

where V is called the Vandermonde matrix associated with the data points x_1, ..., x_{p+1}. For future reference, note that in Matlab there is a built-in function, vander, but we could build V by taking the domain data as a single column vector x, then forming the p + 1 columns as powers of this vector:

    V=[x.^p x.^(p-1) ... x.^2 x x.^0]

The matrix V is square, so it may be invertible. In fact,

Theorem: If the data points x_1, ..., x_{p+1} are all distinct, then the Vandermonde matrix is invertible.[1]

[1] The proofs of two special cases are in the exercises. The general proof we leave for you to consider, but it is usually done by induction in a way similar to the 3 × 3 case.

This theorem tells us that we can solve the interpolation problem by simply inverting the Vandermonde matrix:

    a = V^{-1} y

Using polynomials can be bad for interpolation, as we explore more deeply in a course in Numerical Analysis. For now, consider interpolating a fairly nice and smooth function,

    f(x) = \frac{1}{1 + x^2}

over eleven evenly spaced points between (and including) -5 and 5. Construct the degree 10 polynomial, and the result is shown in Figure 11.1. The graph shows the actual function (black curve), the interpolating points (black asterisks), and the degree 10 polynomial between interpolating points (dashed curve). As we can see, there is some really bad generalization, and these wild oscillations are typical of polynomials of large degree.

[Figure 11.1: An example of the wild oscillations one can get in interpolating data with a high degree polynomial. In this case, we are interpolating 11 data points (asterisks) with a degree 10 polynomial (dotted curve) from the function f(x) = 1/(1 + x^2).]

To partially solve this bad generalization, we will turn to the regression problem. In this case, we will try to fit a polynomial of much smaller degree k to the p data points, with k << p. We can do this quite simply by removing the columns of the Vandermonde matrix that we do not need; or equivalently, in the exercises we will write a function, vander1.m, that will build the appropriate p × (k + 1) matrix, and then we have the matrix-vector equation

    V a = y
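The text leaves vander1.m to the exercises, so the following is only one possible version of it: a minimal sketch assuming the same column convention used above (decreasing powers of x, ending with the constant column). The function name and argument order are assumptions rather than a prescribed interface.

function V = vander1(x, k)
% VANDER1  Build the p x (k+1) matrix whose columns are decreasing powers
% of the column vector x: [x.^k  x.^(k-1)  ...  x  x.^0].
% (One possible version of the vander1.m described in the text.)
x = x(:);                 % force a column vector
p = length(x);
V = ones(p, k+1);         % the last column is x.^0
for j = 1:k
    V(:, k+1-j) = x.^j;   % column k+1-j holds x.^j
end
end

With k = length(x) - 1 this rebuilds the square Vandermonde matrix used for interpolation; with k much smaller than the number of data points it gives the tall matrix used for regression.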

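Putting the pieces together, here is a sketch of the experiment behind Figure 11.1, followed by a low-degree least squares fit of the same data. The choice of degree 3 for the regression fit, and the use of MATLAB's backslash operator for both the square and the overdetermined systems, are illustrative choices, not something the text prescribes; the vander1 sketch above is assumed.

% Interpolation versus regression on f(x) = 1/(1+x^2).
xdata = linspace(-5, 5, 11)';      % 11 evenly spaced points in [-5, 5]
ydata = 1./(1 + xdata.^2);

% Degree 10 interpolation: square Vandermonde system V*a = y.
Vfull = Vandermatrix(xdata);       %#ok<NASGU>  (see next line; helper below is used instead)
Vfull = vander1(xdata, 10);
afull = Vfull \ ydata;             % backslash is preferable to inv(Vfull)*ydata

% Degree 3 regression: tall 11-by-4 matrix, solved in the least squares sense.
Vreg = vander1(xdata, 3);
areg = Vreg \ ydata;               % least squares solution of V*a = y

% Evaluate on a fine grid to see the behavior between the data points.
xx = linspace(-5, 5, 400)';
plot(xx, 1./(1 + xx.^2), 'k-', xdata, ydata, 'k*', ...
     xx, polyval(afull, xx), 'b--', xx, polyval(areg, xx), 'r-.');
legend('f(x)', 'data', 'degree 10 interpolant', 'degree 3 least squares fit');

The degree 10 interpolant passes through every data point but oscillates between them; the degree 3 fit misses some of the points, trading exactness at the data for smoother behavior between them.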
