
Machine Learning                                                  Srihari

Linear Models for Regression
Sargur Srihari
[email protected]

Linear Regression with One Input
• Simplest form of linear regression:
  – a linear function of a single input variable
  – y(x,w) = w0 + w1x
• A more useful class of functions: polynomial curve fitting
  – y(x,w) = w0 + w1x + w2x^2 + … = Σj wj x^j
  – More generally, a linear combination of nonlinear functions φj(x) of the input variable, called basis functions, replaces the powers x^j
• Such models are linear functions of the parameters (which gives them simple analytical properties), yet nonlinear with respect to the input variables
• The task is to learn the weights w0, w1, … from data D = {(xi, yi)}, i = 1,…,N

Plan of Discussion
• Discuss supervised learning, starting with regression
• Goal: predict the value of one or more target variables t, given a d-dimensional vector x of input variables
• Terminology
  – Regression: t is continuous-valued
  – Classification: t takes values from a set of labels (unordered categories)
  – Ordinal regression: t takes discrete values from ordered categories

Regression with Multiple Inputs
• Polynomial regression uses a single scalar input variable x
• Generalizations
  – Predict the value of a continuous target variable t given the values of d input variables x = [x1,…,xd]
  – t can also be a set of variables (multiple regression)
  – The models are linear functions of adjustable parameters; specifically, linear combinations of nonlinear functions of the input variables
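The straight-line model y(x,w) = w0 + w1x can be fitted by least squares in closed form: w1 is the covariance of x and y divided by the variance of x, and w0 follows from the means. A minimal sketch on made-up data (the values are hypothetical, not from the lecture):

```python
# Fit y(x, w) = w0 + w1*x by ordinary least squares (closed form):
#   w1 = cov(x, y) / var(x),   w0 = mean(y) - w1 * mean(x)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # toy data, exactly y = 1 + 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
var_x = sum((x - mean_x) ** 2 for x in xs)

w1 = cov_xy / var_x                # slope
w0 = mean_y - w1 * mean_x          # intercept
print(w0, w1)                      # recovers w0 = 1.0, w1 = 2.0
```

Because the toy data lie exactly on a line, the fit recovers the generating weights; with noisy data the same formulas give the least-squares estimates.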
Simplest Linear Model with d Inputs
• Regression with d input variables:
  y(x,w) = w0 + w1x1 + … + wd xd = wᵀx
  where x = (x1,…,xd)ᵀ are the input variables (a dummy input x0 = 1 absorbs w0 into w)
• Called linear regression since it is a linear function of
  – the parameters w0,…,wd
  – the input variables x1,…,xd
• Linearity in the input variables is a significant limitation
  – In the one-dimensional case this amounts to a straight-line fit (a degree-one polynomial): y(x,w) = w0 + w1x
  – This differs from both linear regression with one variable and polynomial regression with one variable

Fitting a Regression Plane
• Assume t is a function of the inputs x1, x2,…,xd (the independent variables). The goal is to find the best linear regressor of t on all the inputs.
• For d = 2 this means fitting a plane through N input samples; in d dimensions, a hyperplane.
[Figure: a plane fitted to sample points in (x1, x2, t) space]

The Learning to Rank (LeToR) Problem
• Multiple inputs
• Target value
  – t is discrete (e.g., 1, 2,…,6) in the training set, but a continuous value in [1,6] is learnt and used to rank objects

Regression with Multiple Inputs: LeToR
• Input (xi): d features of a query–URL pair, e.g.:
  – log frequency of the query in anchor text
  – query word in color on the page
  – number of images on the page
  – number of (out) links on the page
  – PageRank of the page
  – URL length
  – URL contains "~"
  – page length
• Output (y): relevance value
  – point-wise target (0, 1, 2, 3)
  – regression returns a continuous value, which allows fine-grained ranking of URLs
• The LETOR 4.0 dataset has 46 query-document features and a maximum of 124 URLs per query; other datasets have d > 200, and the Yahoo! dataset has d = 700
• Traditional IR uses TF/IDF
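Fitting the regression plane described above amounts to least squares on a design matrix whose first column is all ones (the dummy input x0 = 1 for the bias w0). A sketch with hypothetical d = 2 data:

```python
import numpy as np

# Fit a regression plane t = w0 + w1*x1 + w2*x2 by least squares.
# Toy inputs (hypothetical, not lecture data), one row per sample:
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 3.0]])
t = 0.5 + 2.0 * X[:, 0] - 1.0 * X[:, 1]      # exact plane, no noise

# Prepend a column of ones so the bias w0 is part of the weight vector.
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])
w, *_ = np.linalg.lstsq(X_aug, t, rcond=None)
print(w)                                     # recovers [0.5, 2.0, -1.0]
```

The same code handles any d: for d > 2 it fits the hyperplane through the N samples.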
Input Features
Feature list of the Microsoft Learning to Rank datasets. Each feature description is computed over five streams (body, anchor, title, url, whole document), giving five consecutive feature ids:

  feature ids   feature description
  1–5           covered query term number
  6–10          covered query term ratio
  11–15         stream length
  16–20         IDF (inverse document frequency)
  21–25         sum of term frequency
  26–30         min of term frequency
  31–35         max of term frequency
  36–40         mean of term frequency
  41–45         variance of term frequency
  46–50         sum of stream-length-normalized term frequency

See http://research.microsoft.com/en-us/projects/mslr/feature.aspx

  IDF(t, D) = log( N / |{d ∈ D : t ∈ d}| )

Feature Statistics
• Most of the 46 features are normalized as continuous values from 0 to 1; the exceptions are a few features that are all 0s.

  Feature  1      2      3      4      5      6      7   8   9   10  11  12      13      14      15
  Min      0      0      0      0      0      0      0   0   0   0   0   0       0       0       0
  Max      1      1      1      1      1      0      0   0   0   0   1   1       1       1       1
  Mean     0.254  0.1598 0.1392 0.2158 0.1322 0.1614 0   0   0   0   0   0.2841  0.1382  0.2109  0.1218

  Feature  16     17     18     19     20     21     22     23     24     25     26     27     28     29     30     31
  Min      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
  Max      1      1      1      1      1      1      1      1      1      1      1      1      1      1      1      1
  Mean     0.2879 0.1533 0.2258 0.3057 0.3332 0.1534 0.5473 0.5592 0.5453 0.5622 0.1675 0.1377 0.1249 0.126  0.2109 0.1705

  Feature  32     33     34     35     36     37     38     39     40     41     42     43     44     45     46
  Min      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0
  Max      1      1      1      1      1      1      1      1      1      1      1      0      1      1      1
  Mean     0.1694 0.1603 0.1275 0.0762 0.0762 0.0728 0.5479 0.5574 0.5502 0.5673 0.4321 0.3361 0      0.1065 0.1211
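The IDF formula above, IDF(t, D) = log(N / |{d ∈ D : t ∈ d}|), can be computed directly from document frequencies. A sketch on a toy corpus (the documents are hypothetical, not the MSLR data):

```python
import math

# Toy corpus: each document is its set of terms (hypothetical example).
docs = [
    {"query", "term", "body"},
    {"query", "anchor"},
    {"title", "url"},
    {"query", "title"},
]

def idf(term, corpus):
    """IDF(t, D) = log(N / |{d in D : t in d}|)."""
    df = sum(1 for d in corpus if term in d)   # document frequency of term
    return math.log(len(corpus) / df)

print(idf("query", docs))   # log(4/3): common term, low IDF
print(idf("url", docs))     # log(4/1): rare term, high IDF
```

Rarer terms get larger IDF values, which is why traditional IR weights term frequency by IDF.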
Linear Regression with M Basis Functions
• The linear model is extended by considering nonlinear functions of the input variables:

  y(x, w) = w0 + Σ_{j=1}^{M-1} wj φj(x)

  – where the φj(x) are called basis functions
  – We now need M weights for the basis functions instead of d weights for the features
• Defining a dummy basis function φ0(x) = 1, this can be written as

  y(x, w) = Σ_{j=0}^{M-1} wj φj(x) = wᵀφ(x)

  – where w = (w0, w1,…,wM-1)ᵀ and φ = (φ0, φ1,…,φM-1)ᵀ
• Basis functions allow nonlinearity with d input variables

Polynomial Basis
• Linear basis function model with the polynomial basis (for a single variable x): φj(x) = x^j, giving a degree M−1 polynomial
• Disadvantage: the basis is global
  – changes in one region of input space affect others
• Difficult to formulate: the number of polynomial terms increases exponentially with M
• Remedy: divide the input space into regions and use a different polynomial in each region, which is equivalent to spline functions
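The model y(x, w) = wᵀφ(x) is still linear in w, so fitting it reduces to least squares on the N × M design matrix whose rows are φ(xn)ᵀ. A sketch with the polynomial basis φj(x) = x^j and hypothetical data drawn from a cubic:

```python
import numpy as np

M = 4  # number of basis functions: phi_0, ..., phi_3

def phi(x):
    """Polynomial basis vector (phi_0(x), ..., phi_{M-1}(x)) with phi_j(x) = x**j."""
    return np.array([x ** j for j in range(M)])

# Toy targets from an exact cubic (hypothetical example, no noise).
xs = np.linspace(-1.0, 1.0, 20)
ts = 1.0 + 0.5 * xs - 2.0 * xs ** 3

Phi = np.array([phi(x) for x in xs])          # N x M design matrix
w, *_ = np.linalg.lstsq(Phi, ts, rcond=None)  # least-squares weights
print(w)                                      # recovers [1.0, 0.5, 0.0, -2.0]
```

Swapping in a different `phi` (e.g. Gaussian or spline bases) changes only the design matrix; the fitting step is unchanged, which is the analytical convenience the slide refers to.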



UB CSE 574 - Linear Models for Regression
