MIT 9.520 - Regularization for Multi-Output Learning

Regularization for Multi-Output Learning
Lorenzo Rosasco
9.520 Class 11
March 10, 2010

About this class

Goal: In many practical problems it is convenient to model the object of interest as a function with multiple outputs. In machine learning this problem typically goes under the name of multi-task or multi-output learning. We present some concepts and algorithms to solve this kind of problem.

Plan

- Examples and set-up
- Tikhonov regularization for multiple-output learning
- Regularizers and kernels
- Vector fields
- Multiclass
- Conclusions

Customers Modeling

The goal is to model the buying preferences of several people based on their previous purchases.

Borrowing strength: people with similar tastes will tend to buy similar items, and their buying histories are related. The idea is then to predict the consumer preferences for all individuals simultaneously by solving a multi-output learning problem. Each consumer is modelled as a task, and their previous preferences are the corresponding training set.

Multi-task Learning

We are given T scalar tasks. For each task j = 1, ..., T, we are given a set of examples

    S_j = \{ (x_i^j, y_i^j) \}_{i=1}^{n_j}

sampled i.i.d. according to a distribution P_j. The goal is to find

    f_j(x) \approx y, \quad j = 1, \dots, T.

[Figure: two tasks sharing the same input space X, each with its own outputs in Y.]

Pharmacological Data

Blood concentration of a medicine across different times; each task is a patient.

[Figure: concentration vs. time (hours) for several patients, comparing multi-task and single-task estimates. Red dots are test points and black dots are training points. Pictures from Pillonetto et al. 08.]

Names and Applications

Related problems:
- conjoint analysis
- transfer learning
- collaborative filtering
- co-kriging

Examples of applications:
- geophysics
- music recommendation (Dinuzzo 08)
- pharmacological data (Pillonetto et al. 08)
- binding data (Jacob et al. 08)
- movie recommendation (Abernethy et al. 08)
- HIV therapy screening (Bickel et al. 08)

Multi-task Learning: Remarks

The framework is very general:
- the input spaces can be different,
- the output spaces can be different,
- the hypothesis spaces can be different.

How Can We Design an Algorithm?

In all the above problems one can hope to improve performance by exploiting the relations among the different outputs. A possible way to do this is penalized empirical risk minimization:

    \min_{f_1, \dots, f_T} ERR[f_1, \dots, f_T] + \lambda \, PEN(f_1, \dots, f_T)

Typically:
- the error term is the sum of the empirical risks;
- the penalty term enforces similarity among the tasks.
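To make the penalized empirical-risk objective above concrete, here is a minimal numerical sketch (not part of the lecture): it evaluates the error term as the sum of per-task square-loss empirical risks for T linear task models f_j(x) = <w_j, x>, and uses a simple coupling penalty that pulls each task toward the average task, in the spirit of the mixed-effect regularizer introduced later. All function names and data below are made up for illustration.

```python
import numpy as np

def multitask_objective(W, tasks, lam, gamma):
    """W: (T, d) array of task weights; tasks: list of (X_j, y_j) pairs."""
    # ERR: sum over tasks of the empirical square-loss risk on S_j
    err = sum(np.mean((y - X @ w) ** 2) for (X, y), w in zip(tasks, W))
    # PEN (one hypothetical choice): a standard norm penalty plus a
    # coupling term pulling every task toward the mean task
    w_bar = W.mean(axis=0)
    pen = lam * np.sum(W ** 2) + gamma * np.sum((W - w_bar) ** 2)
    return err + pen

# toy usage: T = 4 tasks, d = 3 features, 20 examples per task
rng = np.random.default_rng(0)
tasks = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
W = rng.normal(size=(4, 3))
print(multitask_objective(W, tasks, lam=0.1, gamma=1.0))
```

In the lecture the f_j live in an RKHS and the minimization is carried out as described in the following slides; the sketch only shows how the error and penalty terms are assembled.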
Error Term

We are going to choose the square loss to measure errors:

    ERR[f_1, \dots, f_T] = \sum_{j=1}^{T} \frac{1}{n_j} \sum_{i=1}^{n_j} \left( y_i^j - f_j(x_i^j) \right)^2

MTL

Let f_j : X \to \mathbb{R}, j = 1, \dots, T. Then

    ERR[f_1, \dots, f_T] = \sum_{j=1}^{T} I_{S_j}[f_j]

with

    I_S[f] = \frac{1}{n} \sum_{i=1}^{n} (y_i - f(x_i))^2.

Building Regularizers

We assume that the input, output and hypothesis spaces are the same, i.e.

    X_j = X, \quad Y_j = Y, \quad H_j = H, \quad \text{for all } j = 1, \dots, T.

We also assume H to be a RKHS with kernel K.

Regularizers: Mixed Effect

For each component/task the solution is the same function plus a component/task-specific component:

    PEN(f_1, \dots, f_T) = \lambda \sum_{j=1}^{T} \|f_j\|_K^2 + \gamma \sum_{j=1}^{T} \Big\| f_j - \sum_{s=1}^{T} f_s \Big\|_K^2

Regularizers: Graph Regularization

We can define a regularizer that, in addition to a standard regularization on the single components, forces stronger or weaker similarity through a T x T positive weight matrix M:

    PEN(f_1, \dots, f_T) = \gamma \sum_{\ell, q = 1}^{T} \|f_\ell - f_q\|_K^2 \, M_{\ell q} + \lambda \sum_{\ell=1}^{T} \|f_\ell\|_K^2 \, M_{\ell \ell}

Regularizers: Cluster

The components/tasks are partitioned into c clusters: components in the same cluster should be similar. Let
- m_r, r = 1, ..., c, be the cardinality of each cluster,
- I(r), r = 1, ..., c, be the index set of the components that belong to cluster r.

    PEN(f_1, \dots, f_T) = \gamma \sum_{r=1}^{c} \sum_{l \in I(r)} \|f_l - \bar{f}_r\|_K^2 + \lambda \sum_{r=1}^{c} m_r \|\bar{f}_r\|_K^2

where \bar{f}_r, r = 1, ..., c, is the mean of the components in cluster r.

How Can We Find the Solution?

We have to solve

    \min_{f_1, \dots, f_T} \Big\{ \frac{1}{n} \sum_{j=1}^{T} \sum_{i=1}^{n} (y_i^j - f_j(x_i))^2 + \lambda \sum_{j=1}^{T} \|f_j\|_K^2 + \gamma \sum_{j=1}^{T} \Big\| f_j - \sum_{s=1}^{T} f_s \Big\|_K^2 \Big\}

(we considered the first regularizer as an example). The theory of RKHS gives us a way to do this using what we already know from the scalar case.

Tikhonov Regularization

We now show that for all the above penalties we can define a suitable RKHS with kernel Q (and re-index the sums in the error term), so that

    \min_{f_1, \dots, f_T} \Big\{ \sum_{j=1}^{T} \frac{1}{n_j} \sum_{i=1}^{n_j} (y_i^j - f_j(x_i))^2 + \lambda \, PEN(f_1, \dots, f_T) \Big\}

can be written as

    \min_{f \in H} \Big\{ \frac{1}{nT} \sum_{i=1}^{nT} (y_i - f(x_i, t_i))^2 + \lambda \|f\|_Q^2 \Big\}

Kernels at Rescue

Consider a (joint) kernel Q : (X \times \Pi) \times (X \times \Pi) \to \mathbb{R}, where \Pi = \{1, \dots, T\} is the index set of the output components. A function in the space is

    f(x, t) = \sum_i Q((x, t), (x_i, t_i)) \, c_i,

with norm

    \|f\|_Q^2 = \sum_{i,j} Q((x_j, t_j), (x_i, t_i)) \, c_i c_j.

A Useful Class of Kernels

Let A be a T x T positive definite matrix and K a scalar kernel. Consider the kernel Q : (X \times \Pi) \times (X \times \Pi) \to \mathbb{R} defined by

    Q((x, t), (x', t')) = K(x, x') \, A_{t, t'}.

Then the norm of a function is

    \|f\|_Q^2 = \sum_{i,j} K(x_i, x_j) \, A_{t_i t_j} \, c_i c_j.

Regularizers and Kernels

If we fix t, then f_t(x) = f(x, t) is one of the tasks. The norm \|\cdot\|_Q can be related to the scalar ...
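As a rough illustration of the joint kernel Q((x, t), (x', t')) = K(x, x') A_{t,t'} and of the reduction to scalar Tikhonov regularization, here is a minimal sketch. It assumes a Gaussian scalar kernel K, a hand-picked positive definite 2 x 2 task matrix A, and plain kernel ridge regression on the joint inputs (x_i, t_i); the function names and toy data are invented for the example and are not from the lecture.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    # scalar kernel K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def joint_gram(X, t, A, sigma=1.0):
    """Gram matrix of Q on the joint inputs: K(x_i, x_j) * A[t_i, t_j]."""
    return gaussian_kernel(X, X, sigma) * A[np.ix_(t, t)]

def fit_tikhonov(X, t, y, A, lam, sigma=1.0):
    Q = joint_gram(X, t, A, sigma)
    N = len(y)
    # coefficients c solving (Q + lam * N * I) c = y, the usual
    # kernel ridge / Tikhonov system, exactly as in the scalar case
    return np.linalg.solve(Q + lam * N * np.eye(N), y)

def predict(X_new, t_new, X, t, c, A, sigma=1.0):
    Q_new = gaussian_kernel(X_new, X, sigma) * A[np.ix_(t_new, t)]
    return Q_new @ c

# toy usage: T = 2 related tasks, coupled through A
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
t = rng.integers(0, 2, size=30)                 # task index of each example
y = np.sin(X[:, 0]) + 0.1 * t + 0.05 * rng.normal(size=30)
A = np.array([[1.0, 0.5], [0.5, 1.0]])          # task-similarity matrix
c = fit_tikhonov(X, t, y, A, lam=0.1)
print(predict(X[:5], t[:5], X, t, c, A))
```

Once the joint Gram matrix is formed, the multi-output problem is solved exactly like the scalar one, which is the point of the Tikhonov reformulation above.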

