MIT 9.520 - Study Notes

Hierarchical Learning Machines: Derived Kernels and the Neural Response
Andre Wibisono, Lorenzo Rosasco
MIT 9.520 Class 16
April 5, 2010

A. Wibisono, L. Rosasco    Derived Kernel and the Neural Response

Hierarchical/Deep Learning

About this class
Goal: to introduce a mathematical counterpart to the visual cortex model described in the previous two lectures.
- Describe a recursive definition of a similarity kernel.
- Describe theoretical analyses.

S. Smale, L. Rosasco, J. Bouvrie, A. Caponnetto, and T. Poggio. "Mathematics of the Neural Response", Foundations of Computational Mathematics (2010) 10: 67–91
and
J. Bouvrie, T. Poggio, L. Rosasco, S. Smale, A. Wibisono. "Properties of Hierarchical Learning Machines", in preparation.

Plan
1. Background
2. Derived Kernels and the Neural Response
3. Connection to Neuroscience
4. Extensions
5. Theoretical Analysis

Biologically Inspired Hierarchical Learning Machines
- Human-Machine Comparison: Chomsky's poverty of the stimulus argument: biological organisms can learn complex concepts and tasks from extraordinarily small empirical samples.
- Hierarchical organization is the key? Circuits found in the human brain facilitate robust learning from few examples via the discovery of invariances, while promoting circuit modularity and reuse of redundant sub-circuits, leading also to greater energy and space efficiency.

Why Hierarchical Learning Machines?
When and why is a hierarchical architecture preferred?
1. Invariance versus selectivity.
2. Computational properties.
3. Adaptive tuning.
4. Sample complexity.
For tasks that can be decomposed into a hierarchy of parts, how can we show that a supervised classifier trained using a hierarchical feature map will generalize better than an off-the-shelf non-hierarchical alternative?

Hierarchical Learning: Empirical Motivation
[Figure: "Sample Complexity: 9-class NN Classification Example". Classification accuracy versus training set size for three methods: L2 distance, 2-layer DK, 3-layer DK.]
9-class digits problem, nearest neighbor classifier, Euclidean distance vs. 3-layer derived distance (u = 12, v = 20, 500 templates/layer, 3-pixel image translations).

Towards a Theory
We will borrow concepts and operations underlying the visual cortex model.

Defining a model
The ingredients needed to define the derived kernel consist of:
- A finite architecture of nested domains. We'll call them patches.
- A suitable family of function spaces defined on each patch.
- A set of transformations defined on patches.
- A set of templates which connect the mathematical model to a real world setting.

An Architecture of Patches
We first consider an architecture composed of three layers of patches: $u$, $v$ and $Sq$ in $\mathbb{R}^2$, with $u \subset v \subset Sq$.
Figure: Nested patch domains.

Images as Functions
We consider a function space on $Sq$, denoted by
$\mathrm{Im}(Sq) = \{ f : Sq \to [0, 1] \}$,
as well as the function spaces $\mathrm{Im}(u)$, $\mathrm{Im}(v)$ defined on subpatches $u$, $v$, respectively.
Functions can be interpreted as grey scale images when working with a vision problem, for example.
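
The nested patches and the function spaces on them can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patches are taken to be squares anchored at the origin, the side lengths 12 and 20 reuse the values of u and v quoted in the figure caption, the side length 28 for Sq is hypothetical, and the names `gradient_image` and `restrict` are invented for this example.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]
Image = Callable[[Point], float]  # an element of Im(patch): point -> grey value in [0, 1]

# Assumed side lengths of the nested square patches u ⊂ v ⊂ Sq.
# 12 and 20 echo the figure caption; 28 is a hypothetical choice for Sq.
U_SIDE, V_SIDE, SQ_SIDE = 12, 20, 28


def in_patch(p: Point, side: float) -> bool:
    """True if p lies in the square patch [0, side) x [0, side)."""
    x, y = p
    return 0 <= x < side and 0 <= y < side


def gradient_image(p: Point) -> float:
    """A toy element of Im(Sq): brightness grows from left to right."""
    if not in_patch(p, SQ_SIDE):
        raise ValueError("point lies outside Sq")
    x, _ = p
    return x / SQ_SIDE  # always in [0, 1) on Sq


def restrict(f: Image, side: float) -> Image:
    """Restrict an image to the smaller square patch of the given side."""
    def restricted(p: Point) -> float:
        if not in_patch(p, side):
            raise ValueError("point lies outside the subpatch")
        return f(p)
    return restricted


f_v = restrict(gradient_image, V_SIDE)  # an element of Im(v)
f_u = restrict(gradient_image, U_SIDE)  # an element of Im(u)
```

Representing an image as a callable keeps the code close to the definition of Im(Sq); a practical implementation would more likely store pixel arrays.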

Transformations
Next, we assume a set $H_u$ of transformations that are maps from the smallest patch to the next larger patch,
$h : u \to v$.
Similarly $H_v$ with $h : v \to Sq$.
The sets of transformations are assumed to be finite.
These transformations act on the domain of a function (image).

Examples
Examples of transformations are primarily translations, but also scalings and rotations...
Translations and Scalings: we have transformations of the form $h = h_\beta h_\alpha$ with
$h_\alpha(x) = \alpha x$, and $h_\beta(x') = x' + \beta$,
where $\alpha \in \mathbb{R}$ and $\beta \in \mathbb{R}^2$ is such that $h_\beta h_\alpha(u) \subset v$.

Interpretation
In the vision interpretation, a translation $h$ can be thought of as moving the image over the "receptive field" $v$.
Figure: A transformation "restricts" an image to a specific patch.

Templates
Template sets are finite, $T_u \subset \mathrm{Im}(u)$ and $T_v \subset \mathrm{Im}(v)$.
- They are image patches sampled from some set of unlabeled images.
- They link the mathematical development to real world problems.
The space of images can be endowed with a "mother" probability measure $\rho$. Templates can be seen as images frequently encountered in the early stages of life.

Reproducing Kernel
Given a set $X$, a function $K : X \times X \to \mathbb{R}$ is a reproducing kernel if it is a symmetric and positive definite kernel, i.e.
$\sum_{i,j=1}^{n} \alpha_i \alpha_j K(x_i, x_j) \ge 0$,
for any $n \in \mathbb{N}$, $x_1, \dots, x_n \in X$ and $\alpha_1, \dots, \alpha_n \in \mathbb{R}$.

Dot Products and Feature Map
Consider a feature map
$\Phi : X \to F$.
Inner product kernels are an instance of reproducing kernels:
$K(x, x') = \langle \Phi(x), \Phi(x') \rangle$
is a reproducing kernel.
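
The translations and the inner product kernel above can be tried out numerically. The sketch below makes several assumptions that are not in the notes: images are discretized as small grids of pixels, only integer translations (alpha = 1) are used, the feature map simply flattens a patch into a vector, and the sizes 6 and 3 are toy values. It extracts every translated patch of an image on v and evaluates the quadratic form from the reproducing kernel definition for one random choice of coefficients.

```python
import random
from typing import List, Sequence, Tuple

Grid = List[List[float]]  # a discretized image: rows of grey values in [0, 1]


def translate_and_restrict(f: Grid, beta: Tuple[int, int], u_side: int) -> Grid:
    """Sample f∘h_beta on u, where h_beta(x) = x + beta must map u inside f's domain."""
    r0, c0 = beta
    if r0 < 0 or c0 < 0 or r0 + u_side > len(f) or c0 + u_side > len(f[0]):
        raise ValueError("h_beta(u) must stay inside the larger patch")
    return [row[c0:c0 + u_side] for row in f[r0:r0 + u_side]]


def feature_map(patch: Grid) -> List[float]:
    """Phi (an assumed choice): flatten the patch into a vector of pixel values."""
    return [value for row in patch for value in row]


def kernel(a: Grid, b: Grid) -> float:
    """Inner product kernel K(a, b) = <Phi(a), Phi(b)>."""
    return sum(x * y for x, y in zip(feature_map(a), feature_map(b)))


def quadratic_form(points: Sequence[Grid], alphas: Sequence[float]) -> float:
    """Evaluate sum_{i,j} alpha_i * alpha_j * K(x_i, x_j)."""
    return sum(ai * aj * kernel(xi, xj)
               for ai, xi in zip(alphas, points)
               for aj, xj in zip(alphas, points))


rng = random.Random(0)   # fixed seed so the sketch is repeatable
V_SIDE, U_SIDE = 6, 3    # toy sizes, chosen only for illustration

# A random image on v and the finite set H_u of admissible integer translations.
f_v = [[rng.random() for _ in range(V_SIDE)] for _ in range(V_SIDE)]
H_u = [(r, c)
       for r in range(V_SIDE - U_SIDE + 1)
       for c in range(V_SIDE - U_SIDE + 1)]

patches = [translate_and_restrict(f_v, beta, U_SIDE) for beta in H_u]
alphas = [rng.uniform(-1.0, 1.0) for _ in patches]
q = quadratic_form(patches, alphas)  # nonnegative up to rounding error
```

The quadratic form equals the squared norm of the weighted sum of feature vectors, which is why it cannot be negative; a single random draw like this one illustrates the property but does not prove it.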

Normalization
We assume $K(x, x) \neq 0$ for all $x \in X$ and let
$\hat{K}(x, x') = \frac{K(x, x')}{\sqrt{K(x, x)\, K(x', x')}}$.
Clearly $\hat{K}$ is a reproducing kernel and $\hat{K}(x, x) \equiv 1$ for all $x \in X$.
- Allows interpretation of and comparison between different instances.
- Is nice for correspondence with a distance.

On the normalization
To make sense of the normalization we rule out the functions such that $K(f, f)$ is zero.
This assumption is quite natural in the context of images: if $K(f, f)$ is zero, the response of $f$ is identically zero at all possible templates by definition: "one can't see the contents of the image".

Derived Kernel and Neural Response: Construction
We'll give a bottom-up description of a three layer architecture before giving the general recursive definition.
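
The normalization step defined earlier can be sketched as a wrapper around any kernel. The example assumes a plain dot product on lists of numbers as the base kernel, and the helper names `normalized` and `kernel_distance` are invented here. The distance uses the standard identity d(x, x')^2 = 2 - 2 K_hat(x, x') for a normalized kernel, which is one way to read the remark about correspondence with a distance; the notes do not spell this out.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def dot(x: Vector, y: Vector) -> float:
    """Base kernel for the example: the ordinary dot product."""
    return sum(a * b for a, b in zip(x, y))


def normalized(K: Kernel) -> Kernel:
    """Return K_hat(x, y) = K(x, y) / sqrt(K(x, x) * K(y, y))."""
    def K_hat(x: Vector, y: Vector) -> float:
        kxx, kyy = K(x, x), K(y, y)
        if kxx == 0 or kyy == 0:
            # The notes rule out inputs with K(x, x) = 0.
            raise ValueError("normalization undefined when K(x, x) is zero")
        return K(x, y) / math.sqrt(kxx * kyy)
    return K_hat


def kernel_distance(K_hat: Kernel, x: Vector, y: Vector) -> float:
    """Distance induced by a normalized kernel: sqrt(2 - 2 * K_hat(x, y))."""
    # max(...) guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 2.0 - 2.0 * K_hat(x, y)))


K_hat = normalized(dot)
same = K_hat([3.0, 4.0], [3.0, 4.0])         # equals 1 for any nonzero input
orthogonal = K_hat([1.0, 0.0], [0.0, 1.0])   # equals 0 for orthogonal inputs
```

Because rescaling an input by a positive factor leaves K_hat unchanged, two images that differ only in overall brightness scale are treated as identical by the normalized kernel.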