Biologically Inspired Model for Object Recognition
Xiaobai Chen
COS 598B
3/31/2008

Outline
- The “Evolution”
- Neocognitron (Fukushima)
- Simple & Complex
- Neocognitron (Fukushima)
- Then… What has evolved?
- Invariance
- Pooling
- Argument (1): position invariance
- Argument (2): size invariance
- Argument (3): neurophysiological data
- Argument (4): Experiment
- Argument (5): Simulation
- Softmax approximation
- Stop for a while…
- Implementation Details
- Model Summary
- Experiment
- Exp. with clutter
- Exp. without clutter
- Exp. on texture-based objects
- Object-specific features or a universal dictionary
- A unified system – looking at multiple processing levels
- Scene understanding task
- Improvement?
- Thank you!

The “Evolution”
- Fukushima, Biol. Cybernetics ’80
  - Early attempts with neural networks to mimic the hierarchical model of Hubel & Wiesel (’65)
- Poggio et al., Nature Neuroscience ’99
  - Max vs. Sum pooling
- Poggio et al., PAMI ’07
  - State-of-the-art neural network
  - Extensive experiments

Neocognitron (Fukushima)
- Hierarchical structure
- From “simple” to “complex”

Simple & Complex
Along the hierarchy, two functional stages are interleaved:
- Simple (S) units build an increasingly complex and specific representation by combining the responses of several subunits with different selectivities
- Complex (C) units build an increasingly invariant representation (to position and scale) by combining the responses of several subunits with the same selectivity but at slightly different positions and scales

Neocognitron (Fukushima)
- Hierarchical structure
- From “simple” to “complex”
- Increasing invariance
The basic idea and goals persist to this day!

Then… What has evolved?
- How much more do we know about the human visual system?
- For the neural network model:
  - How to connect each layer?
  - What is the computing model of each layer?
  - How many layers?

The hierarchy based on the brain model
Poggio et al., 1999.

Experiment
Training stage
- The monkey was trained to recognize a restricted set of views of unfamiliar target stimuli resembling paperclips.
- The experimenters checked which IT cell responded best across all views.
- The cell that responded the most was picked for the study.
Credit to Tomer Livne and Maria Zeldin

Test stage:
- The cell responded best to the trained views.
- The second-best response was to new transformations of the trained object.
- There was very little response to new objects (distractors).
Credit to Tomer Livne and Maria Zeldin

Quantitative results
“Hierarchical models of object recognition in cortex.” Riesenhuber and Poggio. Nature America Inc., November 1999.

Invariance
- All kinds of invariance
  - Translation
  - Scale
  - Rotation
- How to add invariance in the NN model?
  - Pooling (Perrett & Oram ’93)

Pooling
- Pooling over afferents tuned to various transformed versions of the same stimulus
- Two idealized mechanisms
  - Linear (Sum): suitable to increase complexity
  - Nonlinear (Max): selectivity
Which is a better fit for the complex cell to achieve invariance?

Argument (1): position invariance
- Both lead to position invariance
- Sum
  - Specificity is lost
  - Case-by-case parameter adjustments are needed in clutter
- Max
  - Signals the best match of any part of the stimulus to the afferents’ preferred feature
  - More robust in clutter

Argument (2): size invariance
- Sum
  - More afferents will be excited if the same object increases in size; hence excitation of the cell will increase
- Max
  - Cell response is determined by the best-matching afferent
  - Not influenced much by additional afferents

Argument (3): neurophysiological data
- An IT neuron’s response seems to be dominated by the stimulus producing the higher firing rate
- Theoretical investigation of V1 also supports a MAX-like pooling mechanism (Sakai & Tanaka ’97)

Argument (4): Experiment

Argument (5): Simulation

Softmax approximation
- p = 0: linear sum
- p → ∞: MAX

Stop for a while…
- A NN can really look like the visual pathway
- Alternating Max + Sum seems to work for a NN
- Recognition of different transformations of an object is similar to the problem of classification
- Use a NN to learn features and do classification with linear classifiers (Poggio et al. ’07)

The hierarchy based on the brain model
Poggio et al., 1999.

Implementation Details
Along the hierarchy, from V1 to IT, two functional stages are interleaved:
- Simple (S) units build an increasingly complex and specific representation by combining the responses of several subunits with different selectivities, using a TUNING operation.
- Complex (C) units build an increasingly invariant representation (to position and scale) by combining the responses of several subunits with the same selectivity but at slightly different positions and scales, using a MAX-like operation.

Implementation Details
Credit to Serre and Poggio

Implementation Details
- By interleaving these two operations, an increasingly complex and invariant representation is built.
- Two routes:
  - The main route follows the hierarchy of cortical stages strictly.
  - Bypass routes skip some of the stages.
- Bypass routes may help provide a richer vocabulary of shape-tuned units with different levels of complexity and invariance.
Credit to Serre and Poggio

Implementation Details
- S1 units: correspond to the classical simple cells of Hubel and Wiesel found in the primary visual cortex (V1)
- S1 units take the form of Gabor functions:

\[
f(x, y) = \exp\left(-\frac{x_0^2 + \gamma^2 y_0^2}{2\sigma^2}\right) \times \cos\left(\frac{2\pi}{\lambda} x_0\right)
\]
\[
x_0 = x\cos\theta + y\sin\theta \quad \text{and} \quad y_0 = -x\sin\theta + y\cos\theta
\]

  - The aspect ratio: \gamma
  - The orientation: \theta
  - The effective width: \sigma
  - The wavelength: \lambda
Credit to Serre and Poggio

Implementation Details
- S1 units perform the TUNING operation between the incoming pattern of input x and their weight vector w.
- The response of an S1 unit is maximal when x matches w exactly.
Credit to Serre and Poggio

Implementation Details
- C1 units: Correspond to
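
The S1 slides above give a Gabor function and a TUNING operation whose response peaks when the input x equals the weight vector w. A minimal sketch of both follows, using only the Python standard library. The function names (`gabor`, `gabor_patch`, `tuning`), the 7 x 7 grid, and all parameter values are illustrative assumptions, not values taken from the slides. The Gaussian form used for `tuning` is one common choice that satisfies the "maximal when x matches w" property; the slides do not state which tuning function the model actually uses.

```python
import math


def gabor(x, y, theta, gamma, sigma, lam):
    """Evaluate the Gabor function f(x, y) at a single point.

    theta: orientation, gamma: aspect ratio,
    sigma: effective width, lam: wavelength.
    """
    # Rotate the coordinates by the orientation theta.
    x0 = x * math.cos(theta) + y * math.sin(theta)
    y0 = -x * math.sin(theta) + y * math.cos(theta)
    # Gaussian envelope, stretched along y0 by the aspect ratio.
    envelope = math.exp(-(x0 ** 2 + gamma ** 2 * y0 ** 2) / (2.0 * sigma ** 2))
    # Cosine carrier along x0 with wavelength lam.
    carrier = math.cos(2.0 * math.pi * x0 / lam)
    return envelope * carrier


def gabor_patch(size, theta, gamma, sigma, lam):
    """Sample the Gabor function on a size x size grid centred on the origin.

    Assumes size is odd so that the grid has a centre pixel.
    """
    half = size // 2
    return [[gabor(x, y, theta, gamma, sigma, lam)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]


def tuning(x, w, sigma):
    """Gaussian tuning (assumed form): equals 1.0 when x == w, smaller otherwise."""
    dist2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return math.exp(-dist2 / (2.0 * sigma ** 2))


# Illustrative parameters only (hypothetical choices for this sketch).
patch = gabor_patch(size=7, theta=0.0, gamma=0.3, sigma=2.8, lam=3.5)
w = [v for row in patch for v in row]    # flatten the filter into a weight vector
response_match = tuning(w, w, 1.0)       # input identical to the weights
response_blank = tuning([0.0] * len(w), w, 1.0)  # blank input
```

Here `response_match` is the largest value `tuning` can return, and any input that differs from `w` (such as the blank patch) gives a strictly smaller response.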
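
The "Softmax approximation" slide states only the two limits of the pooling exponent (p = 0 gives a linear sum, p → ∞ gives MAX); the formula itself did not survive in the text. The sketch below assumes the usual softmax-weighted average, y = Σ_i x_i exp(p x_i) / Σ_k exp(p x_k), with non-negative afferent responses. Under that assumption, p = 0 reduces to the sum scaled by the number of afferents (the mean), and large p approaches the maximum. The function name `softmax_pool` and the sample responses are hypothetical.

```python
import math


def softmax_pool(afferents, p):
    """Pool afferent responses with a softmax weighting controlled by p.

    Assumed form: y = sum_i x_i * exp(p * x_i) / sum_k exp(p * x_k).
    """
    # Shift the exponent by its largest value so exp() cannot overflow.
    shift = max(p * a for a in afferents)
    weights = [math.exp(p * a - shift) for a in afferents]
    total = sum(weights)
    return sum(wt * a for wt, a in zip(weights, afferents)) / total


# Hypothetical afferent responses to one stimulus.
responses = [0.2, 0.9, 0.4]

sum_like = softmax_pool(responses, 0.0)    # equal weights: the mean of the afferents
max_like = softmax_pool(responses, 50.0)   # dominated by the best-matching afferent
```

Intermediate values of p interpolate between the two idealized mechanisms compared in the pooling arguments: small p behaves like Sum pooling, large p like Max pooling.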