Invariance and Selectivity in the Ventral Visual Pathway

Stuart Geman
Division of Applied Mathematics
Brown University
Providence, Rhode Island 02912
USA

Abstract

Pattern recognition systems that are invariant to shape, pose, lighting and texture are never sufficiently selective; they suffer a high rate of “false alarms”. How are biological vision systems both invariant and selective? Specifically, how are proper arrangements of sub-patterns distinguished from the chance arrangements that defeat selectivity in artificial systems? The answer may lie in the nonlinear dynamics that characterize complex and other invariant cell types: these cells are temporarily more receptive to some inputs than to others (functional connectivity). One consequence is that pairs of such cells with overlapping receptive fields will possess a related property that might be termed functional common input. Functional common input would induce high correlation exactly when there is a match in the sub-patterns appearing in the overlapping receptive fields. These correlations, possibly expressed as a partial and highly local synchrony, would preserve the selectivity otherwise lost to invariance.

Keywords: correlation, synchrony, microcircuitry, nonlinearity, binding, vision

1 Introduction

Practical computer vision systems answer practical questions: Is there a license plate in the image? What is the license plate number? Is there a defect in the chip geometry? How many faces are in the image? Who is in the image? Biological vision systems are less oriented towards a single question or set of questions and more oriented towards an ongoing process of image analysis. Indeed, real-world images have essentially infinite detail, which can be perceived only by a process that is itself ongoing and essentially infinite.
The more you look, the more you see.

The implications of these remarks for biological vision systems are controversial. One extreme viewpoint is that, when faced with a complex image, brains construct an ever more elaborate data structure that simultaneously represents the richness of scene constituents and their inter-relationships. Scene analysis is the process of building something akin to a complex molecule whose atoms and bonds represent the multitude of constituents and relationships, possibly at a multitude of resolutions, that we perceive and reason about. This would be in the spirit of proposals by von der Malsburg [58] and Bienenstock [10, 11], and consistent with Grenander’s proposition that patterns, in general, are best formulated as a relational composition of parts (Grenander [23], see also Fu [18]). At another extreme is the searchlight metaphor, whereby the primary visual cortex serves as a kind of high-resolution buffer, and whereby image analysis is a process of selectively identifying parts in selected (attended) sub-regions. The process yields an annotated scene, “tree here, car there”, through a highly directed search involving sequential and selective attention. This is more like models suggested by Treisman and Gelade [51], Marr [33], or Crick [13].

I propose to examine these fundamental biological questions from the perspective of the science of computer vision. This might appear misguided, given the evident shortcomings of artificial vision systems. But I would argue that the combination of great effort and modest progress in computer vision has in fact produced an important result: we know much more about what makes vision a hard problem than we did, say, twenty years ago. What exactly are the limitations of engineered vision systems? Where do they break down, and why?
I contend that the basic limitations can be well articulated and that they lead to well focused questions that should be asked of biological vision systems.

As a preview, and as an introduction to the state of the art in computer vision, consider the practical problem of reading the identifying characters used to track wafers in semiconductor manufacturing. This is an example of the much-studied OCR (optical character recognition) problem. The highly automated semiconductor industry is the leading consumer of machine-vision products, with a broad range of applications where a computer equipped with a camera performs repetitive functions that are essential for a low-tolerance high-yield throughput. The OCR problem for wafer tracking is evidently difficult: many equipment manufacturers compete for a performance edge, yet the state of the art remains substantially short of human performance. This is despite best efforts to use neural networks, the latest developments in learning theory, or the latest techniques in pattern classification. Some years ago I worked on a team that developed a state-of-the-art reader for this application. Although the reader has been installed in over six thousand wafer-tracking machines, it is no exception to the rule that computer vision, even for constrained problems in controlled environments, is not yet competitive with human vision.

The difficulties begin with the patterned geometries that surround and often overlap with the identification markings (Figure 1), and they are compounded by other variables of presentation, including specularity and fluctuating contrast. Humans accommodate all of this effortlessly.
In contrast, computer programs that can cope with the variability of the presentations of the characters suffer from “false alarms” (false detections) between or overlapping the real characters, or in the structured backgrounds. Conversely, programs that are more selective (few or no false alarms) inevitably miss characters or make substitution errors. I propose that this dilemma of invariance versus selectivity is a central challenge in computer vision, and that unraveling the mechanism of its solution is a central challenge in understanding biological vision.

Figure 1: OCR for wafer tracking. Computer programs that accommodate the variability of character presentations are prone to false detections in the structured backgrounds.

Indeed, invariance is ubiquitous in biological vision systems. By computer vision standards, perception is astonishingly robust to coloring, texturing and contrast, as well as to pose parameters which define a nearly infinite-dimensional manifold for deformable objects. There is plenty of evidence for nearly invariant representations in the nervous system, starting with retinal circuits that perform nearly contrast-invariant calculations, through complex cells of V1 and V2 that exhibit some invariance
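
The invariance-versus-selectivity dilemma described above for OCR can be illustrated with a toy simulation. The following is only a minimal sketch under stated assumptions, not the wafer reader or any method from the text: the 1-D binary "character" TEMPLATE, the agreement-count score, the threshold, and all function names are hypothetical choices made for illustration. A strict detector checks one expected position; a shift-invariant detector accepts the best match over a window of positions. At the same threshold, the invariant detector finds every shifted character but also fires more often on pure random background, because chance arrangements get more opportunities to match.

```python
import random

# Hypothetical 1-D binary "character"; not taken from the text.
TEMPLATE = [1, 1, 0, 1, 0, 0, 1, 1]


def match_score(signal, offset):
    """Count positions where the signal agrees with TEMPLATE placed at offset."""
    return sum(1 for i, t in enumerate(TEMPLATE) if signal[offset + i] == t)


def strict_detect(signal, threshold, center):
    """Selective detector: tests only the expected position."""
    return match_score(signal, center) >= threshold


def invariant_detect(signal, threshold, center, max_shift):
    """Shift-invariant detector: accepts a match at any shift in the window."""
    return any(
        match_score(signal, center + s) >= threshold
        for s in range(-max_shift, max_shift + 1)
    )


def random_background(length, rng):
    """Structured-background stand-in: independent random bits."""
    return [rng.randint(0, 1) for _ in range(length)]


def embed(background, offset):
    """Copy the background and write TEMPLATE into it at offset."""
    out = list(background)
    out[offset:offset + len(TEMPLATE)] = TEMPLATE
    return out


def rates(n_trials=2000, length=24, center=8, max_shift=4, threshold=7, seed=0):
    """Estimate hit and false-alarm rates for both detectors."""
    rng = random.Random(seed)
    counts = {"strict_hit": 0, "invariant_hit": 0,
              "strict_fa": 0, "invariant_fa": 0}
    for _ in range(n_trials):
        bg = random_background(length, rng)
        shift = rng.randint(-max_shift, max_shift)  # unknown pose of the target
        target = embed(bg, center + shift)
        counts["strict_hit"] += strict_detect(target, threshold, center)
        counts["invariant_hit"] += invariant_detect(target, threshold, center, max_shift)
        # Background-only trials: any detection here is a false alarm.
        counts["strict_fa"] += strict_detect(bg, threshold, center)
        counts["invariant_fa"] += invariant_detect(bg, threshold, center, max_shift)
    return {k: v / n_trials for k, v in counts.items()}


if __name__ == "__main__":
    for name, value in rates().items():
        print(f"{name}: {value:.3f}")
```

In this sketch the invariant detector can never have fewer false alarms than the strict one, since its window includes the strict position. Raising the threshold reduces false alarms for both detectors, but in a noisier version of the toy it would also start to cost hits, which is the same trade-off in the other direction.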