K-State CIS 830 - Artificial Neural Networks Discussion


Kansas State University, Department of Computing and Information Sciences
CIS 830: Advanced Topics in Artificial Intelligence
Wednesday, February 23, 2000
William H. Hsu
Department of Computing and Information Sciences, KSU
http://www.cis.ksu.edu/~bhsu

Lecture 16: Artificial Neural Networks Discussion (3 of 4):
Unsupervised Learning and Pattern Recognition

Readings:
• "The Wake-Sleep Algorithm for Unsupervised Neural Networks", Hinton et al.
• (Reference) Section 6.12, Mitchell
• (Reference) Sections 3.2.4-3.2.5, Shavlik and Dietterich

Lecture Outline
• Readings: "The Wake-Sleep Algorithm", Hinton et al.
• Suggested Reading: 6.12, Mitchell; Rumelhart and Zipser; Kohonen
• This Week's Reviews: Wake-Sleep, Hierarchical Mixtures of Experts
• Unsupervised Learning and Clustering
  – Definitions and framework
  – Constructive induction
    • Feature construction
    • Cluster definition
  – EM, AutoClass, Principal Components Analysis, Self-Organizing Maps
• Expectation-Maximization (EM) Algorithm
  – More on EM and Bayesian Learning
  – EM and unsupervised learning
• Next Lecture: Time Series Learning
  – Intro to time series learning, characterization; stochastic processes
  – Read Chapter 19, Russell and Norvig (neural and Bayesian computation)

Unsupervised Learning: Objectives
• Unsupervised Learning
  – Given: data set D
    • Vectors of attribute values (x1, x2, …, xn)
    • No distinction between input attributes and output attributes (class label)
  – Return: a (synthetic) descriptor y of each x
    • Clustering: grouping points (x) into inherent regions of mutual similarity
    • Vector quantization: discretizing a continuous space with best labels
    • Dimensionality reduction: projecting many attributes down to a few
    • Feature extraction: constructing (a few) new attributes from (many) old ones
• Intuitive Idea
  – Want to map independent variables (x) to dependent variables (y = f(x))
  – Don't always know what the "dependent variables" (y) are
  – Need to discover y based on a numerical criterion (e.g., a distance metric)
[Figure: supervised learning maps x through f to y = f(x); unsupervised learning maps x through a learned f̂ to a discovered description of x]

Clustering
• A Mode of Unsupervised Learning
  – Given: a collection of data points
  – Goal: discover structure in the data
    • Organize data into sensible groups (how many here?)
    • Criteria: a convenient and valid organization of the data
    • NB: not necessarily rules for classifying future data points
  – Cluster analysis: the study of algorithms and methods for discovering this structure
    • Representing structure: organizing data into clusters (cluster formation)
    • Describing structure: cluster boundaries, centers (cluster segmentation)
    • Defining structure: assigning meaningful names to clusters (cluster labeling)
• Cluster: Informal and Formal Definitions
  – A set whose entities are alike and are different from entities in other clusters
  – An aggregation of points in the instance space such that the distance between any two points in the cluster is less than the distance between any point in the cluster and any point not in it
  – (A minimal clustering sketch in code follows this slide.)
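As a concrete illustration of cluster formation under the distance-based definition above, here is a minimal k-means sketch in Python/NumPy. It is not from the lecture; k-means is simply a standard clustering algorithm, and the function name kmeans and its parameters (n_clusters, max_iter) are illustrative assumptions.

import numpy as np

def kmeans(X, n_clusters=3, max_iter=100, seed=0):
    """Group points into clusters of mutual similarity (Euclidean distance)."""
    rng = np.random.default_rng(seed)
    # Initialize centers by sampling distinct data points
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its assigned points
        new_centers = np.array([X[labels == k].mean(axis=0)
                                if np.any(labels == k) else centers[k]
                                for k in range(n_clusters)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Usage: three synthetic Gaussian blobs recovered as three clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=m, scale=0.3, size=(50, 2)) for m in (0, 3, 6)])
labels, centers = kmeans(X, n_clusters=3)

Each iteration alternates a nearest-center assignment with a mean update, so the within-cluster distances shrink until the centers stabilize; this is cluster formation in the sense used on the slide, without any class labels.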
Quick Review: Bayesian Learning and EM
• Problem Definition
  – Given: data (n-tuples) with missing values, aka partially observable (PO) data
  – Want: to fill in each missing value (?) with its expected value
• Solution Approaches
  – Expected = a distribution over possible values
  – Use a "best guess" Bayesian model (e.g., BBN) to estimate the distribution
  – The Expectation-Maximization (EM) algorithm can be used here
• Intuitive Idea
  – Want to find h_ML in the PO case (D = unobserved variables ∪ observed variables)
  – Estimation step: calculate E[unobserved variables | h], assuming the current h
  – Maximization step: update w_ijk to maximize E[lg P(D | h)], D = all variables

  h_{ML} = \arg\max_{h \in H} \frac{\#\,\text{data cases with } n, e}{\#\,\text{data cases with } e} = \arg\max_{h \in H} \frac{\sum_j E[I_{X_j}(n, e)]}{\sum_j E[I_{X_j}(e)]}

  where I_{X_j}(·) indicates whether data case X_j matches the given variable values; in the PO case, the expected counts E[I] stand in for the observed counts.

EM for Unsupervised Learning
• Unsupervised Learning Problem
  – Objective: estimate a probability distribution with unobserved variables
  – Use EM to estimate the mixture policy (more on this later; see 6.12, Mitchell; a minimal mixture-EM sketch in code appears at the end of this section)
• Pattern Recognition Examples
  – Human-computer intelligent interaction (HCII)
    • Detecting facial features in emotion recognition
    • Gesture recognition in virtual environments
  – Computational medicine [Frey, 1998]
    • Determining the morphology (shapes) of bacteria and viruses in microscopy
    • Identifying cell structures (e.g., the nucleus) and shapes in microscopy
  – Other image processing
  – Many other examples (audio, speech, and signal processing; motor control; etc.)
• Inference Examples
  – Plan recognition: mapping from (observed) actions to the agent's (hidden) plans
  – Hidden changes in context: e.g., aviation; computer security; MUDs

Unsupervised Learning: Competitive Learning for Feature Discovery
• Intuitive Idea: Competitive Mechanisms for Unsupervised Learning
  – Global organization from local, competitive weight updates
    • Basic principle expressed by von der Malsburg
    • Guiding examples from (neuro)biology: lateral inhibition
  – Previous work: Hebb, 1949; Rosenblatt, 1959; von der Malsburg, 1973; Fukushima, 1975; Grossberg, 1976; Kohonen, 1982
• A Procedural Framework for Unsupervised Connectionist Learning (sketched in code below)
  – Start with identical ("neural") processing units, with random initial parameters
  – Set a limit on the "activation strength" of each unit
  – Allow units to compete for the right to respond to a set of inputs
• Feature Discovery
  – Identifying (or constructing) new features relevant to supervised learning
  – Examples: finding distinguishable letter …
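The procedural framework above can be made concrete with a short winner-take-all sketch in Python/NumPy. This illustrates the standard competitive update, not code from the lecture; the function name competitive_learning and its parameters (n_units, lr, epochs) are assumptions for illustration.

import numpy as np

def competitive_learning(X, n_units=4, lr=0.1, epochs=20, seed=0):
    rng = np.random.default_rng(seed)
    # Identical processing units with random initial parameters
    W = rng.normal(size=(n_units, X.shape[1]))
    # Bound each unit's "activation strength" by normalizing its weights
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    for _ in range(epochs):
        for x in rng.permutation(X):
            winner = np.argmax(W @ x)               # competition: best-matching unit responds
            W[winner] += lr * (x - W[winner])       # only the winner's weights learn
            W[winner] /= np.linalg.norm(W[winner])  # re-impose the strength limit
    return W  # each row converges toward a discovered input feature/prototype

# Usage: two units discover the two directions present in the data
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([1, 0], 0.1, (100, 2)),
               rng.normal([0, 1], 0.1, (100, 2))])
prototypes = competitive_learning(X, n_units=2)

Renormalizing the winner's weight vector implements the "limit on activation strength": without it, one unit's weights could grow until it wins every competition, and no global organization would emerge from the local updates.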
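Returning to the EM material above, the estimation and maximization steps can be illustrated with a minimal one-dimensional, two-component Gaussian mixture in Python/NumPy. This is a sketch in the spirit of the mixture treatment in Mitchell's Section 6.12, not the lecture's own code; em_gmm, its initialization, and its parameters are illustrative assumptions.

import numpy as np

def gaussian_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def em_gmm(x, n_iter=50):
    # Simple initialization: means at the data extremes, shared variance, equal weights
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] = P(component k | x_i), the expected
        # values of the unobserved component assignments given the current h
        r = np.stack([pi[k] * gaussian_pdf(x, mu[k], var[k]) for k in range(2)], axis=1)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted MLE updates; expected counts play the role of counts
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi

# Usage: recover the two modes of a synthetic mixture
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 0.5, 200)])
mu, var, pi = em_gmm(x)

The E-step fills in the unobserved component assignments with expected (soft) values, and the M-step re-estimates the parameters from those expected counts, mirroring how expected counts replace observed counts in the h_ML formula of the quick review.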

