K-State CIS 830 - Artificial Neural Networks Discussion

Lecture 16
Artificial Neural Networks Discussion (4 of 4): Modularity in Neural Learning Systems
CIS 830: Advanced Topics in Artificial Intelligence
Kansas State University, Department of Computing and Information Sciences
Monday, February 28, 2000
William H. Hsu, Department of Computing and Information Sciences, KSU
http://www.cis.ksu.edu/~bhsu

Readings:
• "Modular and Hierarchical Learning Systems", M. I. Jordan and R. Jacobs
• (Reference) Section 7.5, Mitchell
• (Reference) Lectures 21-22, CIS 798 (Fall, 1999)

Lecture Outline
• Outside Reading
  – Section 7.5, Mitchell
  – Section 5, MLC++ manual, Kohavi and Sommerfield
  – Lectures 21-22, CIS 798 (Fall, 1999)
• This Week's Paper Review: "Bagging, Boosting, and C4.5", J. R. Quinlan
• Combining Classifiers
  – Problem definition and motivation: improving accuracy in concept learning
  – General framework: collection of weak classifiers to be improved
• Examples of Combiners (Committee Machines)
  – Weighted Majority (WM), Bootstrap Aggregating (Bagging), Stacked Generalization (Stacking), Boosting the Margin
  – Mixtures of Experts, Hierarchical Mixtures of Experts (HME)
• Committee Machines
  – Static structures: ignore input signal
  – Dynamic structures (multi-pass): use input signal to improve classifiers

Combining Classifiers
• Problem Definition
  – Given
    • Training data set D for supervised learning
    • D drawn from a common instance space X
    • Collection of inductive learning algorithms and hypothesis languages (inducers)
  – Hypotheses produced by applying inducers to s(D)
    • s: X vector → X' vector (sampling, transformation, partitioning, etc.)
    • Can think of hypotheses as definitions of prediction algorithms ("classifiers")
  – Return: new prediction algorithm (not necessarily ∈ H) for x ∈ X that combines outputs from the collection of prediction algorithms
• Desired Properties
  – Guarantees of performance of the combined prediction
  – e.g., mistake bounds; ability to improve weak classifiers
• Two Solution Approaches
  – Train and apply each inducer; learn combiner function(s) from the result
  – Train inducers and combiner function(s) concurrently

Combining Classifiers: Ensemble Averaging
• Intuitive Idea
  – Combine experts (aka prediction algorithms, classifiers) using a combiner function
  – Combiner may be a weight vector (WM), a vote (bagging), or a trained inducer (stacking)
• Weighted Majority (WM)
  – Weights each algorithm in proportion to its training set accuracy
  – Uses this weight in the performance element (and on test set predictions)
  – Mistake bound for WM
• Bootstrap Aggregating (Bagging)
  – Voting system for a collection of algorithms
  – Training set for each member: sampled with replacement
  – Works for unstable inducers (search for h sensitive to perturbations in D)
• Stacked Generalization (aka Stacking)
  – Hierarchical system for combining inducers (ANNs or other inducers)
  – Training sets for "leaves": sampled with replacement; combiner: validation set
• Single-Pass: Train Classification and Combiner Inducers Serially
• Static Structures: Ignore Input Signal
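The Stacked Generalization bullets above describe a two-level procedure: base ("leaf") inducers trained on bootstrap resamples of the training data, and a combiner trained on a held-out validation set using the leaves' predictions as its inputs. The following minimal Python sketch illustrates that flow under stated assumptions: the scikit-learn decision trees used as leaves, the logistic-regression combiner, the 70/30 split, and numeric class labels are illustrative choices, not part of the lecture.

    # Minimal stacked-generalization sketch (illustrative; assumes scikit-learn
    # and numpy arrays X, y with numeric class labels).
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    def train_stacking(X, y, k=5, random_state=0):
        rng = np.random.default_rng(random_state)
        X_tr, X_val, y_tr, y_val = train_test_split(
            X, y, test_size=0.3, random_state=random_state)
        m = len(X_tr)
        leaves = []
        for _ in range(k):
            idx = rng.integers(0, m, size=m)   # bootstrap resample, with replacement
            leaves.append(DecisionTreeClassifier(max_depth=3).fit(X_tr[idx], y_tr[idx]))
        # The combiner sees the leaves' predictions on the validation set as features.
        Z_val = np.column_stack([leaf.predict(X_val) for leaf in leaves])
        combiner = LogisticRegression(max_iter=1000).fit(Z_val, y_val)
        return leaves, combiner

    def predict_stacking(leaves, combiner, X):
        # Route each new instance through every leaf, then let the combiner decide.
        Z = np.column_stack([leaf.predict(X) for leaf in leaves])
        return combiner.predict(Z)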
Principle: Improving Weak Classifiers
[Figure: example orderings produced by the First Classifier, the Second Classifier, Both Classifiers, and the Mixture Model; the diagram itself is not recoverable from this text preview.]

Framework: Data Fusion and Mixtures of Experts
• What Is a Weak Classifier?
  – One not guaranteed to do better than random guessing (1 / number of classes)
  – Goal: combine multiple weak classifiers to get one at least as accurate as the strongest
• Data Fusion
  – Intuitive idea
    • Multiple sources of data (sensors, domain experts, etc.)
    • Need to combine them systematically and plausibly
  – Solution approaches
    • Control of intelligent agents: Kalman filtering
    • General: mixture estimation (sources of data → predictions to be combined)
• Mixtures of Experts
  – Intuitive idea: "experts" express hypotheses (drawn from a hypothesis space)
  – Solution approach (next time)
    • Mixture model: estimate mixing coefficients (see the gating sketch at the end of this document)
    • Hierarchical mixture models: divide-and-conquer estimation method

Weighted Majority: Idea
• Weight-Based Combiner
  – Weighted votes: each prediction algorithm (classifier) h_i maps from x ∈ X to h_i(x)
  – Resulting prediction is in the set of legal class labels
  – NB: as for the Bayes Optimal Classifier, the resulting predictor is not necessarily in H
• Intuitive Idea
  – Collect votes from the pool of prediction algorithms for each training example
  – Decrease the weight associated with each algorithm that guessed wrong (by a multiplicative factor)
  – Combiner predicts the weighted majority label (see the weight-update sketch after the Bagging slide below)
• Performance Goals
  – Improving training set accuracy
    • Want to combine weak classifiers
    • Want to bound the number of mistakes in terms of the minimum made by any one algorithm
  – Hope that this results in good generalization quality

Bagging: Idea
• Bootstrap Aggregating, aka Bagging
  – Application of bootstrap sampling
    • Given: set D containing m training examples
    • Create S[i] by drawing m examples at random with replacement from D
    • S[i] of size m: expected to leave out about 0.37 of the examples in D
  – Bagging
    • Create k bootstrap samples S[1], S[2], …, S[k]
    • Train a distinct inducer on each S[i] to produce k classifiers
    • Classify a new instance by classifier vote (equal weights)
• Intuitive Idea
  – "Two heads are better than one"
  – Produce multiple classifiers from one data set
    • NB: same inducer (multiple instantiations) or …
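The Weighted Majority slide above is close to pseudocode already: every algorithm starts with weight 1, each algorithm that predicts the wrong label has its weight multiplied by a constant beta < 1, and the combiner outputs the label with the largest total weight. For beta = 1/2 the standard analysis bounds the combiner's mistakes by a constant multiple of (m + log2 n), where n is the number of algorithms and m is the number of mistakes made by the best of them. A minimal sketch in plain Python; the callable-expert interface and the single online pass over a labeled stream are assumptions for illustration.

    # Weighted Majority sketch (illustrative): multiplicative weight update, beta < 1.
    # `experts` is assumed to be a list of callables mapping an instance x to a label;
    # `stream` yields (x, y) pairs of instances and true labels.
    def weighted_majority(experts, stream, beta=0.5):
        weights = [1.0] * len(experts)
        mistakes = 0
        for x, y in stream:
            preds = [h(x) for h in experts]
            votes = {}
            for w, label in zip(weights, preds):
                votes[label] = votes.get(label, 0.0) + w
            prediction = max(votes, key=votes.get)   # weighted-majority label
            if prediction != y:
                mistakes += 1
            # Penalize every expert that guessed wrong, regardless of the combiner's outcome.
            weights = [w * beta if p != y else w
                       for w, p in zip(weights, preds)]
        return weights, mistakes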

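The Bagging slide maps directly onto code: draw k bootstrap samples of size m, train one classifier per sample, and classify new instances by an equal-weight vote. The 0.37 figure on the slide is the expected fraction of D left out of each bootstrap sample, since (1 - 1/m)^m ≈ 1/e ≈ 0.368 for large m. In the sketch below, the use of scikit-learn decision trees as the base inducer is an assumption; the slide leaves the inducer unspecified.

    # Bagging sketch (illustrative; base inducer choice is an assumption).
    # X, y are assumed to be numpy arrays.
    import numpy as np
    from collections import Counter
    from sklearn.tree import DecisionTreeClassifier

    def bagging_fit(X, y, k=10, random_state=0):
        rng = np.random.default_rng(random_state)
        m = len(X)
        classifiers = []
        for _ in range(k):
            idx = rng.integers(0, m, size=m)   # bootstrap sample S[i], size m, with replacement
            classifiers.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
        return classifiers

    def bagging_predict(classifiers, X):
        # Equal-weight vote over the k classifiers, one prediction per instance.
        all_preds = np.array([clf.predict(X) for clf in classifiers])   # shape (k, n)
        return np.array([Counter(col).most_common(1)[0][0] for col in all_preds.T])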

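The Mixtures of Experts slide defers parameter estimation to the next lecture, but the combination rule it alludes to can be stated now: a gating function assigns each expert a mixing coefficient (non-negative, summing to 1), and the mixture's output is the coefficient-weighted sum of the expert outputs. The numpy sketch below shows only that forward pass; the linear experts and linear-softmax gate are illustrative assumptions, not a summary of the Jordan and Jacobs paper.

    # Mixture-of-experts forward pass sketch (illustrative).
    # Experts produce real-valued outputs; a softmax gate over the input x supplies
    # mixing coefficients g_i(x), so the output is sum_i g_i(x) * expert_i(x).
    import numpy as np

    def softmax(z):
        z = z - np.max(z)           # subtract the max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    def mixture_predict(x, expert_params, gate_weights):
        # expert_params: list of weight vectors, one linear expert each (assumed).
        # gate_weights: (num_experts, dim) matrix for the linear-softmax gate (assumed).
        expert_outputs = np.array([w @ x for w in expert_params])
        g = softmax(gate_weights @ x)            # mixing coefficients, sum to 1
        return g @ expert_outputs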