CMU BSC 03510 - Lecture - D2042614


School name Carnegie Mellon University

Course BSC 03510

Pages 68

Computational Biology, Part 7 Supervised Machine Learning and Searching for Sequence Families
Slide 2
What is Machine Learning?
Fundamental Question of Machine Learning
Why Machine Learning?
Slide 6
Successful Machine Learning Applications
Machine Learning Paradigms
Supervised Learning
Classification vs. Regression
Representation
Formal description
Inductive learning hypothesis
Hypothesis space
Slide 15
k-Nearest Neighbor (kNN)
Slide 17
Slide 18
Linear Discriminants
Decision trees
Slide 21
Slide 22
Slide 23
Support vector machines
Support Vector Machines (SVMs)
Slide 26
Slide 27
Slide 28
Slide 29
Slide 30
Slide 31
Slide 32
Cross-Validation
Describing classifier errors
Confusion matrix - binary
Precision-recall analysis
Slide 37
Confusion matrix – multi-class
Ground truth
Stating Goals vs. Approaches
Slide 41
Resources
Slide 43
Goals for sequence families
Possible Approaches
PSSMs
Learning PSSMs
Position Specific Iterated BLAST (PSI-BLAST)
Problems with PSSMs
Cobbling
Slide 51
Slide 52
Cobbler Illustration
Family Pairwise Search
Slide 55
Which method is best?
Comparison Protocol
Evaluation
Evaluation metric - ROC
Example of Evaluation for ROC2
Protocol for Comparison of Methods
Results
Conclusion
Comparison Protocol
Which is best (part 2)?
Slide 66
Slide 67
Conclusions

Computational Biology, Part 7
Supervised Machine Learning and Searching for Sequence Families
Robert F. Murphy
Copyright 2008-2009. All rights reserved.

www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf

What is Machine Learning?
- Fundamental Question of Computer Science: How can we build machines that solve problems, and which problems are inherently tractable/intractable?
- Fundamental Question of Statistics: What can be inferred from data plus a set of modeling assumptions, with what reliability?
Tom Mitchell white paper

Fundamental Question of Machine Learning
- How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?
  Tom Mitchell
Tom Mitchell white paper

Why Machine Learning?
- Learn relationships from large sets of complex data: Data mining
  - Predict clinical outcome from tests
  - Decide whether someone is a good credit risk
- Do tasks too complex to program by hand
  - Autonomous driving
- Customize programs to user needs
  - Recommend book/movie based on previous likes
Tom Mitchell white paper

Why Machine Learning?
- Economically efficient
- Can consider larger data spaces and hypothesis spaces than people can
- Can formalize learning problem to explicitly identify/describe goals and criteria

Successful Machine Learning Applications
- Speech recognition
  - Telephone menu navigation
- Computer vision
  - Mail sorting
- Bio-surveillance
  - Identifying disease outbreaks
- Robot control
  - Autonomous driving
- Empirical science
Tom Mitchell white paper

Machine Learning Paradigms
- Supervised Learning
  - Classification
  - Regression
- Unsupervised Learning
  - Clustering
- Semi-supervised Learning
  - Cotraining
  - Active learning

Supervised Learning
- Approaches
  - Classification (discrete predictions)
  - Regression (continuous predictions)
- Common considerations
  - Representation (Features)
  - Feature Selection
  - Functional form
  - Evaluation of predictive power

Classification vs. Regression
- If I want to predict whether a patient will die from a disease within six months, that is classification
- If I want to predict how long the patient will live, that is regression

Representation
- Definition of thing or things to be predicted
  - Classification: classes
  - Regression: regression variable
- Definition of things (instances) to make predictions for
  - Individuals
  - Families
  - Neighborhoods, etc.
- Choice of descriptors (features) to describe different aspects of instances

Formal description
- Defining X as a set of instances x described by features
- Given training examples D from X
- Given a target function c that maps X->{0,1}
- Given a hypothesis space H
- Determine an hypothesis h in H such that h(x)=c(x) for all x in D
Courtesy Tom Mitchell

Inductive learning hypothesis
- Any hypothesis found to approximate the
