Slide titles

1. Machine Learning in Drug Design
2. Collaborators
3. Outline
4. Drugs Typically Are…
5. Example of Binding
6. So To Design a Drug:
7. Molecule Binds Target But May:
8. And Every Body is Different:
9. Slide 9
10. Places to use Machine Learning
11. Slide 11
12. Healthy vs. Disease
13. If We Could Sequence DNA Quickly and Cheaply, We Could:
14. Problem: Can’t Sequence Quickly
15. Slide 15
16. Example of SNP Data
17. Problem: SNPs are not Genes
18. Problem: Even SNPs are Costly
19. Why Machine Learning?
20. Slide 20
21. Decision Trees in One Picture
22. PowerPoint Presentation
23. Naïve Bayes in One Picture
24. Voting Approach
25. Task: Predict Early Onset Disease From SNP Data
26. Results
27. Lessons
28. Slide 28
29. Slide 29
30. Typical Practice when Target Structure is Unknown
31. An Example of Structure Learning
32. Inductive Logic Programming
33. The Logical Representation of a Pharmacophore
34. Background Knowledge I
35. Background knowledge II
36. Central Idea: Generalize by searching a lattice
37. Conformational model
38. Pharmacophore description
39. Example 1: Dopamine agonists
40. Pharmacophore identified
41. Example II: ACE inhibitors
42. Experiment 1
43. ACE pharmacophore
44. Pharmacophore discovered
45. Experiment 2
46. Slide 46
47. Example III: Thermolysin inhibitors
48. Key binding site interactions
49. Interactions made by inhibitors
50. Pharmacophore Identification
51. Thermolysin Results
52. Thermolysin results
53. Example IV: Antibacterial peptides
54. Pharmacophore Identified
55. Slide 55
56. Ongoing ILP developments (pharmacophores)
57. Ongoing developments (Other)

Machine Learning in Drug Design

David Page
Dept. of Biostatistics and Medical Informatics and Dept. of Computer Sciences

Collaborators

Michael Waddell
Paul Finn
Ashwin Srinivasan
John Shaughnessy
Bart Barlogie
Frank Zhan
Stephen Muggleton
Arno Spatola
Sean McIlwain
Brian Kay

Outline

- Overview of Drug Design
- How Machine Learning Fits Into the Process
- Target Search: Single Nucleotide Polymorphisms (SNPs)
- Machine Learning from Feature Vectors
  - Decision Trees
  - Support Vector Machines
  - Voting/Ensembles
- Predicting Molecular Activity: Learning from Structure

Drugs Typically Are…

- Small organic molecules that…
- Modulate disease by binding to some target protein…
- At a location that alters the protein’s behavior (e.g., antagonist or agonist).
- The target protein might be human (e.g., ACE for blood pressure) or belong to an invading organism (e.g., a surface protein of a bacterium).

Example of Binding

So To Design a Drug:

1. Identify the target protein
   - Knowledge of proteome/genome
   - Relevant biochemical pathways
2. Determine the target site structure
   - Crystallography, NMR
   - Difficult if membrane-bound
3. Synthesize a molecule that will bind
   - Imperfect modeling of structure
   - Structures may change at binding
   - And even then…

Molecule Binds Target But May:

- Bind too tightly or not tightly enough.
- Be toxic.
- Have other effects (side-effects) in the body.
- Break down as soon as it gets into the body, or not leave the body soon enough.
- Not get to where it should in the body (e.g., crossing the blood-brain barrier).
- Not diffuse from gut to bloodstream.

And Every Body is Different:

- Even if a molecule works in the test tube and works in animal studies, it may not work in people (it will fail in clinical trials).
- A molecule may work for some people but not others.
- A molecule may cause harmful side-effects in some people but not others.

Places to use Machine Learning

- Finding target proteins.
- Inferring target site structure.
- Predicting who will respond positively/negatively.

Healthy vs. Disease

[Figure labels: Diseased, Healthy]

If We Could Sequence DNA Quickly and Cheaply, We Could:

- Sequence DNA of people taking a drug, and use ML to identify consistent differences between those who respond well and those who do not.
- Sequence DNA of cancer cells and healthy cells, and use ML to detect dangerous mutations; the proteins these genes code for may be useful targets.
- Sequence DNA of people who get a disease and those who don’t, and use ML to determine genes related to susceptibility; the proteins these genes code for may be useful targets.

Problem: Can’t Sequence Quickly

- Can quickly test single positions where variation is common: Single Nucleotide Polymorphisms (SNPs).
- Can quickly test the degree to which every gene is being transcribed: Gene Expression Microarrays (e.g., Affymetrix Gene Chips™).
- Can (moderately) quickly test which proteins are present in a sample (Proteomics).

Example of SNP Data

Person      SNP 1   SNP 2   SNP 3   . . .   CLASS
Person 1    C T     A G     T T     . . .   old
Person 2    C C     A G     C T     . . .   young
Person 3    T T     A A     C C     . . .   old
Person 4    C T     G G     T T     . . .   young
. . .       . . .   . . .   . . .   . . .   . . .

Problem: SNPs are not Genes

- If we find a predictive SNP, it may not be part of a gene. We can only infer that the SNP is “near” a gene that may be involved in the disease.
- Even if the SNP is part of a gene, it may be another nearby gene that is the key gene.

Problem: Even SNPs are Costly

- Typically cannot use all known SNPs.
- Can focus on a particular chromosome and area if knowledge permits that.
- Can use a scattering of SNPs, since SNPs that are very close together may be redundant… use one SNP per haplotype block, or region where recombination is
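As a rough illustration of how the rows in "Example of SNP Data" could become the feature vectors that the outline's learners (decision trees, support vector machines, voting) expect, here is a minimal Python sketch. The slides do not say how genotypes are encoded, so the one-hot encoding, the assumption that allele order within a pair carries no meaning, and the helper names `normalize`, `build_vocabulary`, and `one_hot` are all illustrative choices rather than the author's method.

```python
# Toy rows copied from the "Example of SNP Data" table: two alleles per SNP
# position plus a class label. The encoding below is an assumed, illustrative one.
PEOPLE = [
    ("Person 1", ["CT", "AG", "TT"], "old"),
    ("Person 2", ["CC", "AG", "CT"], "young"),
    ("Person 3", ["TT", "AA", "CC"], "old"),
    ("Person 4", ["CT", "GG", "TT"], "young"),
]


def normalize(genotype):
    """Sort the two alleles so that 'TC' and 'CT' count as the same genotype
    (assumption: the order of alleles within a pair is not meaningful)."""
    return "".join(sorted(genotype))


def build_vocabulary(people):
    """For each SNP position, list the distinct genotypes seen in the data."""
    n_snps = len(people[0][1])
    return [
        sorted({normalize(snps[i]) for _, snps, _ in people})
        for i in range(n_snps)
    ]


def one_hot(snps, vocabulary):
    """Encode one person's genotypes as a flat 0/1 vector, one slot per
    (SNP position, observed genotype) pair."""
    vector = []
    for genotype, seen in zip(snps, vocabulary):
        g = normalize(genotype)
        vector.extend(1 if g == candidate else 0 for candidate in seen)
    return vector


vocabulary = build_vocabulary(PEOPLE)
features = [one_hot(snps, vocabulary) for _, snps, _ in PEOPLE]
labels = [label for _, _, label in PEOPLE]

for (name, _, label), vector in zip(PEOPLE, features):
    print(name, vector, label)
```

The resulting `features` and `labels` lists have the shape most feature-vector learners take as input. With only a subset of SNPs measured, the same encoding applies to whichever positions were selected.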