Computational Biology, Part 22
Automated Interpretation of Subcellular Patterns in Microscope Images II

Robert F. Murphy
Copyright 1996, 1999, 2000-2008. All rights reserved.

Slides
  1. Computational Biology, Part 22: Automated Interpretation of Subcellular Patterns in Microscope Images II
  2. Representing distributions using Fourier transforms
  3. Frequency representation
  4-7. (untitled)
  8. Demonstration spreadsheet
  9. MATLAB demonstration
  10. Unsupervised Learning to Identify High-Resolution Protein Patterns
  11. Location Proteomics
  12-20. (untitled)
  21. Decomposing (unmixing) complex patterns
  22. Decomposing mixture patterns
  23. Object type determination
  24. Cluster Number Selection
  25. Example of Object Types
  26. Unmixing: Learning strategy
  27. (untitled)
  28. Two-stage Strategy for unmixing unknown image

Representing distributions using Fourier transforms

Frequency representation
  Any signal may be represented as the sum of many sinusoids.
  As more sinusoids are added to the sum, the representation of the original signal becomes more and more accurate.

Frequency representation
  On the left below is a square wave.
  On the right is a single sinusoid with a DC offset which begins to approximate the original data.

Frequency representation
  Now, a second sinusoid is added to the first to create a better approximation.
  The summation may be seen by noting how the first sinusoid is raised and lowered depending on whether the second is positive or negative.

Frequency representation
  Adding still another sinusoid further improves the approximation.

Frequency representation
  Any discrete distribution can be represented in a completely reversible manner (to numerical accuracy) by as many sinusoids as there are points in the distribution.

Demonstration spreadsheet
  DemoC3.xls

MATLAB demonstration
  fftillustrator.m

Unsupervised Learning to Identify High-Resolution Protein Patterns

Location Proteomics
  Tag many proteins
    cDNA tagging
      Put individual cDNAs into a GFP tagging vector (puts GFP coding at end)
      Transfect individual clones with each tagged cDNA
    CD-tagging (developed by Jonathan Jarvik and Peter Berget):
      Infect a population of cells with a retrovirus carrying a DNA sequence that will “tag” a random gene in each cell
  Isolate separate clones, each of which expresses one tagged protein
  Use RT-PCR to identify the tagged gene in each clone
  Collect many live cell images for each clone using spinning disk confocal fluorescence microscopy or automated high-throughput microscopy

Images of CD-tagged 3T3 cells

  SLFs (subcellular location features) can be used to measure similarity of protein patterns
  This allows us for the first time to create a systematic, objective framework for describing subcellular locations: a Subcellular Location Tree
  Start by grouping the two proteins whose patterns are most similar, then keep adding branches for less and less similar patterns
  Chen et al 2003; Chen and Murphy 2005

[Figure labels: Protein name; Human description; From databases. Source: http://murphylab.web.cmu.edu/services/PSLID/tree.html]

[Figure labels: Nucleolar Proteins; Punctate Nuclear Proteins; Predominantly Nuclear Proteins with Some Punctate Cytoplasmic Staining; Nuclear and Cytoplasmic Proteins with Some Punctate Staining; Uniform]

[Figure labels: Bottom: Visual Assignment to “known” locations; Top: Automated Grouping and Assignment; Protein name. Source: http://murphylab.web.cmu.edu/services/PSLID/tree.html]

Decomposing (unmixing) complex patterns

Decomposing mixture patterns
  Clustering or classifying whole cell patterns will consider each combination of two or more “basic” patterns as a unique new pattern
  Desirable to have a way to decompose mixtures instead
  One approach would be to assume that each basic pattern has a recognizable combination of different types of objects

Object type determination
  Rather than specifying object types, we can choose to learn them from the data
  Use subset of SLFs to describe
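
The frequency-representation slides earlier in this deck can be illustrated with a short sketch. It is not the content of fftillustrator.m or DemoC3.xls, which are not reproduced here; it is a plain-Python approximation under stated assumptions: a 64-point square wave, a naive O(N^2) discrete Fourier transform, and hypothetical helper names (`dft`, `idft`, `partial_reconstruction`, `rms_error`). It shows the two claims made on those slides: the approximation improves as sinusoids are added, and using as many frequency terms as there are points recovers the data to numerical accuracy.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(coeffs, keep=None):
    """Inverse DFT (real part). If `keep` is given, only those frequency
    indices contribute, which yields a partial (approximate) reconstruction."""
    n = len(coeffs)
    ks = range(n) if keep is None else keep
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in ks).real / n
        for t in range(n)
    ]


def partial_reconstruction(x, n_sinusoids):
    """Rebuild x from the DC offset plus the n_sinusoids lowest frequencies.
    For real data, indices k and N-k together form one real sinusoid."""
    n = len(x)
    keep = {0}
    for k in range(1, n_sinusoids + 1):
        keep.add(k)
        keep.add(n - k)
    return idft(dft(x), sorted(keep))


def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


# Assumed test signal: one period of a square wave sampled at 64 points.
N = 64
square = [1.0 if t < N // 2 else 0.0 for t in range(N)]

# Error after keeping the DC offset plus 1, 3, 5, and 15 sinusoids.
errors = [rms_error(square, partial_reconstruction(square, m)) for m in (1, 3, 5, 15)]

# Using all N frequency terms reverses the transform.
full = idft(dft(square))
round_trip_error = max(abs(p - q) for p, q in zip(square, full))

print(errors)
print(round_trip_error)
```

The counts 1, 3, 5, and 15 are chosen because this particular square wave has no even harmonics, so only odd-numbered sinusoids change the approximation.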
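
The tree-building rule stated on the Subcellular Location Tree slide (group the two most similar patterns first, then keep adding branches for less and less similar ones) can be sketched as bottom-up agglomerative clustering. This is an illustration only: the protein names and two-dimensional feature vectors are made up, and the Euclidean distance and average-linkage rule are assumptions that the slides do not specify. The function names `euclidean`, `average_linkage`, and `build_tree` are likewise hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(members_a, members_b, features):
    """Mean pairwise distance between two groups (assumed linkage rule)."""
    total = sum(
        euclidean(features[a], features[b]) for a in members_a for b in members_b
    )
    return total / (len(members_a) * len(members_b))


def build_tree(features):
    """Repeatedly merge the two most similar groups.

    Returns the tree as nested tuples plus the list of merges, each recorded
    as (sorted member names, distance at which the merge happened)."""
    clusters = [(name, [name]) for name in sorted(features)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = average_linkage(clusters[i][1], clusters[j][1], features)
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        (tree_i, members_i), (tree_j, members_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), members_i + members_j)
        merges.append((sorted(members_i + members_j), d))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters[0][0], merges


# Hypothetical feature vectors standing in for SLFs; not real data.
toy_features = {
    "nucleolar_A": (0.0, 0.0),
    "nucleolar_B": (0.1, 0.0),
    "cytoplasmic_A": (5.0, 5.0),
    "cytoplasmic_B": (5.0, 5.3),
    "uniform_A": (20.0, 0.0),
}

tree, merges = build_tree(toy_features)
for members, distance in merges:
    print(round(distance, 2), members)
```

Average linkage is used here only because it is simple to state; a different distance or linkage rule would give a different tree for the same features.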