
Making sense of microarrays
Isaac S Kohane
How I learned to love Bioinformatics…

RNA expression detection chips
Schena M, et al. Proc Natl Acad Sci USA; 93: 10614 (1996).
Entire issue. Nature Genetics, 21: supplement (Jan 1999).
[Figure labels: Tissue, or Tissue under influence; RNA; Tagged with fluor; cDNA spotted on glass slide or oligonucleotides built on slide]

Two Biochip Technologies
- Affymetrix
- P. Brown / Stanford

What is a microarray?
- Low cost. The cost should be such that at least hundreds of samples can be measured within a typical NIH R01 budget.
- Commodity-level workflow. The microarray should be commoditized such that a routine set of procedures requiring no scientific judgment can be performed using standard equipment to obtain the needed measurement.
- Automation. Data acquisition should be automated so that once the biomaterial, whether it is protein, RNA or DNA, is loaded into the analytic pipeline, most of the steps run without intervention and those that do not can be done by a non-specialized technician.
- Form factor. The equipment required should easily fit into a standard laboratory bench format and not require its own room.

What is a microarray? (II)
- Translational friendliness. A clinical investigator does not have to understand molecular biology techniques in order to provide the necessary materials for the acquisition of the microarray data.
- Identifiability. All items identified by the microarray technology, whether they are proteins or RNA species, should be automatically identified against standard reference nomenclatures.
- High throughput. Hundreds of patient samples can be processed within days.
- Commodity-level priced infrastructure. The technology, equipment and budget should be available to most biological and clinical investigational laboratories.
- Massively parallel measurements of the relevant analytes. That is, the members of the transcriptome, the members of the proteome, the metabolome or any other comprehensive measure of molecular physiology.

Can we build microarrays for proteomics?

Quantitative Proteomics

So, do we have proteomic microarrays?

Making sense of all the data
- The underlying dogma
- What data may be included in the data sets?
- Beyond genomic data
- Multiple scales and non-numeric measures

Earlier versions of the dogma
- What do you see in the night sky?

Taxonomy of machine learning
[Figure: classifications split as Non-exclusive vs. Exclusive, Supervised vs. Unsupervised, Hierarchical vs. Partitional]

Phylogenetic-type tree
[Figure: table of RNA expression for Gene 1, Gene 2 and Gene 3 across Experiment 1, Experiment 2 and Experiment 3; pairwise correlation coefficients between the genes (.88, -.19, -.62); resulting tree over Gene 1, Gene 2 and Gene 3]

Phylogenetic-type tree / Correlation coefficient
- Similarity score computed for two genes over the same conditions, similar to Pearson's correlation coefficient
- Found that redundant representations and similarly functioning genes cluster together (for S. cerevisiae)
- Also found 10 temporal clusters in 8613 genes in the response of human fibroblasts to serum
- Suggests a role for over 200 genes with previously unknown function
Eisen MB, PNAS 1998;95(25):14863-8.
Iyer VR, Science 1999;283(25):83-7.

Self-organizing maps
[Figure: the same table of RNA expression for Gene 1, Gene 2 and Gene 3 across Experiment 1, Experiment 2 and Experiment 3, with the genes plotted against axes Experiment 1, Experiment 2 and Experiment 3]

Relevance Networks
- Several algorithms have already been developed for knowledge discovery and data mining of RNA expression data sets
- We are interested in finding networks of genes that are functionally clustered with little or no a priori knowledge (unsupervised learning)
- Relevance Networks are an approach to analyzing these data sets
- Previously validated in the clinical laboratory result domain
Children's Hospital, Patent Pending.
Butte AJ, Kohane IS, Unsupervised Knowledge Discovery in Medical Databases Using Relevance Networks, Symposia AMIA, 1999.

Construction of Relevance Networks 1
- Patients and cell lines are analyzed as cases
- Clinical parameters, laboratory tests, RNA expression, and susceptibility to anti-cancer agents are all example features of those cases
[Figure: table with one row per case (Patient, Cell Line, Time, etc.) and one column per feature: Lab Test 1, Lab Test 2, Clinical Param 1, RNA Expr J02923, Susceptibility to Anti-cancer Agent 169517]

Construction of Relevance Networks 2
- For all pairs of features, we take overlapping values over the cases and make a scatter plot of values
[Figure: the case-by-feature table, with a scatter plot of Susceptibility to Anti-cancer Agent 169517 against Lab Test 2]

Construction of Relevance Networks 3
- Perform a pairwise comparison between all features
- For each scatter plot, we fit a linear model and store the correlation coefficient r
- Every feature is completely connected to every other feature by a linear model of varying quality
[Figure: scatter plot of Susceptibility to Anti-cancer Agent 169517 against Lab Test 2, r = 0.65]

Construction of Relevance Networks 4
- $\hat{r}^2 = r^2 \cdot r / |r|$ (r squared, keeping the sign of r)
- Choose a threshold $\hat{r}^2$ to split the network
- Drop links with $\hat{r}^2$ under threshold
- Breaks the completely connected network into islands where connections are stronger than threshold
- Islands are what we call "relevance networks"
- Display graphically, with thick lines representing strongest links
[Figure: network with nodes Lab Test 1, Lab Test 2, Clinical Param 1, Clinical Param 2, Expression of J02923 and Susceptibility to 169517, each link labeled with its $\hat{r}^2$]
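
The four construction steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the feature names, the data values and the 0.8 threshold are all made up for the example, and two choices are assumptions of the sketch rather than statements from the slides: the threshold is applied to the magnitude of the signed r squared (so strong negative links survive), and pairs with fewer than three overlapping cases get no link.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r over the cases where both features have a value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 3:
        return 0.0  # assumption: too few overlapping cases means no link
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature carries no linear signal
    return sxy / sqrt(sxx * syy)


def signed_r2(r):
    """r squared with the sign of r restored: r^2 * r / abs(r)."""
    return 0.0 if r == 0 else r * r * r / abs(r)


def relevance_networks(features, threshold):
    """Return (links, islands) for a dict of feature name -> per-case values."""
    # Steps 2 and 3: compare every pair of features and score the linear fit.
    links = {}
    for a, b in combinations(sorted(features), 2):
        score = signed_r2(pearson_r(features[a], features[b]))
        # Step 4: keep only links at least as strong as the threshold
        # (assumption: strength is the magnitude of the signed r squared).
        if abs(score) >= threshold:
            links[(a, b)] = score

    # The surviving links split the features into connected islands.
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    islands, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        island, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in island:
                island.add(node)
                stack.extend(neighbours[node] - island)
        seen |= island
        islands.append(sorted(island))
    return links, islands


# Hypothetical cases (one list position per patient or cell line);
# None marks a missing measurement.
features = {
    "lab_test_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lab_test_2": [2.1, 3.9, 6.2, 8.0, None],
    "expr_J02923": [5.0, 4.0, 3.1, 2.0, 1.0],
    "clinical_param_1": [3.0, 1.0, 4.0, 1.0, 5.0],
}
links, islands = relevance_networks(features, threshold=0.8)
for (a, b), score in sorted(links.items()):
    print(f"{a} -- {b}: signed r^2 = {score:+.3f}")
print(islands)
```

In this toy data set the two lab tests and the expression feature end up in one island, while `clinical_param_1` has no link above the threshold and so belongs to no island. Raising or lowering the threshold is the only knob: a lower value merges islands, a higher value breaks them apart.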



MIT 6.872 - Making sense of microarrays
