CORNELL CS 726 - RATIO-BASED DECISIONS AND THE QUANTITATIVE ANALYSIS OF CDNA MICROARRAY IMAGES


RATIO-BASED DECISIONS AND THE QUANTITATIVE ANALYSIS OF CDNA MICROARRAY IMAGES

Yidong Chen,† Edward R. Dougherty,‡ and Michael L. Bittner†

†National Institutes of Health, National Human Genome Research Institute, Bethesda, Maryland 20892; ‡Texas A&M University, Texas Center for Applied Technology and Department of Electrical Engineering, College Station, Texas 77843

(Paper JBO-150 received Mar. 24, 1997; revised manuscript received June 23, 1997; accepted for publication July 8, 1997.)

ABSTRACT

Gene expression can be quantitatively analyzed by hybridizing fluor-tagged mRNA to targets on a cDNA microarray. Comparison of gene expression levels arising from cohybridized samples is achieved by taking ratios of average expression levels for individual genes. A novel method of image segmentation is provided to identify cDNA target sites and a hypothesis test and confidence interval is developed to quantify the significance of observed differences in expression ratios. In particular, the probability density of the ratio and the maximum-likelihood estimator for the distribution are derived, and an iterative procedure for signal calibration is developed. © 1997 Society of Photo-Optical Instrumentation Engineers. [S1083-3668(97)00504-2]

Keywords: cDNA; microarray; gene expression; image segmentation; Mann–Whitney target detection; ratio density; ratio confidence interval.

1 INTRODUCTION

The recent development of complementary DNA microarray technology provides a powerful analytical tool for human genetic research.¹ One of its basic applications is to quantitatively analyze fluorescence signals that represent the relative abundance of mRNA from two distinct tissue samples. cDNA microarrays are prepared by automatically printing thousands of cDNAs in an array format on glass microscope slides, which provide gene-specific hybridization targets. Two different samples (of mRNA) can be labeled with different fluors and then cohybridized onto each arrayed gene.
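As a rough illustration of the ratio-based comparison described in the abstract, the following Python sketch simulates the ratio of two cohybridized channel intensities for a gene that is expressed equally in both samples. It assumes, purely for this sketch, that the two intensities are independent and normally distributed with a shared coefficient of variation. The function name `null_ratio_interval`, the parameter `cv`, and the value 0.2 are hypothetical choices, and the Monte Carlo quantiles only stand in for the analytically derived ratio density, which is not reproduced here.

```python
import random


def null_ratio_interval(cv, n_draws=100_000, level=0.95, seed=0):
    """Empirical central interval for the ratio red/green under equal expression.

    Assumption for this sketch: both channels are independent normal
    variables with the same mean mu and standard deviation cv * mu.
    Dividing both channels by mu shows that mu cancels in the ratio,
    so the simulation can use a mean of 1.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_draws):
        red = rng.gauss(1.0, cv)
        green = rng.gauss(1.0, cv)
        ratios.append(red / green)
    ratios.sort()
    tail = (1.0 - level) / 2.0
    # Read the empirical quantiles directly from the sorted draws.
    lower = ratios[int(tail * n_draws)]
    upper = ratios[int((1.0 - tail) * n_draws) - 1]
    return lower, upper


lower, upper = null_ratio_interval(cv=0.2)
print(f"Central 95% interval for the ratio: [{lower:.2f}, {upper:.2f}]")
```

Because the common mean cancels in the ratio, the interval in this sketch depends only on `cv`. A measured ratio falling outside it would be a candidate for a meaningful expression difference at the chosen level.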
Ratios of gene expression levels between the samples are calculated and used to detect meaningfully different expression levels between the samples for a given gene.

This paper studies ratio distributions and develops a hypothesis test and confidence interval so that expression ratios may be used for deciding significant differences in sample expressions across the gene population discernible on a microarray. Assuming sample expression levels are independent, levels are normally distributed, and there is a constant coefficient of variation for the entire gene set (a biochemical consequence of the mechanics of transcript production), we derive the probability density of the ratio, find the maximum-likelihood estimator for the distribution, and develop an iterative procedure for signal calibration. Under the aforementioned conditions, we can process a single image and identify outliers. Expression measurements are achieved by processing digitized microarray images, the key imaging development being a nonparametric statistical technique to extract cDNA sites on the slide.

2 BIOLOGICAL BACKGROUND AND CDNA MICROARRAY TECHNOLOGY

A cell relies on its protein components for a wide variety of its functions. The production of energy, the biosynthesis of all component macromolecules, the maintenance of cellular architecture, and the ability to act upon intra- and extracellular stimuli are all protein dependent. Each cell within an organism contains the information necessary to produce the entire repertoire of proteins which that organism can specify. This information is stored as genes within the organism’s DNA genome. The number of human genes is estimated to be 30,000 to 100,000. Within any individual cell, only a portion of the possible gene set is present as protein. Some of the proteins present in a single cell are likely to be present in all cells because they serve functions required in every type of cell, and can be thought of as “housekeeping” proteins.
Other proteins serve specialized functions only required in particular cell types. For example, muscle cells contain specialized proteins that form the dense contractile fibers of a muscle. Given that a large part of a cell’s specific functionality is determined by the genes it is expressing, it is logical that transcription, the first step in the process of converting the genetic information stored in an organism’s genome into protein, would be highly regulated by the control network that coordinates and directs cellular activity.

Address all correspondence to Yidong Chen. NIH/NHGRI/LCG, Bldg. 49, Rm. 4B24, 49 Convent Drive, MSC 4470, Bethesda, MD 20892-4470. Tel: (301) 402-3150; Fax: (301) 402-3241; E-mail: [email protected]/97/$10.00 © 1997 SPIE
JOURNAL OF BIOMEDICAL OPTICS 2(4), 364–374 (OCTOBER 1997)

Regulation is readily observed in studies that scrutinize activities evident in cells configuring themselves for a particular function (specialization into a muscle cell) or state (active multiplication or quiescence). As cells alter their status, coordinate transcription of the protein sets required for this state can be observed. As a window both on cell status and on the system controlling the cell, detailed, global knowledge of the transcriptional state could provide a broad spectrum of information useful to biologists. Knowledge of when and in what types of cell the protein product of a gene of unknown function is expressed would provide useful clues as to the likely function of that gene. Determination of gene expression patterns in normal cells could provide detailed knowledge of the way in which the control system achieves the highly coordinated activation and deactivation required for development and differentiation of a mature organism from a single fertilized egg.
Comparison of gene expression patterns in normal and pathological cells could provide useful diagnostic “fingerprints” and help identify aberrant functions that would be reasonable targets for therapeutic intervention.

The ability to carry out studies in which the transcriptional state of a large number of genes is determined has, until recently, been severely inhibited by limitations on our ability to survey cells for the presence and abundance of a large number of gene transcripts in a single experiment. A primary limitation has been the small number of identified genes. In the case of humans, only a few thousand of the complete set (30,000 to 100,000 genes) have been

