CMU BSC 03510 - Lecture
Pages 48

This preview shows page 1-2-3-23-24-25-26-46-47-48 out of 48 pages.



Computational Biology, Part 14 Expression array cluster analysis
Gene Expression Microarray
PowerPoint Presentation
Gene Expression in different organs
Gene Expression in single organ
Microarray raw data
Example microarray image
Slide 8
Data extraction
Slide 10
Distances
Slide 12
Slide 13
Slide 14
General Multivariate Dataset
Multivariate Sample Mean
Multivariate Variance
Slide 18
Covariance Matrix
Slide 20
Univariate Distance
Univariate z-score Distance
Bivariate Euclidean Distance
Multivariate Euclidean Distance
Effects of variance and covariance on Euclidean distance
Mahalanobis Distance
Other distance functions
Pearson correlation coefficient
Software for performing microarray cluster analysis
Input data for clustering
Hierarchical vs. k-means clustering
Slide 32
Slide 33
Slide 34
Slide 35
Slide 36
Slide 37
Slide 38
Slide 39
Slide 40
Slide 41
Slide 42
Slide 43
K-means Questions
Slide 45
Slide 46
Choosing the number of Centers
Clustering genes and conditions

Computational Biology, Part 14
Expression array cluster analysis
Robert F. Murphy, Shann-Ching Chen
Copyright © 2004-2007. All rights reserved.

Gene Expression Microarray
• A popular method to detect mRNA expression levels since it was reported by the Pat Brown Laboratory in 1995 and by Affymetrix in 1996
• Different technologies for producing the microarray chips and different approaches for analyzing microarray data
• Data should be processed and analyzed carefully

What is gene expression and why is it important?

Gene Expression in different organs
• A highly specific process in which a gene is switched on, and therefore begins production of its protein.
Sources: image from the Cancer Genome Anatomy Project (CGAP), Conceptual Tour, July 21, 2000. http://www.ncbi.nlm.nih.gov/Class/MLACourse/Modules/MolBioReview/gene_expression.html

Gene Expression in single organ
• Gene expression also varies within a certain type of cell at different points in time.
Sources: image from the Cancer Genome Anatomy Project (CGAP), Conceptual Tour, July 21, 2000. http://www.ncbi.nlm.nih.gov/Class/MLACourse/Modules/MolBioReview/gene_expression.html

Microarray raw data
• Label mRNA from one sample with a red fluorescence probe (Cy5) and mRNA from another sample with a green fluorescence probe (Cy3)
• Hybridize to a chip with specific DNAs fixed to each well
• Measure amounts of green and red fluorescence
Flash animations:
• PCR: http://www.maxanim.com/genetics/PCR/PCR.htm
• Microarray: http://www.bio.davidson.edu/Courses/genomics/chip/chip.html

Example microarray image
Microarray is a technology for detecting mRNA expression levels globally (thousands of genes simultaneously).
mRNA expression microarray data for 9800 genes (gene number shown vertically) for 0 to 24 h (time shown horizontally) after addition of serum to a human cell line that had been deprived of serum (from http://genome-www.stanford.edu/serum)

Data extraction
• Adjust fluorescent intensities using standards (as necessary)
• Calculate ratio of red to green fluorescence
• Convert to log2 and round to integer
• Display on a color scale from saturated green = -3 through black = 0 to saturated red = +3

Unsupervised clustering algorithms
Many different types:
• Hierarchical clustering
• k-means clustering
• Self-organising maps
• Hill Climbing
• Simulated Annealing
All share the same three basic tasks:
1. Pattern representation: patterns or features in the data.
2. Pattern proximity: a measure of the distance or similarity defined on pairs of patterns.
3. Pattern grouping: methods and rules used in grouping the patterns.

Distances
• High dimensionality
• Based on vector geometry: how close are two data points?
• Based on distances to determine clusters

          Array 1   Array 2
Gene 1    1         4
Gene 2    1         3
…

Distance(Gene 1, Gene 2) = 1
[Figure: Gene 1 and Gene 2 plotted as points on Array 1 and Array 2 axes]

Bivariate Euclidean Distance

          sample 1  sample 2
Gene a    a1        a2
Gene b    b1        b2

[Figure: Gene a = (a1, a2) and Gene b = (b1, b2) plotted on Sample 1 and Sample 2 axes, with the distance between them marked]

General Multivariate Dataset
• We are given values of $p$ variables for $n$ independent observations
• Construct an $n \times p$ matrix $M$ consisting of vectors $X_1$ through $X_n$, each of length $p$

Multivariate Sample Mean
• Define mean vector $I$ of length $p$
Matrix notation: $I(j) = \frac{\sum_{i=1}^{n} M(i,j)}{n}$
Vector notation: $I = \frac{\sum_{i=1}^{n} X_i}{n}$

Multivariate Variance
• Define variance vector $\sigma^2$ of length $p$
Matrix notation: $\sigma^2(j) = \frac{\sum_{i=1}^{n} \left( M(i,j) - I(j) \right)^2}{n-1}$
Vector notation (square taken elementwise): $\sigma^2 = \frac{\sum_{i=1}^{n} \left( X_i - I \right)^2}{n-1}$

Covariance Matrix
Define a
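The sample mean, variance, and distance definitions above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture. The function names (`column_means`, `column_variances`, `euclidean_distance`) are hypothetical. Treating each gene as one observation (a row of $M$) and each array as one variable (a column) is an assumption made here for the example. The distance function uses the standard Euclidean formula, which the preview names but does not spell out. The data values are the Gene 1 and Gene 2 rows from the Distances slide.

```python
import math


def column_means(M):
    """Mean vector I of length p: I(j) = sum_i M(i, j) / n."""
    n = len(M)
    p = len(M[0])
    return [sum(row[j] for row in M) / n for j in range(p)]


def column_variances(M):
    """Variance vector: sigma^2(j) = sum_i (M(i, j) - I(j))^2 / (n - 1)."""
    n = len(M)
    if n < 2:
        raise ValueError("need at least two observations for n - 1 in the denominator")
    means = column_means(M)
    return [
        sum((row[j] - means[j]) ** 2 for row in M) / (n - 1)
        for j in range(len(means))
    ]


def euclidean_distance(x, y):
    """Standard Euclidean distance between two vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Rows are genes (observations), columns are arrays (variables).
# Assumption: this orientation is chosen for illustration only.
M = [
    [1, 4],  # Gene 1: Array 1, Array 2
    [1, 3],  # Gene 2: Array 1, Array 2
]

means = column_means(M)
variances = column_variances(M)
distance = euclidean_distance(M[0], M[1])

print("mean vector:", means)
print("variance vector:", variances)
print("Distance(Gene 1, Gene 2):", distance)
```

With these two rows the computed distance is 1, which agrees with the value shown on the Distances slide. For real expression data with thousands of genes, a numerical library would likely be a better fit than plain lists.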

