Berkeley STATISTICS 246 - Gene expression

This preview shows page 1-2-17-18-19-35-36 out of 36 pages.


Gene expression
Statistics 246, Week 3

Thesis: the analysis of gene expression data is going to be big in 21st century statistics

Many different technologies, including
• High-density nylon membrane arrays
• Serial analysis of gene expression (SAGE)
• Short oligonucleotide arrays (Affymetrix)
• Long oligo arrays (Agilent)
• Fibre optic arrays (Illumina)
• cDNA arrays (Brown/Botstein)*

[Figure: Total microarray articles indexed in Medline. x-axis: Year, 1995 to 2001 (2001 projected); y-axis: Number of papers, 0 to 600.]

Common themes
• Parallel approach to collection of very large amounts of data (by biological standards)
• Sophisticated instrumentation, requires some understanding
• Systematic features of the data are at least as important as the random ones
• Often more like industrial process than single investigator lab research
• Integration of many data types: clinical, genetic, molecular ... databases

Biological background
Transcription (RNA polymerase):
    DNA     G T A A T C C T C
            | | | | | | | | |
            C A T T A G G A G
    mRNA    G U A A U C C
Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein might be better, but is currently harder.

Reverse transcription
Clone cDNA strands, complementary to the mRNA
    mRNA    G U A A U C C U C
    cDNA    C A T T A G G A G
[Diagram: reverse transcriptase copies the mRNA into many cDNA strands C A T T A G G A G.]

cDNA microarray experiments
mRNA levels compared in many different contexts
• Different tissues, same organism (brain v. liver)
• Same tissue, same organism (ttt v. ctl, tumor v. non-tumor)
• Same tissue, different organisms (wt v. ko, tg, or mutant)
• Time course experiments (effect of ttt, development)
• Other special designs (e.g. to detect spatial patterns).

cDNA microarrays
[Figure: cDNA clones]

cDNA microarrays
Compare the genetic expression in two samples of cells
• PRINT: cDNA from one gene on each spot
• SAMPLES: cDNA labelled red/green, e.g. treatment / control, normal / tumor tissue
• HYBRIDIZE: Add equal amounts of labelled cDNA samples to microarray.
• SCAN: Laser, detector

[Flowchart of a microarray study. Boxes: Biological question (differentially expressed genes, sample class prediction, etc.); Experimental design; Microarray experiment; Image analysis; Normalization; Estimation; Testing; Clustering; Discrimination; Biological verification and interpretation. Data labels: 16-bit TIFF files; (Rfg, Rbg), (Gfg, Gbg); R, G.]

Some statistical questions
• Image analysis: addressing, segmenting, quantifying
• Normalisation: within and between slides
• Quality: of images, of spots, of (log) ratios
• Which genes are (relatively) up/down regulated?
• Assigning p-values to tests/confidence to results.

Some statistical questions, ctd
• Planning of experiments: design, sample size
• Discrimination and allocation of samples
• Clustering, classification: of samples, of genes
• Selection of genes relevant to any given analysis
• Analysis of time course, factorial and other special experiments
... and much more.

Some bioinformatic questions
• Connecting spots to databases, e.g. to sequence, structure, and pathway databases
• Discovering short sequences regulating sets of genes: direct and inverse methods
• Relating expression profiles to structure and function, e.g. protein localisation
• Identifying novel biochemical or signalling pathways
... and much more.

[Figure: Part of the image of one channel, false-coloured on a scale from white (v. high) and red (high) through yellow and green (medium) to blue (low) and black.]

Does one size fit all?

Segmentation: limitation of the fixed circle method
[Figure panels: SRG; Fixed Circle.]
Inside the boundary is spot (foreground), outside is not.

Some local backgrounds
We use something different again: a smaller, less variable value.
[Figure: Single channel grey scale.]

Quantification of expression
For each spot on the slide we calculate
    Red intensity = Rfg - Rbg
    Green intensity = Gfg - Gbg
(fg = foreground, bg = background), and combine them in the log (base 2) ratio
    Log2( Red intensity / Green intensity )

Gene Expression Data
On p genes for n slides: p is O(10,000), n is O(10-100), but growing.
Rows are genes, columns are slides. Gene expression level of gene 5 in slide 4 = Log2( Red intensity / Green intensity ).

    Gene   slide 1   slide 2   slide 3   slide 4   slide 5   ...
    1        0.46      0.30      0.80      1.51      0.90    ...
    2       -0.10      0.49      0.24      0.06      0.46    ...
    3        0.15      0.74      0.04      0.10      0.20    ...
    4       -0.45     -1.03     -0.79     -0.56     -0.32    ...
    5       -0.06      1.06      1.35      1.09     -1.09    ...

These values are conventionally displayed on a red (>0) yellow (0) green (<0) scale.

The red/green ratios can be spatially biased
• Top 2.5% of ratios red, bottom 2.5% of ratios green

The red/green ratios can be intensity-biased
    M = log2(R/G) = log2 R - log2 G
    A = (log2 R + log2 G)/2
Values should scatter about zero.
[Plot legend: Yellow: GAPDH, tubulin. Light blue: MSP pool / titration. Orange: Schadt-Wong rank invariant set. Red line: lowess smooth.]

Normalization: how we “fix” the previous problem
The curved line becomes the new zero line

[Figure: Normalizing: before. y-axis: M (ticks 2, 0, -2, -4); x-axis ticks 6 to 16.]
[Figure: Normalizing: after. y-axis: M normalised (ticks 2, 0, -2, -4); x-axis ticks 6 to 16.]

From a study of the mouse olfactory system
Axonal connectivity between the nose and the mouse olfactory bulb
[Figure, from Buck (2000). Labels: Olfactory Epithelium; VomeroNasal Organ; Main (Auxiliary) Olfactory Bulb; Neocortex; >2M, ~1,800 types.]
Two principles: “zone-to-zone projection”, and “glomerular convergence”

Of interest: the hardwiring of the vertebrate olfactory system
• Expression of a specific odorant receptor gene by an olfactory neuron.
• Targeting and convergence of like axons to specific glomeruli in the olfactory bulb.

The biological question in this case
Are there genes with spatially restricted expression patterns within the olfactory bulb?

Layout of the cDNA Microarrays
• Sequence verified mouse cDNAs
• 19,200 spots in two print groups of 9,600 each
  – 4 x 4 grid, each with 25 x 24 spots
  – Controls on the first 2 rows of each grid.
[Figure: slide layout showing print groups pg1 and pg2.]

Design: How We Sliced Up the Bulb
[Figure: bulb sections labelled A, P, D, V, M, L.]

Design: Two Ways to Do the Comparisons
Goal: 3-D representation of gene expression
• Compare all samples to a common reference sample (e.g., whole bulb) [Figure: P, D, M, A, V, L each linked to R.]
• Multiple direct comparisons between different samples (no common reference) [Figure: P, D, M, A, V, L linked to one another.]

An Important Aspect of Our Design
Different ways of estimating the same contrast, e.g. A compared to P:
    Direct = A - P
    Indirect = (A - M) + (M - P), or (A - D) + (D - P), or
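
The quantification and intensity-dependent normalization steps in the slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example numbers are hypothetical, and the smoother is a single-pass, tricube-weighted local linear fit standing in for the lowess smooth named in the slides (no robustness iterations), not the authors' actual procedure.

```python
import math

def spot_log_ratio(r_fg, r_bg, g_fg, g_bg):
    """Background-corrected log2(Red / Green) for one spot, or None if undefined."""
    red = r_fg - r_bg      # Red intensity = Rfg - Rbg
    green = g_fg - g_bg    # Green intensity = Gfg - Gbg
    if red <= 0 or green <= 0:
        return None        # log ratio is undefined for non-positive intensities
    return math.log2(red / green)

def ma_values(red, green):
    """Return (M, A): M = log2 R - log2 G, A = (log2 R + log2 G) / 2."""
    log_r, log_g = math.log2(red), math.log2(green)
    return log_r - log_g, (log_r + log_g) / 2

def local_linear_fit(a, m, frac=0.5):
    """Tricube-weighted local linear fit of m on a, evaluated at every a[i].

    Simplified stand-in for lowess: one pass, no robustness iterations.
    """
    n = len(a)
    k = min(n, max(2, math.ceil(frac * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in a:
        dists = sorted(abs(x - x0) for x in a)
        h = dists[k - 1] or 1e-12            # bandwidth: k-th smallest distance
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in a]
        sw = sum(w)                          # >= 1, since x0 itself has weight 1
        xbar = sum(wi * x for wi, x in zip(w, a)) / sw
        ybar = sum(wi * y for wi, y in zip(w, m)) / sw
        sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, a))
        sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, a, m))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(ybar + slope * (x0 - xbar))
    return fitted

def normalise(a, m, frac=0.5):
    """Subtract the fitted curve so that it becomes the new zero line."""
    return [y - f for y, f in zip(m, local_linear_fit(a, m, frac))]

# Hypothetical spot: Rfg=1200, Rbg=200, Gfg=600, Gbg=100 gives log2(1000/500).
example_ratio = spot_log_ratio(1200, 200, 600, 100)

# Hypothetical slide whose M values are pure intensity-dependent bias.
a_vals = [float(x) for x in range(6, 16)]
m_vals = [0.5 * x - 3.0 for x in a_vals]
m_norm = normalise(a_vals, m_vals)
```

In this toy case the bias is exactly linear in A, so the normalised M values come out at zero up to rounding; on real slides the fit would only remove the smooth trend and leave the gene-specific scatter.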
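
The direct and indirect estimates of the A versus P contrast can be compared with a small sketch. It assumes, purely for illustration, that each slide yields one log ratio for its pair of samples, that slides are independent, and that every log ratio has the same variance; the numbers and function names below are hypothetical. Under those assumptions the direct estimate has variance sigma^2, an indirect estimate built from two slides has variance 2 * sigma^2, and an inverse-variance weighted average combines them.

```python
def indirect_estimate(m_first, m_second):
    """Indirect contrast via an intermediate sample, e.g. (A - M) + (M - P)."""
    return m_first + m_second

def combine(estimates, variances):
    """Inverse-variance weighted mean of independent estimates, with its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, estimates)) / total
    return mean, 1.0 / total

# Hypothetical log ratios for one gene, one slide per comparison.
m_ap = 1.0   # slide comparing A with P directly
m_am = 0.4   # slide comparing A with M
m_mp = 0.7   # slide comparing M with P

direct = m_ap
indirect = indirect_estimate(m_am, m_mp)

# Variances in units of sigma^2: 1 for the direct path, 2 for the two-slide path.
combined, combined_var = combine([direct, indirect], [1.0, 2.0])
```

With these assumed variances the combined estimate weights the direct path twice as heavily as the indirect one, and its variance is two thirds of sigma^2, smaller than either path alone.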

