Berkeley STATISTICS 246 - Gene expression

This preview shows page 1-2-17-18-19-35-36 out of 36 pages.


Gene expression
Statistics 246, Week 3

Thesis: the analysis of gene expression data is going to be big in 21st century statistics.

Many different technologies, including
  High-density nylon membrane arrays
  Serial analysis of gene expression (SAGE)
  Short oligonucleotide arrays (Affymetrix)
  Long oligo arrays (Agilent)
  Fibre optic arrays (Illumina)
  cDNA arrays (Brown/Botstein)

[Figure: Total microarray articles indexed in Medline. Number of papers (0 to 600) by year, 1995 to 2001 (projected).]

Common themes
  Parallel approach to collection of very large amounts of data (by biological standards)
  Sophisticated instrumentation, requires some understanding
  Systematic features of the data are at least as important as the random ones
  Often more like industrial process than single-investigator lab research
  Integration of many data types: clinical, genetic, molecular, databases

Biological background
  Transcription
  DNA:   G T A A T C C T C
         C A T T A G G A G
  RNA polymerase
  mRNA:  G U A A U C C

Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein might be better, but is currently harder.

Reverse transcription
  Clone cDNA strands, complementary to the mRNA.
  mRNA:  G U A A U C C U C
  Reverse transcriptase
  [Figure: many cDNA copies complementary to the mRNA]

cDNA microarray experiments
  mRNA levels compared in many different contexts:
  Different tissues, same organism (brain v. liver)
  Same tissue, same organism (ttt v. ctl, tumor v. non-tumor)
  Same tissue, different organisms (wt v. ko, tg, or mutant)
  Time course experiments (effect of ttt, development)
  Other special designs (e.g. to detect spatial patterns)

cDNA microarrays
  [Figure: cDNA clones]

cDNA microarrays
  Compare the genetic expression in two samples of cells.
  PRINT: cDNA from one gene on each spot
  SAMPLES: cDNA labelled red/green, e.g. treatment/control, normal/tumor tissue
  HYBRIDIZE: add equal amounts of labelled cDNA samples to microarray
  SCAN: laser, detector

[Flowchart]
  Biological question (differentially expressed genes, sample class prediction, etc.)
  -> Experimental design
  -> Microarray experiment
  -> 16-bit TIFF files
  -> Image analysis
  -> (Rfg, Rbg), (Gfg, Gbg)
  -> Normalization
  -> R, G
  -> Estimation, Testing, Clustering, Discrimination
  -> Biological verification and interpretation

Some statistical questions
  Image analysis: addressing, segmenting, quantifying
  Normalisation: within and between slides
  Quality: of images, of spots, of (log) ratios
  Which genes are (relatively) up/down regulated?
  Assigning p-values to tests / confidence to results

Some statistical questions, ctd
  Planning of experiments: design, sample size
  Discrimination and allocation of samples
  Clustering, classification: of samples, of genes
  Selection of genes relevant to any given analysis
  Analysis of time course, factorial and other special experiments
  ... and much more

Some bioinformatic questions
  Connecting spots to databases, e.g. to sequence, structure, and pathway databases
  Discovering short sequences regulating sets of genes: direct and inverse methods
  Relating expression profiles to structure and function, e.g. protein localisation
  Identifying novel biochemical or signalling pathways
  ... and much more

[Figure: Part of the image of one channel, false-coloured on a white (v. high), red (high), through yellow and green (medium), to blue (low) and black scale.]

Does one size fit all?

Segmentation: limitation of the fixed circle method
  [Figure panels: SRG; Fixed Circle]
  Inside the boundary is spot (foreground), outside is not.

Some local backgrounds
  [Figure: single channel, grey scale]
  We use something different again: a smaller, less variable value.

Quantification of expression
  For each spot on the slide we calculate
    Red intensity   = Rfg - Rbg   (fg = foreground, bg = background)
  and
    Green intensity = Gfg - Gbg
  and combine them in the log (base 2) ratio
    log2(Red intensity / Green intensity)

Gene Expression Data
  On p genes for n slides: p is O(10,000), n is O(10-100), but growing.

  (Values as they appear in this preview; any minus signs were not preserved.)

           Slides
  Genes    slide 1   slide 2   slide 3   slide 4   slide 5
    1       0.46      0.30      0.80      1.51      0.46
    2       0.10      0.49      0.24      0.06      0.90
    3       0.15      0.74      0.04      0.10      0.20
    4       0.45      1.03      0.79      0.56      0.32
    5       0.06      1.06      1.35      1.09      1.09

  Gene expression level of gene 5 in slide 4 = log2(Red intensity / Green intensity)

  These values are conventionally displayed on a red (> 0), yellow (0), green (< 0) scale.

The red/green ratios can be spatially biased
  [Figure: top 2.5% of ratios red, bottom 2.5% of ratios green]

The red/green ratios can be intensity-biased
  M = log2(R/G) = log2(R) - log2(G)
  Values should scatter about zero.
  Horizontal axis: (log2(R) + log2(G)) / 2

Normalization: how we "fix" the previous problem
  The curved line becomes the new zero line.
  Orange: Schadt-Wong rank invariant set
  Yellow: GAPDH, tubulin
  Light blue: MSP pool / titration
  Red line: lowess smooth

[Figure: Normalizing, before. Vertical axis: M.]
[Figure: Normalizing, after. Vertical axis: M normalised.]

From a study of the mouse olfactory system
  [Figure labels: Main (Auxiliary) Olfactory Bulb; VomeroNasal Organ; Olfactory Epithelium. From Buck (2000).]

Axonal connectivity between the nose and the mouse olfactory bulb
  [Figure labels: 2M; 1 800; types; Neocortex]
  Two principles: zone-to-zone projection and glomerular convergence.

Of interest: the hardwiring of the vertebrate olfactory system
  Expression of a specific odorant receptor gene by an olfactory neuron
  Targeting and convergence of like axons to specific glomeruli in the olfactory bulb

The biological question in this case
  Are there genes with spatially restricted expression patterns within the olfactory bulb?

Layout of the cDNA Microarrays
  Sequence-verified mouse cDNAs
  19,200 spots in two print groups of 9,600 each
  4 x 4 grid, each with 25 x 24 spots
  Controls on the first 2 rows of each grid
  [Figure labels: 77; pg1; pg2]

Design: How We Sliced Up the Bulb
  [Figure labels: A, P, D, L, V, M]

Design: Two Ways to Do the Comparisons
  Goal: 3-D representation of gene expression
  Compare all samples to a common reference sample (e.g. whole bulb)
  Multiple direct comparisons between different samples (no common reference)
  [Figure labels: A, L, V, R, D, M, P]

An Important Aspect of Our Design
  Different ways of estimating the same contrast, e.g. A compared to P:
    Direct:   A - P
    Indirect: (A - M) + (M - P)
              or (A - D) + (D - P)
              or -(L - A) - (P - L)
  [Figure labels: A, D, M, L, V, P]
  How do we combine these?
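
The slides leave the combining question open. As one possible illustration only (not necessarily the approach taken in the lecture), the sketch below averages a direct estimate of A - P with indirect ones using inverse-variance weights. It assumes every slide gives an unbiased log-ratio with the same variance, that the paths use disjoint slides so the estimates are independent, and that each log-ratio has already been oriented so a path sums to A - P. The function name `combine_contrast` and its arguments are hypothetical.

```python
def combine_contrast(direct, indirect_paths):
    """Inverse-variance weighted estimate of a contrast such as A - P.

    direct: log-ratio from one slide comparing A with P (variance sigma^2).
    indirect_paths: list of paths; each path is a list of log-ratios that
        sum to A - P, e.g. [a_minus_m, m_minus_p]. Under the equal-variance,
        independent-slides assumption a path of k slides has variance
        k * sigma^2, so it gets weight 1 / k relative to the direct slide.
    """
    estimates = [direct]
    weights = [1.0]
    for path in indirect_paths:
        estimates.append(sum(path))
        weights.append(1.0 / len(path))
    total_weight = sum(weights)
    return sum(w * e for w, e in zip(weights, estimates)) / total_weight


# Hypothetical log-ratios for a single gene (illustrative numbers only).
direct_ap = 1.0                  # slide comparing A with P
via_m = [0.4, 0.9]               # (A - M), (M - P)
combined = combine_contrast(direct_ap, [via_m])
```

With one direct slide and one two-step path, the direct estimate gets twice the weight of the indirect one. When indirect paths share slides, the independence assumption fails and a least-squares fit over all slides would be the more appropriate tool.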
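
The quantification and normalization slides can be sketched in a few lines. The first function follows the slide formulas: background-corrected red and green intensities, M = log2(R/G), and the average log intensity (labelled `a` here, a name not used in the slides). The second function is a deliberately crude stand-in for the lowess smooth: it subtracts the median of M within bins of similar average intensity. The function names, the bin count, and the choice to return `None` for non-positive corrected intensities are all assumptions made for illustration.

```python
import math
from statistics import median


def log_ratio(r_fg, r_bg, g_fg, g_bg):
    """Return (M, a) for one spot, or None if a corrected intensity is <= 0."""
    red = r_fg - r_bg      # Red intensity = Rfg - Rbg
    green = g_fg - g_bg    # Green intensity = Gfg - Gbg
    if red <= 0 or green <= 0:
        return None        # log undefined; handling such spots is a separate choice
    m = math.log2(red) - math.log2(green)
    a = (math.log2(red) + math.log2(green)) / 2
    return m, a


def normalize_by_intensity_bins(ma_pairs, n_bins=4):
    """Subtract the per-bin median of M, with bins formed by rank of a.

    Stand-in for a lowess fit: the bin medians play the role of the
    curved line that becomes the new zero line.
    """
    order = sorted(range(len(ma_pairs)), key=lambda i: ma_pairs[i][1])
    normalized = [None] * len(ma_pairs)
    size = max(1, math.ceil(len(order) / n_bins))
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        centre = median(ma_pairs[i][0] for i in idx)
        for i in idx:
            normalized[i] = ma_pairs[i][0] - centre
    return normalized


# Illustrative (M, a) pairs whose M drifts upward with intensity.
pairs = [(0.5, 6.0), (0.7, 7.0), (1.5, 12.0), (1.7, 13.0)]
m_normalised = normalize_by_intensity_bins(pairs, n_bins=2)
```

After the bin medians are removed, the normalised M values scatter about zero in each intensity range, which is the effect the before/after plots illustrate. A real analysis would use a smooth curve rather than step-wise bin medians.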

