MIT 6 006 - Lecture Notes

Kellis slides
Indyk slides
Demaine slides

Quick intro to the MIT CompBio Group

Who we are
Daniel Marbach, Mike Lin, Jason Ernst, Jessica Wu, Rachel Sealfon, Pouya Kheradpour, Manolis Kellis, Chris Bristow, Loyal Goff, Irwin Jungreis, Sushmita Roy, Luke Ward, Louisa DiStefano, Dave Hendrix, Angela Yen, Ben Holmes, Soheil Feizi, Mukul Bansal, Bob Altshuler, Stefan Washietl
Locations: Stata4, Stata3

What we do: Research synopsis
• Why biology in a computer science group?
• Fundamental biological questions:
  1. Interpreting the human genome.
  2. Revealing the logic of gene regulation.
  3. Principles of evolutionary change.
• Algorithmic/machine learning methods:
  – Comparative genomics: evolutionary signatures
  – Regulatory genomics: motifs, networks, models
  – Epigenomics: chromatin states, dynamics, disease
  – Phylogenomics: evolution at the genome scale
• Defining characteristics of our group:
  – Learn genomic rules, exploit nature of problems
  – Interdisciplinary collaborations, high biology impact

(1) Comparative genomics: evolutionary signatures
Protein-coding signatures
• 1000s new coding exons
• Translational readthrough
• Overlapping constraints
Non-coding RNA signatures
• Novel structural families
• Targeting, editing, stability
• Structures in coding exons
microRNA signatures
• Novel/expanded miR families
• miR/miR* arm cooperation
• Sense/anti-sense switches
Regulatory motif signatures
• Systematic motif discovery
• Regulatory motif instances
• TF/miRNA target networks
• Single binding-site resolution

(2) Regulatory genomics: circuits, predictive models
• Initial annotation of the non-coding genome, from 20% to 70%
• Systems biology for an animal genome for the first time possible
• Students and postdocs are co-first authors, leadership roles
Predictive models of gene regulation
• Infer networks
• Predict function
• Predict regulators
• Predict gene expression
ENCODE/modENCODE
• 4-year effort, dozens of experimental labs
• Integrative analysis
• Systematic genome annotation
• Flagship NIH project

(3) Phylogenomics: Bayesian gene-tree reconstruction
Two components of gene evolution (generative model)
1. Family rate: F_j ~ gamma(α, β). Selective pressures on gene function.
2. Species-specific rates: S_i ~ normal(μ_i, σ_i). Population dynamics of the species.
New phylogenomic pipeline (Bayesian formulation)
• Inputs: alignment data D, species-level parameters θ
• Inferred: length l, topology T, reconciliation R
• Branch length prior: learned F_j, S_i distributions
• Topology prior: birth-death process
• Sequence likelihood: HKY model (traditional)

(4) Vignette on Epigenomics
Using chromatin information to understand human diseases
Jason Ernst, Pouya Kheradpour

Challenge of data integration in many marks/cells
• Dozens of chromatin tracks
  – Understand their function
  – Reveal their combinations
  – Annotate systematically
• Our approach: learn common chromatin states
  – Explicitly model combinations
  – Unsupervised approach, probabilistic model
[Figure: construct antibodies, pull down chromatin, ChIP-seq tracks. Labels: histones, histone tails, histone tail modifications (marks).]

Our approach: Multivariate Hidden Markov Model (HMM)
[Figure: DNA with a TSS, an enhancer, and a transcribed region, divided into 200 base pair intervals. Observed: binarized chromatin marks (H3K4me3, H3K4me1, H3K27ac, H3K36me3), called based on a Poisson distribution. Unobserved: the most likely hidden state (numbered 1 to 6) in each interval. Each state is shown with its high-probability chromatin marks. All probabilities are learned from the data.]
• Binarization leads to explicit modeling of mark combinations and interpretable parameters
• Emission distribution is a product of independent Bernoulli random variables
Ernst and Kellis, Nat Biotech 2010

From 'chromatin marks' to 'chromatin states'
• Learn de novo significant combinations of chromatin marks
• Reveal functional elements, even without looking at sequence
• Use for genome annotation
• Use for studying regulation dynamics in different cell types
[Figure: state groups: promoter states, transcribed states, active intergenic, repressed.]
Ernst and Kellis, Nat Biotech 2010

ENCODE: Study nine marks in nine human cell lines
9 marks: H3K4me1, H3K4me2, H3K4me3, H3K27ac, H3K9ac, H3K27me3, H4K20me1, H3K36me3, CTCF (+WCE, +RNA)
9 human cell types:
• HUVEC: Umbilical vein endothelial
• NHEK: Keratinocytes
• GM12878: Lymphoblastoid
• K562: Myelogenous leukemia
• HepG2: Liver carcinoma
• NHLF: Normal human lung fibroblast
• HMEC: Mammary epithelial cell
• HSMM: Skeletal muscle myoblasts
• H1: Embryonic
9 marks × 9 cell types = 81 chromatin mark tracks (2^81 combinations)
• Learned jointly across cell types (virtual concatenation)
• State definitions are common
• State locations are dynamic
Ernst et al, Nature 2011
Brad Bernstein ENCODE Chromatin Group

Chromatin states dynamics across nine cell types
• Single annotation track for each cell type
• Summarize cell-type activity at a glance
• Can study 9-cell activity pattern across
[Figure labels: correlated activity, predicted linking.]

Multi-cell activity profiles and their correlations
[Figure: profiles across HUVEC, NHEK, GM12878, K562, HepG2, NHLF, HMEC, HSMM, H1. Panels: gene expression (ON/OFF), chromatin states (active enhancer/repressed), active TF motif enrichment (motif enrichment/motif depletion), TF regulator expression (TF on/TF off), dip-aligned motif biases (motif aligned/flat profile).]
• Chromatin state & gene expression → link enhancers and target genes
• TF motif enrichment & TF expression → reveal activators / repressors

Coordinated activity reveals activators/repressors
• Ex1: Oct4 predicted activator of embryonic stem (ES) cells
• Ex2: Gfi1 repressor of K562/GM cells
• Enhancer networks: Regulator → enhancer → target gene
[Figure labels: activity signatures for each TF, enhancer activity.]

Revisiting disease-associated variants
• Disease-associated SNPs enriched for enhancers in relevant cell types
• E.g. lupus SNP in GM enhancer disrupts Ets1 predicted activator

Title | Author/Journal | Total # SNPs | Fold | Cell type | # SNPs in enhancers | FDR
Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium. | Ganesh et al, Nat Genet 2009 | 35 | 17 | K562 | 9 | 0.02
Biological, clinical and population relevance of 95 loci for blood lipids | Teslovich et al, Nature 2010 | 101 | 11 | HepG2 | 13 | 0.02
Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci | Stahl et al, Nat Genet 2010 | 29 | 15 | GM12878 | 7 | 0.03
Genome-wide meta-analyses identify three loci associated with primary biliary cirrhosis | Liu et al, Nat Genet 2010 | 6 | 41 | GM12878 | 4 | 0.03
Chinese Han population identifies nine new susceptibility loci for systemic lupus erythematosus. | Han et al, Nat Genet 2009 | 18 | 21 | GM12878 | 6 | 0.03
Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or
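
Returning to the multivariate HMM slide: the two modeling statements there (marks are binarized using a Poisson distribution, and the emission distribution is a product of independent Bernoulli variables) can be illustrated with a small sketch. This is not the authors' implementation. The function names, the background rate `lam`, the p-value cutoff `p_threshold`, and the example state probabilities are all assumptions chosen for illustration.

```python
import math


def poisson_tail(count, lam):
    """Return P(X >= count) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(count))
    return 1.0 - below


def binarize(counts, lam, p_threshold=1e-4):
    """Call a mark present (1) in an interval when its read count would be
    unlikely under a Poisson background with rate lam.
    Both lam and p_threshold are illustrative assumptions."""
    return [1 if poisson_tail(c, lam) < p_threshold else 0 for c in counts]


def emission_prob(observed, mark_probs):
    """Probability of a binary mark vector in one hidden state:
    a product of independent Bernoulli terms, one per mark."""
    prob = 1.0
    for x, p in zip(observed, mark_probs):
        prob *= p if x == 1 else 1.0 - p
    return prob


# Hypothetical read counts for one mark in three 200 bp intervals.
calls = binarize([0, 2, 15], lam=2.0)

# Hypothetical state with per-mark presence probabilities for
# (H3K4me3, H3K4me1, H3K36me3); the values are made up for the example.
promoter_like = [0.9, 0.2, 0.1]
p = emission_prob([1, 0, 0], promoter_like)
```

Because each mark contributes its own Bernoulli factor, every state is summarized by one probability per mark, which is what makes the learned parameters easy to interpret.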

