Graduate Computational Genomics
02-710 / 10-810 & MSCBIO2070
Modeling Regulatory Networks (part II)
Takis Benos
Lecture #25a, April 24, 2007
Reading: handouts & papers

Benos 02-710/MSCBIO2070 24-APR-2007

GRNs: the data sources
Gene expression
• Microarrays
• SAGE
• Real-time RT-PCR
ChIP-chip
• Promoter arrays
• Tiling arrays

The study that changed everything

Lee et al.: the method

Homologous recombination

Lee et al.: the results

Lee et al.: the results (cntd)

Gao et al.: combining ChIP-chip and microarray data

Gao et al.: the method

Gao et al.: results

Gao et al.: results (cntd)

Gao et al.: results (cntd)

Gao et al.: results (cntd)

Gao et al.: Conclusions
• A new method that combines both ChIP and gene expression data to assess transcription regulation.
• Half of the TF targets resulting from the ChIP assay (p < 10^-3) are non-functional.
(why?)
  • ChIP false positives
  • Competition with other factors
  • Chromatin structure

For some reading
• Lee et al., "Transcriptional regulatory networks in Saccharomyces cerevisiae" (2002) Science 298: 799-804.
• Gao et al., "Defining transcriptional networks through integrative modeling of mRNA expression and transcription factor binding data" (2004) BMC Bioinformatics 5: 31.

Graduate Computational Genomics
02-710 / 10-810 & MSCBIO2070
Evolution of regulatory regions and binding motifs
Takis Benos
Lecture #25b, April 24, 2007
Reading: handouts & papers

Familial binding profiles (FBP)
Source: Sandelin & Wasserman (2004) J. Mol. Biol. 338, 207-215

Comparing motifs

SOMBRERO & FBP Priors
Start: BP-SOM trained on 257 mammalian PSSMs (incl. REL)
Finish: SOMBRERO used to find motifs of NF-κB (REL-class).

Binding Profile SOM (BP-SOM)

SOMBRERO: results

BLAST-like motif searches
Diagram labels: Input motif; Motif DB; List of motif "hits"

The tools: distance metrics
Similarity metrics and their formulas, for two motif columns X and Y:

Pearson Correlation Coefficient (PCC)
$$\mathrm{PCC}(X,Y) = \frac{\sum_{b=A}^{T} \left(f_X(b) - \bar{f}_X\right)\left(f_Y(b) - \bar{f}_Y\right)}{\sqrt{\sum_{b=A}^{T} \left(f_X(b) - \bar{f}_X\right)^2 \cdot \sum_{b=A}^{T} \left(f_Y(b) - \bar{f}_Y\right)^2}}$$

Chi-square (pCS): 1 - p-value of $\chi^2_3$
$$\chi^2_3(X,Y) = \sum_{K \in \{X,Y\}} \sum_{b=A}^{T} \frac{\left(n_K(b) - n_K^e(b)\right)^2}{n_K^e(b)}$$

Average Kullback-Leibler (AKL)
$$\mathrm{AKL}(X,Y) = 10 - \frac{\sum_{b=A}^{T} f_X(b) \log\frac{f_X(b)}{f_Y(b)} + \sum_{b=A}^{T} f_Y(b) \log\frac{f_Y(b)}{f_X(b)}}{2}$$

Sum of squared distances (SSD)
$$\mathrm{SSD}(X,Y) = 2 - \sum_{b=A}^{T} \left(f_X(b) - f_Y(b)\right)^2$$

Average Log-likelihood Ratio (ALLR)
$$\mathrm{ALLR}(X,Y) = \frac{\sum_{b=A}^{T} n_X(b) \log\frac{f_Y(b)}{p_{ref}(b)} + \sum_{b=A}^{T} n_Y(b) \log\frac{f_X(b)}{p_{ref}(b)}}{\sum_{b=A}^{T} \left(n_X(b) + n_Y(b)\right)}$$

ALLR with Lower Limit (ALLR_LL)
Same as above, but a lower limit of -2 is imposed on the score (see text).

Distance metrics (cntd)

Distance metrics (cntd)
Column I: A, C, G, T with frequencies f_I(A), f_I(C), f_I(G), f_I(T)
Uniform column: A, C, G, T with frequencies 0.25, 0.25, 0.25, 0.25
$$H = -\sum_{b=A}^{T} f_I(b) \log_2 f_I(b)$$

Distance metrics (cntd)

The tools: alignments and trees
Alignment methods:
• Needleman-Wunsch
• Smith-Waterman
Tree-building methods:
• UPGMA
• SOTA

Motif alignments

Alignment algorithm  Similarity metric  Gap open  JASPAR non-ZNF  JASPAR ZNF  TRANSFAC non-ZNF  TRANSFAC ZNF  Avg
SW                   SSD                1.00      0.845           0.480       0.833             0.788         0.811
SW (ovrlp)           SSD                1.00      0.845           0.480       0.833             0.788         0.811
SW                   SSD                0.75      0.845           0.440       0.831             0.794         0.809
SW (ovrlp)           SSD                0.75      0.845           0.440       0.831             0.794         0.809
SW                   SSD                0.50      0.845           0.520       0.824             0.794         0.808
SW (ovrlp)           SSD                0.50      0.845           0.520       0.824             0.794         0.808
SW                   PCC                1.50      0.845           0.520       0.817             0.800         0.805
SW                   SSD                0.25      0.859           0.560       0.808             0.806         0.804
SW (ovrlp)           SSD                0.25      0.859           0.560       0.808             0.806         0.804
SW                   PCC                1.00      0.831           0.600       0.815             0.788         0.802
SW (ovrlp)           PCC                1.00      0.831           0.600       0.812             0.788         0.801
SW (ungap)           PCC                N/A       0.887           0.600       0.812             0.763         0.801
SW                   PCC                1.50      0.845           0.520       0.810             0.800         0.801
NW                   SSD                1000      0.859           0.480       0.817             0.781         0.801
SW                   SSD                1000      0.859           0.480       0.817             0.781         0.801
…                    …                  …         …               …           …                 …             …
NW                   pCS                0.75      0.662           0.600       0.657             0.675         0.660
NW                   pCS                1.00      0.662           0.440       0.653             0.688         0.654
NW                   pCS                0.50      0.690           0.560       0.653             0.650         0.652
NW                   pCS                0.25      0.634           0.640       0.650             0.656         0.650
NW                   ALLR               5.00      0.634           0.480       0.648             0.619         0.633
NW                   AKL                4.00      0.704           0.400       0.594             0.600         0.600
NW                   pCS                2.00      0.563           0.400       0.596             0.656         0.600
NW                   SSD                1.00      0.634           0.480       0.585             0.581         0.585
NW                   PCC                1.50      0.606           0.520       0.580             0.581         0.581
NW                   PCC                2.00      0.521           0.480       0.423             0.500         0.453
NW                   ALLR               10.00
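
The column-to-column similarity metrics listed under "The tools: distance metrics" can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the lecture or the cited papers. The function names, the pseudocount of 0.25, the use of natural logarithms in AKL and ALLR, the uniform background p_ref, and the choice to return 0 for PCC when a column has zero variance are all assumptions made here. The example counts are hypothetical.

```python
import math

# Base order used for every list below: A, C, G, T.

def frequencies(counts, pseudocount=0.25):
    """Turn the base counts of one motif column into frequencies.
    The pseudocount (an assumption) keeps every frequency above zero."""
    total = sum(counts) + 4 * pseudocount
    return [(n + pseudocount) / total for n in counts]

def pcc(fx, fy):
    """Pearson correlation coefficient between two frequency columns."""
    mx, my = sum(fx) / 4, sum(fy) / 4
    num = sum((x - mx) * (y - my) for x, y in zip(fx, fy))
    den = math.sqrt(sum((x - mx) ** 2 for x in fx) *
                    sum((y - my) ** 2 for y in fy))
    # Assumption: a zero-variance (e.g. uniform) column scores 0.
    return num / den if den > 0 else 0.0

def ssd(fx, fy):
    """Sum of squared distances, reported as 2 minus the sum."""
    return 2.0 - sum((x - y) ** 2 for x, y in zip(fx, fy))

def akl(fx, fy):
    """Average Kullback-Leibler, reported as 10 minus the average.
    Requires strictly positive frequencies."""
    kl_xy = sum(x * math.log(x / y) for x, y in zip(fx, fy))
    kl_yx = sum(y * math.log(y / x) for x, y in zip(fx, fy))
    return 10.0 - (kl_xy + kl_yx) / 2.0

def allr(nx, ny, fx, fy, pref=(0.25, 0.25, 0.25, 0.25), lower_limit=None):
    """Average log-likelihood ratio of two columns (counts nx, ny).
    Passing lower_limit=-2.0 gives the ALLR_LL variant."""
    num = sum(n * math.log(f / p) for n, f, p in zip(nx, fy, pref))
    num += sum(n * math.log(f / p) for n, f, p in zip(ny, fx, pref))
    score = num / (sum(nx) + sum(ny))
    return score if lower_limit is None else max(score, lower_limit)

def entropy(f):
    """Shannon entropy H of one column, in bits."""
    return -sum(p * math.log2(p) for p in f if p > 0)

# Hypothetical counts for two aligned motif columns.
nx, ny = [8, 1, 1, 0], [7, 2, 0, 1]
fx, fy = frequencies(nx), frequencies(ny)
print("PCC    ", pcc(fx, fy))
print("SSD    ", ssd(fx, fy))
print("AKL    ", akl(fx, fy))
print("ALLR   ", allr(nx, ny, fx, fy))
print("ALLR_LL", allr(nx, ny, fx, fy, lower_limit=-2.0))
print("H(X)   ", entropy(fx))
```

All of these score a single pair of columns. A motif-to-motif score would then combine the column scores along an alignment such as the Needleman-Wunsch or Smith-Waterman alignments named above. The pCS metric is left out because it needs a chi-square p-value, which the Python standard library does not provide.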