Comparison of Networks Across Species
CS374 Presentation, October 26, 2006
Chuan Sheng Foo

In the beginning there was DNA…
Liolios K, Tavernarakis N, Hugenholtz P, Kyrpides NC. The Genomes On Line Database (GOLD) v.2: a monitor of genome projects worldwide. NAR 34, D332-334

…then came protein interactions
Arabidopsis PPI network
E. coli PPI network
Yeast PPI network

Comparative Genomics to Comparative Interactomics
Evolutionary conservation implies functional relevance
Sequence conservation implies functional conservation
Network conservation implies functional conservation too!
What new insights might we gain from network comparisons?
(Why should we care?)

Network comparisons allow us to:
Identify conserved functional modules
Query for a module, à la BLAST
Predict functions of a module
Predict protein functions
Validate protein interactions
Predict protein interactions
Only possible with network comparisons
Possible with existing techniques, but improved with network comparisons

What is a Protein Interaction Network?
Proteins are nodes
Interactions are edges
Edges may have weights
Yeast PPI network
H. Jeong et al. Lethality and centrality in protein networks. Nature 411, 41 (2001)

The Network Alignment Problem
Given k different protein interaction networks belonging to different species, we wish to find conserved sub-networks within these networks.
Conserved in terms of protein sequence similarity (node similarity) and interaction similarity (network topology similarity).

Example Network Alignment
Sharan and Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, pp. 427-433, 2006

General Framework For Network Alignment Algorithms
Sharan and Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, pp. 427-433, 2006
Network construction
Scoring function
Alignment algorithm
Covered in lecture on network integration

Two Algorithms Discussed Today
NetworkBLAST
Sharan et al. Conserved patterns of protein interaction in multiple species. PNAS, 102(6):1974-1979, 2005.
Græmlin
Flannick et al. Græmlin: General and robust alignment of multiple large interaction networks. Genome Res 16: 1169-1181, 2006.

Overview of
Sharan et al. Conserved patterns of protein interaction in multiple species. PNAS, 102(6):1974-1979, 2005.

Estimation of Interaction Probabilities
In the preprocessing step, edges in the network are given a reliability score using a logistic regression model based on three features:
1. Number of times an interaction was observed
2. Pearson correlation coefficient between expression profiles
3. Proteins' small-world clustering coefficient

Network Alignment Graphs
Construct a Network Alignment Graph to represent the alignment
Nodes contain groups of sequence-similar proteins from the k organisms
Edges represent conserved interactions.
An edge between two nodes is present if:
1. One pair of proteins directly interacts, the rest are distance at most 2 away
2. All protein pairs are of distance exactly 2
3. At least max(2, k - 1) protein pairs directly interact
Tries to account for interaction deletions

Example Network Alignment Graph
[Figure: network alignment graph (nodes a, b, c, a', b', c', a'', b'', c'') alongside the individual PPI networks of Species X, Species Y, and Species Z]

Scoring Function
Sharan et al. devise a scoring scheme based on a likelihood model for the fit of a single sub-network to the given structure
High-scoring subgraphs correspond to structured sub-networks (cliques or pathways)
Only network topology is scored; node similarity is not

Log Likelihood Ratio Model
Measures the likelihood that a subgraph occurs if it is a conserved network vs. the likelihood that it occurs if it were a randomly constructed network
Randomly constructed network preserves degree distribution for nodes
$\log \frac{\Pr(\text{Subgraph occurs} \mid \text{Conserved Network})}{\Pr(\text{Subgraph occurs} \mid \text{Random Network})}$

Likelihood Ratio Scoring of a Protein Complex in a Single Species
U : a subset of vertices (proteins) in the PPI graph
O_U : collection of all observations on vertex pairs in U
O_uv : interaction between proteins u, v observed
M_s : conserved network model
M_n : random network (null) model
T_uv : proteins u, v interact
F_uv : proteins u, v do not interact
β : probability that proteins u, v interact in conserved model
p_uv : probability that edge u, v exists in a random model
Probability of complex being observed in a conserved network model: [equation not recovered from the slide]
Probability of subgraph being observed in a random network model: [equation not recovered from the slide]

Likelihood Ratio Scoring of a Protein Complex in a Single Species
Hence, log likelihood for a complex occurring in a single species is given by [equation not recovered from the slide]
For multiple complexes across different species, it is the sum of the log likelihoods
L(A, B, C) = L(A) + L(B) + L(C)

Example of Complex Scoring
[Figure: conserved complex A in the network alignment graph (nodes a, b, c, a', b', c', a'', b'', c'') alongside the individual species' PPI networks]
L(A) = L(X1) + L(Y1) + L(Z1)
Complex X1 in Species X
Complex Y1 in Species Y
Complex Z1 in Species Z

Alignment algorithm
Problem of identifying conserved sub-networks reduces to finding high
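The single-species complex score described under "Likelihood Ratio Scoring of a Protein Complex in a Single Species" can be sketched in Python. The slide's equations are not preserved in this copy, so the form used here is an assumption reconstructed from the symbols the slide defines: each vertex pair (u, v) in U contributes log[(β·Pr(O_uv | T_uv) + (1 - β)·Pr(O_uv | F_uv)) / (p_uv·Pr(O_uv | T_uv) + (1 - p_uv)·Pr(O_uv | F_uv))], and the contributions are summed. The function names, the dictionary layout, the choice of unordered pairs, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def pair_log_ratio(pr_obs_given_true, pr_obs_given_false, beta, p_uv):
    """Log-likelihood ratio contributed by one vertex pair (assumed form)."""
    # Likelihood of the observation under the conserved model M_s.
    conserved = beta * pr_obs_given_true + (1.0 - beta) * pr_obs_given_false
    # Likelihood of the observation under the random (null) model M_n.
    random_model = p_uv * pr_obs_given_true + (1.0 - p_uv) * pr_obs_given_false
    return math.log(conserved / random_model)


def complex_score(vertices, pair_data, beta):
    """Sum pair contributions over all unordered vertex pairs in U.

    pair_data maps frozenset({u, v}) to a tuple
    (Pr(O_uv | T_uv), Pr(O_uv | F_uv), p_uv); this layout is hypothetical.
    """
    total = 0.0
    for u, v in combinations(sorted(vertices), 2):
        pr_true, pr_false, p_uv = pair_data[frozenset((u, v))]
        total += pair_log_ratio(pr_true, pr_false, beta, p_uv)
    return total


def alignment_score(per_species_scores):
    """Score across species, as on the slide: L(A) = L(X1) + L(Y1) + L(Z1)."""
    return sum(per_species_scores)


# Toy data (made up): two well-supported interactions and one poorly supported.
BETA = 0.8
toy_pairs = {
    frozenset(("a", "b")): (0.9, 0.1, 0.05),
    frozenset(("a", "c")): (0.9, 0.1, 0.05),
    frozenset(("b", "c")): (0.2, 0.8, 0.05),
}
score_x1 = complex_score({"a", "b", "c"}, toy_pairs, BETA)
print(score_x1)
```

Under this reading, a pair whose observations are better explained by the conserved model than by the degree-preserving random model contributes a positive term, and a poorly supported pair contributes a negative one, so dense, well-supported subgraphs score highest.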
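The three edge conditions listed under "Network Alignment Graphs" can be sketched as a predicate over per-species distances. This is a minimal illustration, not the NetworkBLAST implementation: the function name `has_alignment_edge` and its input format (one shortest-path distance per species, measured between the two proteins that the pair of alignment nodes contains for that species) are assumptions, and condition 1 is read here as "at least one pair interacts directly".

```python
def has_alignment_edge(distances):
    """Decide whether two alignment-graph nodes should be joined by an edge.

    distances[i] is the shortest-path distance, in species i's PPI network,
    between the two proteins that the nodes contain for that species
    (1 means a direct interaction). The input format is hypothetical.
    """
    k = len(distances)
    direct = sum(1 for d in distances if d == 1)

    # Condition 1: one pair interacts directly, the rest are at distance <= 2.
    cond1 = direct >= 1 and all(d <= 2 for d in distances)
    # Condition 2: every pair is at distance exactly 2.
    cond2 = all(d == 2 for d in distances)
    # Condition 3: at least max(2, k - 1) pairs interact directly.
    cond3 = direct >= max(2, k - 1)
    return cond1 or cond2 or cond3


# Three species; distances are made up for illustration.
print(has_alignment_edge([1, 2, 2]))  # satisfies condition 1
print(has_alignment_edge([1, 3, 3]))  # satisfies none of the conditions
```

Allowing distance-2 pairs is what lets the alignment graph tolerate an interaction that is missing in one species, which matches the slide's remark that the rule tries to account for interaction deletions.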