Stanford CS 374 - Lecture 6: Networks of Protein Interactions

School name Stanford University

Course CS 374 - Algorithms in Biology

Pages 38


Outline
• Networks of Protein Interactions: Network Alignment
• Recap
• Motivation
• Network Alignment
• Earlier approaches: interologs
• Earlier approaches: PathBLAST
• Earlier approaches: MaWISh
• A General Network Aligner: Goals
• A General Network Aligner: Model
• A General Network Aligner: Scoring
• ESMs: A New Edge-Scoring Paradigm
• A General Network Aligner: Algorithm
• d-Clusters: Intuition
• d-Clusters
• Extending seeds
• Multiple Alignment
• Resulting Alignments
• Comparison to Extant Methods
• Pairwise Full Network
• Pairwise Query-to-Database
• Multiple Alignment (3-way)

Networks of Protein Interactions: Network Alignment
Antal Novak
CS 374, Lecture 6
10/13/2005

Nuke: Scalable and General Pairwise and Multiple Network Alignment
Flannick, Novak, Srinivasan, McAdams, Batzoglou (2005)

Recap
• Network Integration
  • Combine data from multiple sources to obtain robust probabilities of interaction
  • Can be performed in a high-throughput manner
• “Whatcha gonna do with it?” Network alignment!

Motivation
• Sequence alignment seeks to identify conserved DNA or protein sequence
• Intuition: conservation implies functionality
    EFTPPVQAAYQKVVAGV (human)
    DFNPNVQAAFQKVVAGV (pig)
    EFTPPVQAAYQKVVAGV (rabbit)
• By similar intuition, subnetworks conserved across species are likely functional modules

Network Alignment
• “Conserved” means two subgraphs contain proteins serving similar functions, having similar interaction profiles
• Key word is similar, not identical
[Figure label: mismatch/substitution]

Earlier approaches: interologs
• Interactions conserved in orthologs
• Orthology is a fuzzy notion
• Sequence similarity not necessary for conservation of function

Earlier approaches: PathBLAST
Kelley et al (2003)
• Goal: identify conserved pathways (chains)
• Idea: can be done efficiently by dynamic programming if networks are DAGs
[Figure: two aligned paths with A-A’ (match), B (gap), C-X’ (mismatch), D-D’ (match). Score: match + gap + mismatch + match]
• Problem: Networks are neither acyclic nor directed
• Solution: eliminate cycles by imposing random ordering on nodes, perform DP; repeat many times
• In expectation, finds conserved paths of length L within networks of size n in O(L! n) time
• Drawbacks
  • Computationally expensive
  • Restricts search to specific topology
[Figure: a five-node network under random node orderings 1 4 2 3 5, 2 1 4 5 3, 5 2 1 3 4]

Earlier approaches: MaWISh
Koyuturk et al (2004)
• Goal: identify conserved multi-protein complexes (clique-like structures)
• Idea: such structures will likely contain at least one hub (high-degree node)
• Algorithm: start by aligning a pair of homologous hubs, extend greedily
• Efficient running time, but also only solves a specific case

A General Network Aligner: Goals
• Solve restrictions of existing approaches
• Should extend gracefully to multiple alignment
  • PathBLAST was extended to 3-way alignment, but extension scales exponentially in number of species
• Should not restrict search to specific network topologies (cliques/pathways)
• Must be efficient in running time
• Useful application for biologists: given a candidate module, align to a database of networks (“query-to-database”)
[Figure: a query module alongside a database of networks]

A General Network Aligner: Model
• Earlier approaches aligned pairs of nodes
• Instead, alignment as an equivalence relation: equivalence class consists of proteins evolved from a common ancestral protein
• Can contain multiple proteins in same species (paralogs)
• Handles multiple alignment in an obvious way
[Figure: example showing a hypothetical ancestral module, its descendants, and the resulting equivalence classes, with paralogs marked]

A General Network Aligner: Scoring
• Probabilistic scoring of alignments:
  • M : Alignment model (network evolved from a common ancestor)
  • R : Random model (nodes and edges picked at random)
• Nodes and edges scored independently

\[ S = \log \frac{P(\text{nodes} \mid M)}{P(\text{nodes} \mid R)} + \log \frac{P(\text{edges} \mid M)}{P(\text{edges} \mid R)} \]

[Figure: example alignment with node scores 2.5, 4.0, 1.5, 3.0 and edge scores 0.8, 0.4, -0.4, 0.8, 1.2, -0.3, 0.6, 0.5, 0.6, -0.2, giving \( S = S_N + S_E = 11.0 + 4.0 \)]

• Node scores: simple
• Weighted Sum-Of-Pairs (SOP)
  • Each equivalence class scored as sum (over pairs \( n_i, n_j \)) of \( w_{ij} \log P(n_i, n_j) \), where \( w_{ij} \) is weight on phylogenetic tree
[Figure: phylogenetic tree over H. pylori, M. tuberculosis, C. crescentus, and E. coli, with leaves numbered 1-4 and pair weights \( w_{12} = 0.5 \), \( w_{13} = 0.25 \), \( w_{14} = 0.25 \), \( w_{23} = 0.25 \), \( w_{24} = 0.25 \), \( w_{34} = 0.5 \)]

• Alignment model
  • Based on BLAST pairwise sequence alignment scores \( S_{ij} \)
  • Intuition: most proteins descended from common ancestor have sequence similarity
  • \( P_M(n_i, n_j) = P(\text{BLAST score } S_{ij} \mid n_i, n_j \text{ homologous}) \)
• Random model
  • Nodes picked at random
  • \( P_R(n_i, n_j) = P(\text{BLAST score } S_{ij}) \)

• Edge scores: more complicated
• Edge scores in earlier aligners rewarded high edge weights
  • But this biases towards clique-like topology!
• Don’t want solely conservation either
  • This alignment has highly conserved (zero-weight) edges:
[Figure: an alignment whose conserved edges all have zero weight]
• Non-trivial tradeoff in pairwise alignment of full networks

ESMs: A New Edge-Scoring Paradigm
• Idea: assign each node a label from a finite alphabet Σ, and define edge likelihood in terms of labels it connects
• During alignment, assign labels which maximize score
• E: Symmetric matrix of probability distributions, E(x, y) is distribution of edge weights between nodes labeled x and y
• For query-to-database alignment, use a module ESM
• One label for each node in query module
  • Tractable because queries are usually small (~10-40 nodes)
• For each pair of nodes \( (n_i, n_j) \) in query, let E(i, j) be a Gaussian centered at \( c_{ij} \) = weight of \( (n_i, n_j) \) edge
• Multiple alignment gives us more information about conservation
• Can iteratively improve ESM to adjust mean and deviation based on weights of edges between aligned pairs of query nodes
  • Easily implemented using kernel density estimation (KDE)

A General Network Aligner: Algorithm
• Given this model of network alignment and scoring framework, how to efficiently find alignments between a pair of networks (N1, N2)?
• Constructing every possible set of equivalence classes clearly prohibitive

• Idea: seeded alignment
• Inspired by seeded sequence alignment (BLAST)
• Identify regions of network in which “good” alignments likely to be found
  • MaWISh does this, using high-degree

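
The module ESM described for query-to-database alignment can be sketched in the same hedged way. The slides say only that E(i, j) is a Gaussian centered at the query edge weight c_ij. The fixed standard deviation `sigma`, the uniform random model on [0, 1], and the names `build_module_esm` and `edge_log_odds` are all assumptions made for this illustration.

```python
import math


def gaussian_logpdf(x, mean, sigma):
    """Log-density of a normal distribution N(mean, sigma^2) at x."""
    z = (x - mean) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def build_module_esm(query_edges, sigma=0.1):
    """Build a symmetric module ESM from query edge weights.

    query_edges : {(i, j): c_ij}, one entry per pair of query nodes
    sigma       : assumed spread of each Gaussian (not given in the slides)
    Returns {(x, y): (mean, sigma)} with E(x, y) == E(y, x).
    """
    esm = {}
    for (i, j), c_ij in query_edges.items():
        esm[(i, j)] = (c_ij, sigma)
        esm[(j, i)] = (c_ij, sigma)
    return esm


def edge_log_odds(esm, label_x, label_y, weight):
    """Log-odds of an observed edge weight between nodes labeled x and y.

    Assumption: the random model draws edge weights uniformly from [0, 1],
    so its log-density is 0 and the score is just the Gaussian log-density.
    """
    mean, sigma = esm[(label_x, label_y)]
    random_logpdf = 0.0  # log of the uniform density on [0, 1]
    return gaussian_logpdf(weight, mean, sigma) - random_logpdf


# Hypothetical three-node query module with made-up edge weights.
query = {(0, 1): 0.9, (0, 2): 0.2, (1, 2): 0.6}
esm = build_module_esm(query)
score_at_mean = edge_log_odds(esm, 0, 1, 0.9)
score_off_mean = edge_log_odds(esm, 0, 1, 0.5)
```

A database edge whose weight matches the query edge scores highest, and the score falls off as the weights diverge. This rewards conservation of the query's edge weights rather than high edge weights as such, which is the bias the slides want to avoid.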