DOC PREVIEW
CMU CS 10810 - Lecture

This preview shows page 1-2-14-15-30-31 out of 31 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 31 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1Computational GenomicsComputational GenomicsNetwork AlgorithmsNetwork AlgorithmsEric XingEric XingLecture 22, April 10 & 12, 2007ReadingMining and analyzing networksz Identifying Signaling Pathwaysz color-coding technique (Alon, Yuster and Zwick. 1995) and generalizations (Scott et al. RECOMB 2005)z Identifying Interaction Complexes (clique-like structures)z Statistical subgraph scoring (Sharan et al. RECOMB 2004)z Network alignmentz PathBLAST: identify conserved pathways (Kelley et al 2003)z MaWISh: identify conserved multi-protein complexes (Koyuturk et al 2004)z Nuke: Scalable and General Pairwise and Multiple Network Alignment (Flannick, Novak, Srinivasan, McAdams, Batzoglou 2005)z Network Dynamicsz Sandy: backtracking to find active sub-network (Luscombe et al, Nature 2005)z Node function inferencez Stochastic block models (Aroldi et al, 2006)z Latent space models (Hoff, 2004)z Link predictionz Naïve Bayes classifier, Bayesian networkz MRF2observable observableParameters:timerates, selectionUnobservableEvolutionary PathobservableMRCA-Most Recent Common Ancestor?Time DirectionNetwork evolution3 Problems:1. Test all possible relationships.2. Examine unknown internal states.3. Explore unknown paths between states at nodes.Æ Network alignmentz Sequence alignment seeks to identify conserved DNA or protein sequencez Intuition: conservation implies functionalityEFTPPVQAAYQKVVAGV (human)DFNPNVQAAFQKVVAGV (pig)EFTPPVQAAYQKVVAGV (rabbit)z By similar intuition, subnetworksconserved across species are likely functional modulesMotivation3Network Alignmentz “Conserved” means two subgraphs contain proteins having homologous sequences, serving similar functions, having similar interaction profilesz Key word is similar, not identicalz Product graph:z Nodes: groups of sequence-similar proteins, one per species. z Edges: conserved interactions.mismatch/substitutionProtein groupsConserved interactions∏×=matched ,,,)random|Pr()similar|Pr()'(/)()',(vuvuvuSSCLCLCCLSimilarity (BLAST E-value)Scoring Schemez Given two protein subsets, one in each species, with a many-to-many correspondence between them, we wish:z Each subset induces a dense subgraph.z Matched protein pairs are sequence-similar.z Two hypothesis:z Conserved complex model: matched pairs are similar.z Random model: matched pairs are randomly chosen.4Scoring Scheme cont.z For multiple networks: run into problem of scoring a multiple sequence alignment.z Need to balance edge and vertex terms.z Practical solution: z Sensible threshold for sequence similarity.z Nodes in alignment graph are filtered accordingly.z Node terms are removed from score.PreprocessingInteraction scores: logistic regression on #observations, expression correlation, clustering coeff.Network alignmentSubnetwork searchFiltering & Visualizingp-value<0.01, ≤80% overlapConserved pathsConserved clustersProtein groupsConserved interactionsMultiple Network Alignmentz Two recent algorithms:z ???, Sharan et al. PNAS 2005z Nuke: Flannick, Novak, Srinivasan, McAdams, Batzoglou 20055hypotheticalancestralmoduledescendantsequivalenceclassesNuke: the modelz Example:S=SN+SE= 11.0 + 4.0logP(nodes |M)P(nodes | R)+ logP(edges |M)P(edges | R)2.54.0 1.53.00.80.4-0.40.81.2-0.30.60.50.6-0.2Nuke: Scoringz Probabilistic scoring of alignments:z M : Alignment model (network evolved from a common ancestor)z R : Random model (nodes and edges picked at random)z Nodes and edges scored independently6wijlogP(ni,nj)wijH. pyloriM. tuberculosis C. crescentus2 31E. coli4w12=w13=w14=0.50.250.25 w23= w24= w34=0.250.250.5Nuke: Scoring, cont.z Node scores: simplez Weighted Sum-Of-Pairs (SOP)z Each equivalence class scored as sum (over pairs ni, nj) of, where is weight on phylogenetic treePM(ni,nj)=P(BLAST score Sij| ni,njhomologous)PR(ni,nj) =P(BLAST score Sij)Nuke: Scoring, cont.z Alignment modelz Based on BLAST pairwise sequence alignment scores Sijz Intuition: most proteins descended from common ancestor have sequence similarityz Random modelz Nodes picked at random7Non-trivial tradeoff in pairwise alignment of full networksNon-trivial tradeoff in pairwise alignment of full networksNuke: Scoring, cont.z Edge scores: more complicatedz Edge scores in earlier aligners rewarded high edge weightsz But this biases towards clique-like topology!z Don’t want solely conservation eitherz This alignment has highly conserved (zero-weight) edges:ESMs: A New Edge-Scoring Paradigmz Idea: assign each node a label from a finite alphabet ∑, and define edge likelihood in terms of labels it connectsz During alignment, assign labels which maximize scorez E: Symmetric matrix of probability distributions, E(x, y) is distribution of edge weights between nodes labeled xand y8ESMs: A New Edge-Scoring Paradigmz Idea: assign each node a label from a finite alphabet ∑, and define edge likelihood in terms of labels it connectsz During alignment, assign labels which maximize scorez E: Symmetric matrix of probability distributions, E(x, y) is distribution of edge weights between nodes labeled xand yz Simplest case is clique ESMz 1x1 matrix: ∑ contains a single labelz Duplicates edge-scoring of aligners which search for cliquesESMs: A New Edge-Scoring Paradigmz For query-to-database alignment, use a module ESMz One label for each node in query modulez Tractable because queries are usually small (~10-40 nodes)z For each pair of nodes (ni, nj) in query, let E(i, j) be a Gaussian centered at cij= weight of (ni, nj) edge9ESMs: A New Edge-Scoring Paradigmz Multiple alignment gives us more information about conservationz Can iteratively improve ESM to adjust mean and deviation based on weights of edges between aligned pairs of query nodesz Easily implemented using kernel density estimation (KDE)A General Network Aligner: Algorithmz Given this model of network alignment and scoring framework, how to efficiently find alignments between a pair of networks (N1, N2)?z Constructing every possible set of equivalence classes clearly prohibitive10SeedExtendA General Network Aligner: Algorithmz Idea: seeded alignmentz Inspired by seeded sequence alignment (BLAST)z Identify regions of network in which “good” alignments likely to be foundz MaWISh does this, using high-degree nodes for seedsz Can we avoid such strong topological constraints?d-Clusters: Intuitionz “Good” alignments typically have:z a significant number of nodes with high sequence similarityz Implied by the node scoring function, which prefers aligning


View Full Document

CMU CS 10810 - Lecture

Download Lecture
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?