Unformatted text preview:

Biological Networks CMSC 423 BISCI 648a Suggested Reading Paper Lists http www cs umd edu class fall2007 cmsc858l http www cs umd edu class fall2009 cmsc858l papers html Reviews 1 2 3 4 5 Zhu Gerstein Snyder Getting connected analysis and principles of biological networks Genes Dev 21 1010 1024 2007 Shoemaker and Panchenko Deciphering Protein Protein Interactions Part I Experimental Techniques and Databases PLoS Comp Biol 3 3 e42 2007 Kelley Sharan Karp Sittler Root Stockwell Ideker Conserved pathways within bacteria and yeast as revealed by global protein network alignment Proc Natl Acad Sci USA 100 11394 11399 2003 Sharan Ulitsky Shamir Network based prediction of protein function Mol Syst Biol 3 2007 Milo Shen Orr Itzkovitz Kashtan Chklovskii Alon Network motifs simple building blocks of complex networks Science 298 824 827 2002 Types of Biological Networks Yeast transcription network Yeast Phosphorylation network E coli metabolic network Yeast Protein Protein interaction network Yeast SSL network Zhu et al 2007 Transcription network aka regulatory network Transcription Factors proteins that bind to DNA to activate or repress the nearby downstream genes the regulated gene might also be a transcription factor regulates gene gene leads to a directed graph Protein Protein interaction network Proteins physically interact A B Edge A B Assumption of binary interactions is imperfect Sometimes several proteins must bind simultaneously for there to be any interaction could be modeled as a hyperedge Metabolic network Proteins are enzymes They label the edges and their substrates are the nodes Valine Leucine and Isoleucine biosynthesis from KEGG Experimental Interaction Protein Yeast aka Saccharomyces cerevisiae interaction network Two hybrid GRID http thebiogrid org 8 742 edges 3113 nodes proteins out of 6 000 genes Human interaction network Two hybrid 6 434 edges 4 083 nodes out of 25 000 genes Why proteins interact A A B E D E B C C D A B C Signal Transduction Bring chains of enzymes together A B C D Tethered Signal Transduction Form structures From Analysis of Biological Networks Junker and Schreiber eds How do we extract biological insight from these graphs 1 What role does each protein play in the cell 2 How do we uncover the true graph from noisy samples 3 How do we compare interaction graphs across species 4 What are the characteristics of interaction graphs 1 Function Prediction Proteins with known function network topology function assignment for unknown proteins Guilt by association Simple Majority Rule Doesn t take into account connections between neighbors Or annotations at distance one Neighboring Proteins More Likely to Share Function Yu et al 2008 2 Predicting Missing Edges Completing Defective Cliques Yu et al 2006 P Q P Q both adjacent to all nodes in clique there are two n 1 cliques that overlap by n 2 nodes likely that P Q should be adjacent 3 Aligning Networks Let G1 V1 E1 G2 Gk be graphs each giving noisy experimental estimations of interactions between proteins in organisms 1 k If Gi Vi Ei we also have a function sim u v Vi Vj R that gives the sequence similarity between u and v G1 G2 PathBLAST Kelley et al 2003 PathBLAST Alignment Graph Nodes correspond to homologous pairs A a where A is from one species and a is from the other Edges come in 3 types Direct A B and a b interactions are present Gap Edge A B is present and a b are separated by 2 hops Mismatch Both A B and a b are both separated by 2 hops Kelley et al 2003 PathBLAST Scoring Function Pr interaction p Ev H estimated from Ev distributions in COG p v probability that proteins in v are really homologs p v Pr Homology Ev S P v P P is a path in the alignment graph log 0 1 2 0 3 3 0 9 q e Pr i i e q e product of interaction edges contained within the alignment edge sum over logs product over scores 1 p v prandom e P log q e qrandom prandom and qrandom are the average values of p v and q e in the graph PathBLAST Search Procedure If G is directed acyclic DAG then its easy to find a high scoring path via dynamic programming S v L max scoring path of length L that ends at v p v q u v S v L arg max S u L 1 log log prandom qrandom u pred v Because G is not directed acyclic they randomly create a large number of DAGs by removing edges as follows 1 Randomly rank vertices 2 Direct edges from low to high rank Run dynamic program on the random DAGs and take the highest scoring path 2 L chance that a path will be preserved So repeat 5L times H pylori S cerevisiae Find several 50 high scoring paths Then remove those edges vertices and repeat Overlay the identified paths Revealed 5 conserved pathways Contains proteins from both DNA polymerase and Proteosome evidence that they interact Kelley et al 2003 H pylori S cerevisiae Find several 50 high scoring paths Then remove those edges vertices and repeat Overlay the identified paths Revealed 5 conserved pathways Contains proteins from both DNA polymerase and Proteosome evidence that they interact Kelley et al 2003 H pylori S cerevisiae Find several 50 high scoring paths Then remove those edges vertices and repeat Overlay the identified paths Revealed 5 conserved pathways Contains proteins from both DNA polymerase and Proteosome evidence that they interact Kelley et al 2003 Some Notes Goal use a well studied organism yeast to learn about a lessstudied organism H pylori There were only 7 directly shared edges between yeast H pylori you would expect 2 5 shared edges Within conserved pathways proteins often were not paired with the protein with the most similar sequence Gap mismatch edges were essential 22 of the proteins in previous figure did not pair with their best sequence match Single pathways in bacteria often correspond to multiple pathways in yeast Yeast is suspected of having undergone multiple whole genome duplications 4 Over represented Network Motifs Are there connection patterns that occur frequently A Frequently more often than you d expect in a random graph with the same degree distribution Milo et al 2002 found the feed forward motif over represented in gene regulation networks Di culty is larger motifs subgraph isomorphism is NP hard B C Experimental Techniques to Determine Protein Interactions Slow accurate costly X ray crystallography NMR High throughput but noisy Yeast Two Hybrid TAP MS tandem affinity purification mass spec Yeast Two Hybrid Inside Yeast Transcription Factor GAL4 DNA DNA Binding Domain Domain functional evolutionary conserved unit of a


View Full Document

UMD CMSC 423 - Biological Networks

Documents in this Course
Midterm

Midterm

8 pages

Lecture 7

Lecture 7

15 pages

Load more
Loading Unlocking...
Login

Join to view Biological Networks and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Biological Networks and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?