1Computational GenomicsComputational GenomicsBiological Networks & Biological Networks & Network EvolutionNetwork EvolutionEric XingEric XingLecture 21, April 3, 2007Reading: Expression networksRegulatory networksInteraction networksMetabolic networksNodes – molecules.Links – inteactions / relations.Molecular Networks2Disease Spread[Krebs]Social NetworkFood WebElectronicCircuitInternet[Burch & Cheswick]Other types of networksKEGG database: http://www.genome.ad.jp/kegg/kegg2.htmlMetabolic networks Nodes – metabolites (0.5K). Edges – directed biochemichalreactions (1K). Reflect the cell’s metabolic circuitry.3Barabasi & Oltvai. NRG. (2004) 5 101-113“Graph theoretic description for a simple pathway (catalyzed by Mg2+-dependant enzymes) is illustrated (a). In the most abstract approach (b) all interacting metabolites are considered equally.”Graph theoretic description of metabolic networksProtein Interaction Networks Nodes – proteins (6K). Edges – interactions (15K). Reflect the cell’s machinery and signlaing pathways.4Experimental approachesProtein coIPYeast Two-HybridGraphs and Networksz Graph: a pair of sets G={V,E} where V is a set of nodes, and E is a set of edges that connect 2 elements of V.z Directed, undirected graphsz Large, complex networks are ubiquitous in the world: z Genetic networksz Nervous systemz Social interactionsz World Wide Web5Global topological measuresz Indicate the gross topological structure of the networkConnectivity(Degree)Path length Clustering coefficient[Barabasi]Connectivity Measuresz Node degree: the number of edges incident on the node (number of network neighbors.)z Undetected networks z Degree distribution P(k): probability that a node has degree k.z Directed networks, i.e., transcription regulation networks (TRNs)Incoming degree = 2.1teach gene is regulated by ~2 TFsOutgoing degree = 49.8teach TF targets ~50 genesiDegree of node i = 56zLijis the number of edges in the shortest path between vertices i and jz The characteristic path length of a graph is the average of the Lijfor every possible pair (i,j)z Diameter: maximal distance in the network.z Networks with small values of L are said to have the “small world property”z In a TRN, L’ijrepresents the number of intermediate TFs until final target(, )2ijL =ijCharacteristic path length Path lengthStarting TFFinal target1 intermediate TF= 1Indicate how immediatea regulatory response isAverage path length = 4.7Clustering coefficient z The clustering coefficient of node iis the ratio of the number Eiof edges that exist among its neighbors, over the number of edges that could exist: CI=2TI/nI(nI-1)z The clustering coefficient for the entire network Cis the average of all the CiClustering coefficient4 neighbours1 existing link6 possible links= 1/6 = 0.17Measure how inter-connected the network isAverage coefficient = 0.117A Comparison of Global Network Statistics (Barabasi & Oltvai, 2004)P(k)~k−γ, k >> 1, 2 <γ!)(kkekPkk−=A. Random Networks [Erdos and Rényi (1959, 1960)]B. Scale Free [Price,1965 & Barabasi,1999] C.HierarchialMean path length ~ ln(k)Phase transition:Connected if:p≥ ln( k)/kPreferential attachment. Add proportionally to connectednessMean path length ~ lnln(k)Copy smaller graphs and let them keep their connections.Local network motifsz Regulatory modules within the networkSIM MIM FFLFBL[Alon]8YPR013CHCM1SPO1STB1ECM22[Alon; Horak, Luscombe et al (2002), Genes & Dev, 16: 3017 ]SIM = Single input motifsSBFHCM1SPT21MBF[Alon; Horak, Luscombe et al (2002), Genes & Dev, 16: 3017 ]MIM = Multiple input motifs9SBFYox1Tos8 Plm2Pog1[Alon; Horak, Luscombe et al (2002), Genes & Dev, 16: 3017 ]FFL = Feed-forward loopsMBFSBFTos4[Alon; Horak, Luscombe et al (2002), Genes & Dev, 16: 3017 ]FBL = Feed-back loops10randomlatticeStrogatz S.H., Nature (2001) 410 268What network structure should be used to model a biological network?11111112222223335446781 2 3 4 5 6 7 8degree connectivityfrequencyDegree connectivity distributions:Calculating the degree connectivity of a network11A. fulgidus(archaea)C. elegans(eukaryote)E. coli(bacterium)averaged over 43 organismsJeong et al. Nature (2000) 407 651-654Connectivity distributions for metabolic networks(color of nodes is explained later)\Jeong et al. Nature 411, 41 - 42 (2001)Wagner. RSL (2003) 270 457-466Protein-protein interaction networks12Strogatz S.H., Nature (2001) 410 268log degree connectivitylog frequencyayx=log frequencylog degree connectivityxya=Random versus scaled exponential degree distribution z Degree connectivity distributions differs between random and observed (metabolic and protein-protein interaction) networks.What is so “scale-free” about these networks?z No matter which scale is chosen the same distribution of degrees is observed among nodes13z Erdos-Renyi (1960)z Watts-Strogatz (1998)z Barabasi-Albert (1999)Models for networks of complex topologyz N nodesz Every pair of nodes is connected with probability p.z Mean degree: (N-1)p.z Degree distribution is binomial, concentrated around the mean.z Average distance (Np>1): log Nz Important result: many properties in these graphs appear quite suddenly, at a threshold value of PER(N)z If PER~c/N with c<1, then almost all vertices belong to isolated treesz Cycles of all orders appear at PER ~ 1/NRandom Networks: The Erdős-Rényi [ER] model (1960):14For p=0 (Regular Networks): • high clustering coefficient • high characteristic path lengthFor p=1 (Random Networks): • low clustering coefficient• low characteristic path lengthThe Watts-Strogatz [WS] model (1998)z Start with a regular network with Nverticesz Rewire each edge with probability pz QUESTION: What happens for intermediate values of p?WS model, cont.z There is a broad interval of p for which L is small but C remains largez Small world networks are common :15ER ModelER Model WS Modelactors power gridwww()~PK Kγ−Scale-free networks: The Barabási-Albert [BA] model (1999)z The distribution of degrees:z In real network, the probability of finding a highly connected node decreases exponentially with kz Two problems with the previous models:1. N does not vary2. the probability that two vertices are connected is uniformz The BA model:z Evolution: networks expand continuously by the addition of new vertices, andz Preferential-attachment (rich get richer): new vertices attach preferentially to sites that are already well connected.BA model, cont.16()iijjkkk∏=∑z
View Full Document