Duplication Models for Biological Networks

Fan Chung*, Linyuan Lu*, T. Gregory Dewey† and David J. Galas†+

*Department of Mathematics, University of California at San Diego, La Jolla, California 92093
†Keck Graduate Institute of Applied Life Sciences, 535 Watson Drive, Claremont, California 91711

Appeared in Journal of Computational Biology, 10, no. 5 (2003), 677-688.

+ Corresponding author, [email protected], phone: 909 607-7487, fax: 909 607-8598

Figure 2: A graph, shown on the left, can be described as a set of orbits (middle diagram) connected as shown on the right. C is the adjacency matrix of the right graph; the occupation numbers form a vector. The rightmost panel shows how the vertices can be grouped into the 4 orbits of the graph, so the occupation-number vector is simply (4, 4, 1, 2).

Abstract

Are biological networks different from other large complex networks? Both large biological and non-biological networks exhibit power-law graphs (number of nodes with degree k, $N(k) \sim k^{-\beta}$), yet the exponents, $\beta$, fall into different ranges. This may be because duplication of the information in the genome is a dominant evolutionary force in shaping biological networks (like gene regulatory networks and protein-protein interaction networks), and is fundamentally different from the mechanisms thought to dominate the growth of most non-biological networks (such as the internet). The preferential choice models for non-biological networks like web graphs can only produce power-law graphs with exponents greater than 2. We use combinatorial probabilistic methods to examine the evolution of graphs by duplication processes and derive exact analytical relationships between the exponent of the power law and the parameters of the model. Both full duplication of nodes (with all their connections) and partial duplication (with only some connections) are analyzed.
We demonstrate that partial duplication can produce power-law graphs with exponents less than 2, consistent with current data on biological networks. The power-law exponent for large graphs depends only on the growth process, not on the starting graph.

Networks of interactions are fundamental to all biological systems. The interactions among species in ecosystems, the interactions between cells in an organism, and those among molecules in a cell are all parts of complex biological networks. There is considerable current interest in networks within the cell (genetic regulatory networks and protein-protein interaction networks, in particular), about which we can now acquire extensive data using new technological advances. The duplication of the information in the genome (genes and their controlling elements) is a central force in evolution and should be determinative of biological networks.

The process of duplication is quite different from the mechanisms thought to dominate the growth of most non-biological networks (such as the internet, social or citation networks [Barabasi et al., 1999, Barabasi et al., 1999, Albert et al., 1999, Albert et al., 1999, Lu 2001]), which involve the simple addition of nodes with preferential connection to existing nodes. These latter processes only produce power-law graphs with exponents greater than 2 [Barabasi et al., 1999, Barabasi et al., 1999, Albert et al., 1999, Albert et al., 1999, Strogatz 2001, Aiello et al., 2000]. A power-law graph is one in which the number of nodes of degree k (the number of edges impinging on a vertex), N(k), has a distribution that follows a power law: $N(k) \sim k^{-\beta}$.
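
To make the power-law definition concrete, the following minimal Python sketch tallies N(k) from an edge list and estimates the exponent (written β here) as the negative slope of a least-squares line through the points (log k, log N(k)). The function names `degree_counts` and `fit_exponent` are hypothetical, and the log-log regression is a crude illustrative estimator chosen for brevity, not the method used in the paper.

```python
import math
from collections import Counter


def degree_counts(edges):
    """Return {k: N(k)}: how many nodes of an undirected graph have degree k."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def fit_exponent(counts):
    """Estimate beta in N(k) ~ k^(-beta) by least squares on log-log points.

    Assumption: a plain regression of log N(k) on log k. This is a rough
    estimate and can be biased for noisy empirical degree data.
    """
    pts = [(math.log(k), math.log(n)) for k, n in counts.items() if k > 0 and n > 0]
    if len(pts) < 2:
        raise ValueError("need at least two distinct degrees to fit a slope")
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return -sxy / sxx  # slope is -beta, so negate it
```

Under this definition, a fitted value between 1 and 2 would correspond to the range reported for biological networks, while a value above 2 would correspond to the range produced by preferential-connection models.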
We present new mathematical results here on the evolution of graphs by different duplication processes. Using a combinatorial-probabilistic approach, we analyze both the full duplication of nodes (with all their connections) and partial duplication (with only some of their connections). We find that full duplication retains a strong “memory” of the starting graph: certain topological properties of the starting graph are conserved under duplication. Partial duplication breaks the parent-daughter symmetry of the process, which induces non-conservation of this property and causes some “memory” of the starting graph to be lost. We find that full duplication does not produce power-law graphs, but partial duplication does. For partial duplication the power-law exponent depends, as the graph grows without bound, only on the growth process, and not on the starting graph.

A survey of existing results on scaling of large networks shows a striking difference between biological and non-biological networks. Biological networks often have exponents that are between 1 and 2; that is, $1 < \beta < 2$ [Aiello et al., 2000]. The non-biological networks, on the other hand, have exponents that commonly range from 2 to 4 or more (see Table I). While it is difficult to draw a strong conclusion from this limited observation, it does raise the question as to whether biological networks evolve differently. The non-biological networks have been convincingly modeled with preferential accretion of nodes [Barabasi et al., 1999, Barabasi et al., 1999, Albert et al., 1999], but these models cannot explain exponents of power-law graphs less than 2.

A seminal idea in molecular evolution is that through gene duplication biological information is co-opted, or “reused”, for different purposes [Ohno 1970]. This notion recognizes that the information in biomolecules, selected over hundreds of millions of years, represents a rich starting point for many useful modifications.
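
As a rough illustration of the full and partial duplication processes discussed above, the sketch below grows a graph by repeatedly copying a node. The exact rule is an assumption, since the text so far does not spell it out: at each step a parent is chosen uniformly at random, and each of its edges is copied to the new node independently with a retention probability `p` (a hypothetical parameter name), so `p = 1.0` gives full duplication and `p < 1.0` gives partial duplication. The function name `grow_by_duplication` is likewise hypothetical.

```python
import random


def grow_by_duplication(seed_edges, steps, p, seed=0):
    """Grow an undirected graph by node duplication; return {node: set(neighbors)}.

    Assumed rule (illustrative, not quoted from the paper): pick a parent
    uniformly at random, add a child, and copy each parent edge to the child
    independently with probability p. Seed node labels are assumed to be ints.
    """
    rng = random.Random(seed)
    adj = {}
    for u, v in seed_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = list(adj)          # list form allows uniform sampling of a parent
    next_id = max(adj) + 1     # fresh integer label for each new node
    for _ in range(steps):
        parent = rng.choice(nodes)
        child = next_id
        next_id += 1
        # Each inherited edge survives with probability p (always when p == 1.0).
        kept = {w for w in adj[parent] if rng.random() < p}
        adj[child] = kept
        for w in kept:
            adj[w].add(child)
        nodes.append(child)
    return adj
```

Starting from a single edge with `p = 1.0`, every graph this sketch produces is complete bipartite, which is one simple way to see a starting-graph property being conserved under full duplication; with `p < 1.0` the child and parent no longer share the same neighborhood.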
This “reuse” occurs by the duplication and subsequent mutation of genetic elements, including both genes and cis-regulatory sequences. The recent availability of genomic sequence information from a wide range of organisms provides abundant evidence of the widespread occurrence of gene duplication and of the validity of the early hypotheses of Ohno and others. There is strong evidence, for example, that the genome of the first eukaryote ever sequenced, that of a model organism, the yeast Saccharomyces cerevisiae (baker’s yeast), is the result of an almost complete genome duplication in the distant past [Stubbs 2002, Friedman et al., 2002]. There

