MIT HST 950J - Introduction to Systems Biology


Modeling and Reverse Engineering Genetic Networks - Introduction to Systems Biology

Zoltan Szallasi, M.D.
Children’s Hospital Informatics Program, Harvard Medical School, Boston, MA

Harvard-MIT Division of Health Sciences and Technology
HST.950J: Medical Computing
Peter Szolovits, PhD
Isaac Kohane, MD, PhD
Lucila Ohno-Machado, MD, PhD

Goals of science:
- Predictive power
- Understanding
- Intellectual entertainment
- Playground for adults

Modeling:
1. Conceptual framework / data about the system
2. Model structure (mathematical)
3. Pick the best model - parameter fitting
4. Model validation

The conceptual framework of genetic network analysis

Example: Independently regulated derivatives of the c-jun gene

Genetic network - heterogeneous network of interacting variables
The cell (the experimental unit) is a network of “gene derivatives” (mRNA, protein) and other biochemical entities.

Biological parameters: a biological parameter can be defined as a biochemical entity that:
- can be measured
- is chemically (rather) homogeneous
- determines, by itself or in combination with something else, the state of another biological parameter.

How many biological parameters?
Cautious estimate: on the order of 1-2 x 10^5
- 10,000-20,000 active genes per cell
- < 3 posttranslational modifications/protein in yeast
- 3-6 (?) posttranslational modifications/protein in humans
The number of biological parameters is probably less than 10 times the number of genes.
Splice variants < > modules
Compartmentalization (!!!)

Reverse engineering of genetic regulatory networks

The more you know about the system
- regulatory architecture/topology
- actual parameters, etc.
... the easier it is.
Even if you have a complete regulatory architecture, you need to do some parameter fitting/testing.

The Principle of Reverse Engineering of Genetic Regulatory Networks (deterministic view):
Determine a set of regulatory rules that can produce the gene expression pattern at T2 given the gene expression pattern at the previous time point T1.

$$x_i(t+1) = g\Big(b_i + \sum_j w_{ij}\, x_j(t)\Big)$$

[Figure: gene expression patterns at time t and t+1]

Continuous modeling (variations on a theme):

$$x_i(t+1) = g\Big(b_i + \sum_j w_{ij}\, x_j(t)\Big)$$

This is the basic assumption of most continuous approaches (Mjolsness et al., 1991 - connectionist model; Weaver et al., 1999 - weight matrix model; D’Haeseleer et al., 1999 - linear model; Wahde & Hertz, 1999 - coarse-grained reverse engineering).

$$g(z) = \frac{1}{1 + e^{-kz}}$$

The aim is to determine all the b_i and w_ij values.
- you need as many equations as variables
1. Genetic algorithms (Wahde & Hertz, 1999)
2. Solving weight matrices (singular value decomposition etc.) (Weaver et al., 1999)
3. Least-squares fit for the linear modeling (D’Haeseleer et al., 1999)

Correlation matrices (see Arkin, Shen & Ross, 1997):
If a chemical reaction takes 1 unit of time, then the B → A reaction will be a more likely candidate than the C → A reaction to explain the time-dependent changes in the figure above.

Correlation matrices (Arkin, Shen & Ross, 1997):
A time-lagged correlation matrix can be prepared based on the equations:

$$S_{ij}(\tau) = \big\langle [x_i(t) - \bar{x}_i]\,[x_j(t+\tau) - \bar{x}_j] \big\rangle \qquad (1)$$

$$r_{ij}(\tau) = \frac{S_{ij}(\tau)}{\sqrt{S_{ii}(\tau)\, S_{jj}(\tau)}} \qquad (2)$$

<...> : time average over all the measurements
x_i(t) : t-th time point of the time series generated for species i
\bar{x}_i : time average of the i-th time series
How much does a change in the level of species i correlate with a change τ time later in the level of species j?

How much information is needed for reverse engineering?
Model                                               Information needed
Boolean, fully connected                            2^N
Boolean, connectivity K                             K 2^K log(N)
Boolean, connectivity K, linearly separable rules   K log(N/K)
Pairwise correlation                                log(N)

N = number of genes
K = average regulatory input/gene
r unknown parameters in a set of ODEs: 2r+1 (Sontag, 2002)

P = K log(N/K) (John Hertz, Nordita)
P: gene expression states
N: size of network
K: average number of regulatory interactions

1. Stochasticity (??????)
2. Size of network: N_bic < 10 x N_gen gives about a 1.2-fold increase in P (but definitely less than 2)
3. Connectivity (compartmentalization): it will make things easier (it can reduce P)
4. Information content is 1-2 orders of magnitude less: 10-100 fold increase in P.

The useful information content of a gene expression matrix will depend on:
1. Measurement error (conceptual and technical limitations, such as normalization)
2. Kinetics of gene expression level changes (lack of sharp switch on/off kinetics - stochasticity?)
3. Number of genes changing their expression level.
4. The time frame of the experiment.
Applying all this to cell-cycle-dependent gene expression measurements by cDNA microarray, one can obtain 1-2 orders of magnitude less information than expected in an ideal situation. (Szallasi, 1998)

Reverse engineering using perturbations
Perturbations on time- and population-averaged measurements
Wagner, A. (2001)
Ideker, T. …. Hood, L. (2001)

Perturbation matrix (columns: knock-out; rows: gene expression)
     A  B  C
A    0  1  1
B    0  0  1
C    0  0  0

Accessibility matrix (columns: regulator; rows: regulated)
     A  B  C
A    0  0  0
B    1  0  0
C    1  1  0

[Figure: two network diagrams over genes A, B, C]

Start with an already known topology if you can:
Ideker et al. (2001) – update the knowledge
N. Friedman, Hartemink: Bayesian view of the network
- May work well on subnetworks
- USE prior knowledge of topology!
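
The relation between direct regulation and the accessibility matrix in the perturbation example can be sketched in a few lines. This is a minimal illustration, not the procedure of Wagner (2001): it assumes the underlying network is the chain A → B → C (a reading of the diagram residue, not something the text states), and the function name `accessibility` and the row/column convention (rows: regulated gene, columns: regulator) are illustrative choices. Under those assumptions, the accessibility matrix is the transitive closure of the direct-regulation matrix.

```python
def accessibility(adj):
    """Transitive closure (Warshall's algorithm) of a direct-regulation matrix.

    adj[i][j] == 1 means gene j directly regulates gene i
    (rows: regulated, columns: regulator).
    """
    n = len(adj)
    acc = [row[:] for row in adj]  # copy so the input is not modified
    for k in range(n):             # allow gene k as an intermediate step
        for i in range(n):
            for j in range(n):
                if acc[i][k] and acc[k][j]:
                    acc[i][j] = 1  # j reaches i through k
    return acc


# Assumed chain A -> B -> C, with genes ordered A, B, C.
direct = [
    [0, 0, 0],  # A has no regulator
    [1, 0, 0],  # B is regulated by A
    [0, 1, 0],  # C is regulated by B
]

for gene, row in zip("ABC", accessibility(direct)):
    print(gene, row)
```

With the assumed chain, the closure adds the indirect effect of A on C, so the rows come out as 0 0 0, 1 0 0 and 1 1 0, matching the accessibility matrix listed in the example.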
[Figure: nodes A, B, C; labels P=0.9 and P=0.7]

Genetic network modeling & systems biology

The size of the network:
- Small scale - a few genes (N=1-3)
- Intermediate scale (N=10-100)
- Ensemble approaches (N=1000-100000)
Principle of interactions between genes:
- stochastic
- continuous differential eq.
- step functions/Boolean networks

Small-scale genetic networks:
Detailed computational and experimental analysis of a few genes
- Becskei & Serrano (2000) - stability of feedback loops
- Elowitz & Leibler (2000) - synthetic oscillatory network
- Gardner …. J. Collins (2000) - genetic toggle switch

Does a feedback loop stabilize gene expression levels? Becskei & Serrano

Intermediate-scale genetic networks:
Computational analysis of a 5 to 100 gene network (protein networks)
- Schoeberl et al. (2002) - EGF receptor pathway
- Smith et al. (2002) - analysis of the Ran regulated nucleocytoplasmic transport

1) Overall topology of the network
2) Kinetic and other parameters

Virtual cell
Does the model produce time series results that fit the data?
Is the model robust? How sensitive is it to the initial setting of parameters?
Can the model produce useful and testable hypotheses?

Further uses of studying robustness: Eldar …. Barkai (2002)

Comments – Suggestions:
1) Organize the model in a flexible way:
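
Returning to the continuous model from the reverse-engineering slides, the update rule x_i(t+1) = g(b_i + Σ_j w_ij x_j(t)) with the sigmoid g(z) = 1/(1 + e^(-kz)) can be sketched as follows. This is a minimal illustration only: the function names `g` and `step`, the two-gene weight matrix, the biases and the initial state are all hypothetical values chosen for the example, not parameters from any of the cited papers.

```python
import math


def g(z, k=1.0):
    """Sigmoid squashing function g(z) = 1 / (1 + exp(-k z))."""
    return 1.0 / (1.0 + math.exp(-k * z))


def step(x, w, b, k=1.0):
    """One synchronous update: x_i(t+1) = g(b_i + sum_j w_ij * x_j(t))."""
    n = len(x)
    return [g(b[i] + sum(w[i][j] * x[j] for j in range(n)), k) for i in range(n)]


# Hypothetical two-gene network: gene 1 represses gene 0, gene 0 activates gene 1.
w = [[0.0, -2.0],
     [3.0, 0.0]]
b = [0.5, -1.0]
x = [0.2, 0.8]  # expression levels scaled to the interval (0, 1)

for t in range(1, 4):
    x = step(x, w, b)
    print(t, [round(v, 3) for v in x])
```

Reverse engineering runs this in the opposite direction: given measured x(t) and x(t+1), the b_i and w_ij are the unknowns to be fitted.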

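
The time-lagged correlation of Arkin, Shen & Ross (1997) described earlier can be illustrated with a short sketch. It computes S_ij(τ) as in equation (1), but it normalizes by the zero-lag variances S_ii(0) and S_jj(0) rather than by the lagged terms in equation (2); that substitution is an assumption made here so the square root is always defined. The function names and the toy series are hypothetical.

```python
import math


def lagged_cov(xi, xj, tau):
    """S_ij(tau): time average of [x_i(t) - mean_i] * [x_j(t + tau) - mean_j]."""
    mi = sum(xi) / len(xi)
    mj = sum(xj) / len(xj)
    n = len(xi) - tau  # number of (t, t + tau) pairs available
    return sum((xi[t] - mi) * (xj[t + tau] - mj) for t in range(n)) / n


def lagged_corr(xi, xj, tau):
    """r_ij(tau), normalized by zero-lag variances (an assumption, see text)."""
    return lagged_cov(xi, xj, tau) / math.sqrt(
        lagged_cov(xi, xi, 0) * lagged_cov(xj, xj, 0)
    )


# Toy data: species j repeats species i one time step later.
xi = [0, 1, 0, -1, 0, 1, 0, -1]
xj = [-1, 0, 1, 0, -1, 0, 1, 0]

for tau in (0, 1):
    print(tau, round(lagged_corr(xi, xj, tau), 3))
```

In this toy series the correlation is zero at τ = 0 and large at τ = 1, which is the kind of signal that would favor a reaction from species i to species j taking one unit of time.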

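
The claim that a tenfold larger network raises the required number of expression states P = K log(N/K) by much less than a factor of 2 can be checked with a small calculation. This is a sketch under stated assumptions: the values N = 10,000 and K = 10 are illustrative, not taken from the slides, and the function names are hypothetical. The ratio does not depend on the base of the logarithm.

```python
import math


def required_states(n_genes, k):
    """P = K * log(N / K), up to a constant factor."""
    return k * math.log(n_genes / k)


def fold_increase(n_genes, k, factor):
    """Ratio of P for a network `factor` times larger to P for the original."""
    return required_states(factor * n_genes, k) / required_states(n_genes, k)


# Assumed values: 10,000 genes, 10 regulatory inputs per gene, 10x more parameters.
print(round(fold_increase(10_000, 10, 10), 3))
```

For these assumed values the ratio is log(10^4)/log(10^3) = 4/3, in line with the estimate of a modest increase that stays well below 2.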