Scientific paper Lin-log Model of E. coli Central Metabolism __V Ana Tu{ek and @elimir Kurtanjek* University of Zagreb, Faculty of Food Technology and Biotechnology, Pierottijeva 6, 10000 Zagreb, Croatia * Corresponding author: E-mail: zkurt@pbf.hr Fax: +385 1 4836 083, Tel: +385 1 4605 294 Received: 19-06-2009 Dedicated to the memory of the late Prof. Dr. Valentin Koloini Abstract Mathematical models of dynamics of metabolic pathways are used for analysis of complex regulations of biochemical reactions as an intrinsic property of a metabolism. The models are derived under assumptions of kinetic rate functions and usually result in simplification in view of the model theoretical scope and/or its practical application. The main obstacle in kinetic modeling is the dimensionality of the parametric space, its nonlinearity and ill-conditioned relations for kinetic parameter estimation. In this work these problems are effectively resolved by use of an approximate linear-logarithmic (Lin-log) applied in analysis of regulation of Escherichia coli central metabolism. Complex multiplicative Mic-haelis-Menten kinetic rate expressions are transformed into simple in parameter linear functions and non-linear logarithmic dependencies on concentrations of substrates, and cofactors. The Lin-log kinetic rates enable direct estimation of rate elasticities which are the key parameters in metabolic control analysis (MCA). Due to in the parameter linearity, the estimation problem is solved in a non-iterative least square algorithm. Applied is singular value decomposition (SVD) algorithm for system matrix pseudoinversion with the eigenvalue cut-off threshold at 0.01. The results are presented as parameters of enzyme activities and reaction elasticities. Evaluated activities and elasticities provided insight into the fluxes regulation. Comparison of the simulation results by Lin-log and Michaelis-Menten model reveals that errors are of the same order of magnitude. Keywords: Lin-log kinetics, E. coli central metabolism, metabolic control analysis, rate elasticity 1. Introduction Biochemical networks of industrial microorganism metabolisms are very complex and highly regulated. For better understanding of the structure, regulation and interconnectivity of metabolic networks, mathematical models of biochemical system need to be developed. A mathematical model of a metabolic network includes reaction rates for each reaction, kinetic parameters of reactions and mass balances for each metabolite based on stoichiome-tric matrix. In metabolic control analysis (MCA) the kinetic models are used for prediction of changes of enzyme activities or enzyme concentrations on specific fluxes and metabolite level.1 Most of kinetic models are based on differential equations which are solved with appropriate numerical integration method during simulations of perturbed unsteady metabolism.2 Mass balances of metabolites can be solved in time course (dynamic models) or in steady state (state in which metabolite concentrations are at constant level).3 Kinetic parameters of enzyme catalyzed reactions are usually based on experimental data gathered by tests on purified enzymes in test tubes (in vitro experiments). However, conditions in living cells are usually much more complex compared to those in the test tubes what leads in un-appropriate models based on in vitro experimental data. To overcome the problem in modeling and MCA analysis experiments are conducted by rapid high throughput on-line metabolite intracellular concentration measurements. In order to automate kinetic modeling in simultaneously with experiment automation, from a view of the general Biochemical Systems Theory (BST) derived are several "new" enzyme kinetic models which are amenable for modeling automation.4 For example, some of them are: linear model, logarithmic-linear, linear-logarithmic (Lin-log), power law Generalized Mass Action (GMA), power law Synergy Systems (S-systems), thermokinetic,).1 The Lin-log model has important theoretical background based on thermodynamic concept of steady state in reaction systems.5 Upon perturbation of a steady state rates driving force of reactions are proportional to reaction affinities which are linear functions of concentrations of reacting species. Onsager derived the so-called phenomenological relations in which reaction rates are proportional to corresponding affinities, i.e. proportional to logarithms of reactant concentrations.6 This is an asymptotic result which is theoretically valid only for infinitesimal perturbations around an equilibrium, but in applications has been extended to analysis of transient experiments with finite perturbations. From practical (computational) point of view, advantage of approximate kinetics is that it allows, in some cases, analytical solutions of network fluxes and metabolite concentration as a function of enzyme level.1,3 Approximate kinetics has some other desired properties: the conversion rate is proportional to the corresponding enzyme level, minimum number of parameters are used, and analytical solution of steady state balances are possible. These advantages enable simplified metabolic control analysis, such as determination of flux "bottlenecks" and possible network reorganization and optimization. In this work are presented attempts to apply lin-log kinetic formalism for transient data far from equilibrium obtained in a fully automated on-line intracellular response of E.coli central metabolism upon glucose impulse.7,8 This experiment is performed far from a cell steady state, i.e. under drastic changes from a glucose starvation to a step impulse in glucose saturation. The results have modeled by complex Michaelis-Menten kinetics, D. Dege-nring,7,8 and improved model was optimized by AI methods (Genetic Algorithm, Differential Evolution, Simulated Annealing), S. Ceric and Z. Kurtanjek.9 The model is given by 24 highly regulated biochemical reactions, 132 kinetic parameters, describing the glucose consumption for energy and cell building blocks. It includes glycolysis, pentose phosphate pathway, Ent-ner-Doudoroff pathway and synthesis of cell building blocks. Reaction rates included into the model are described with the multiplicative Michaelis-Menten kinetics: (1) were c, c1 and c2 are metabolites, vmax is maximal reaction rate and K are half saturation constants for the metabolites. 1. 2. Lin-log Kinetics The rate of enzyme catalyzed reaction is in general proportional to enzyme activity e, dependent intracellular metabolites and cofactors x, and independent extracellular metabolite c concentrations and kinetics parameters K.1,10 This dependency is highly non-linear and can be approximated by linear-logarithmic (Lin-log) kinetic rate in a simple, in parameter linear maner:11,12 (3) where n is the number of dependent and m is number of independent concentrations. The rate is proportional to the enzyme level (e), while (a, pi and c;) are termed as the Lin-log kinetic parameters. In eq. (3) dependent and independent metabolite concentrations are given by xi and ci. In the Lin-log format all the rate equations have the same mathematical structure: proportionality of rate of enzyme activity, non-linearity in concentrations and linearity in the parameters.3 In this form the number of kinetic parameters is reduced; only one parameter can be assigned to one metabolite (concentration). Lin-log format is usually written in normalized format. In metabolic systems normalization is usually based on steady (reference) state, before the perturbation is introduced. Reaction rate in a reference state is defined by reference flux r0, and extracellular and intra-cellular metabolite level c0, x0. In those conditions the effects of concentrations on the reaction rate are defined as elasticites e0 for dependent and independent metabolites: (4) (5) Including elasticities and reference state in the eq. (3) the following expression in a matrix form is derived for a network of reactions:10,11 (6) where i is the unit vector, Ex0 and Ec0 are matrices containing elasticites for intracellular and extracellular metabolites, x/x0 and c/c0 are vectors of dependent and independent metabolites and e/e0 is matrix containing relative enzyme activities. Elasticity es are calculated using expression: _ fjfrY Es r° 'UsJ (7) (2) For example, Michaelis-Menten kinetic rate in Lin-log approximation becomes: As in most of approximate kinetic models, Lin-log is not suitable for all range of concentrations. Asymptotic values of the Lin-log rates for very low and very high concentrations deviate from zero rate and saturation rate. However, when applied for intracellular processes in living cells, concentrations are close to homeostatic conditions and the limiting situations do not occur. 2. Experimental Lin-log approach was applied on 24 reaction rates included into the model of E. coli central metabolism upon glucose impulse. The metabolic network is presen- ted in Fig. 1. The dynamic model is a system of ordinary differential equations (ODE-s) which are composed of mass balances of the metabolites. The cofactors are experimentally measured and are included into the balances as polynomial interpolations. By Lin-log approximation the number of parameters is reduced from 132 for the original Michaelis-Menten model to 86 parameters of enzyme activities and reaction elasticities. For example, the most complicated kinetic expression included into the model is the one describing the kinetics of pyru-vate kinase (enzyme that catalyzes the transfer of phosphate group form phosphoenolpyruvate to ADP, yielding one molecule pyruvate and one molecule ATP), given here: Fig. 1 The biochemical network applied in the model of E. coli response upon glucose impulse. This kinetic expression includes 14 kinetic parameters (V k, K , , K.P ,, K ADP ,, K.PEP k, K PEP ,, V ,, v /,pk' eg,pk ' z,Pyr,pk' m,ADP,pk' z,PEP,pk' m,,PEP,pk' r,pk' K ATP k, K P ,, K. ADP ,, K. G6P ,, K. C5P ,, n ,,, nk2). m,ATP,pk rn,Pyr,pk' i,ADP,pk i,G6P,pk i,C5P,pk pk,P pk,27 Using lin-log approximation reaction rate for pyruvate ki-nase kinetic is transformed into: by the corresponding algorithms provided by Wolfram Research Mathematical For this case, with 10 metabolites, and 30 intervals of integration, the dimensions of the system eq. (12) are: Y =Y(300 x 1), A = A(300 x 86), and b = b(86 x 1). The system is over-determined, there are Number of the parameters is in this way reduced from 14 to 7 and non-linear expression is transformed to linear function of the parameters which is easy for estimation and analysis. The system of balances is integrated in intervals of 0.5 s., from ti-1 to ti yielding a set of linear equations for the unknown parameters b: (11) The linear model eq. (11) reflects the network topology and the preselected Lin-log kinetic terms. The elements of the matrix A are obtained by multiplication of the network stoichiometric matrix N and the vector obtained by corresponding interval integrations of time profiles for each intracellular metabolite: Yr=x(/, )-*(/,J (12) The interval integrals are calculated by numerical integration of the interval polynomial interpolation of the second order over the discrete set of experimental data. The interpolation and numerical integrations are evaluated 300 equations with 86 unknowns, and can be solved in the sense of the least squares method (LS). In a case when the "covariance" matrix is non-singular, the minimum variance yields a unique solution: b = (a' ■ a) ' ■ A' - Y (13) and in this work pseudoinversion method is applied, i.e. when a case of poorly conditioned systems often occurs, a solution should be found based on singular value decomposition (SVD) with account of elimination of "small" and insignificant eigenvalues which have a pronounced effect on amplification of numerical and measurement errors on parameter estimation. In order to cut-off the insignificant eigenvalues, the tolerance condition, defined as the ratio between the acceptable smallest and the maximal eigenvalue, is used provided by Wolfram Research Mathe-matica algorithm for Pseudolnverse algorithm15: b = PseudoInverse[A, Tolerance —> /?]■ Y (14) The tolerance is the ratio between the larges and the smallest eigenvalue included in the SVD algorithm. Here is applied the value of /3= 0.01 based on a reason that numerical accuracy is for a factor better than experimental error in data. The obtained solution minimizes the Euclidian norm of the model error: b' - minllY - A -b|L b (15) Alternatively, the same modeling approach can be set in the frame of decoupled equations with provided estimates of the derivatives rather than interval integrals (Voit and Almeida, 2004).17 This leads to a significant simplification in parameter estimation, especially for large biochemical networks, but the numerical burden of parameter estimation and instability is shifted from estimation kinetic parameters to estimation of derivates. 3. Results and Discussion Results of the Lin-log model are depicted in Fig. 2-6. and the parameters are given in Table 1. In calculation of the parameters, the enzyme activities and elasticities, the median values of the corresponding intracellular concentrations are taken as the reference states. In discussion are included results obtained by A. Tusek and Z. Kur-tanjek on the global parametric sensitivity of the Mic-haels-Menten model based on Fourier Amplitude Sensitivity Test (FAST) method.16 3.5 3.0 | 2.0 £ 1.S 1.0 0,5 0.0 i 't ft 'V * X 1 i 1 V * V. t f 1 • ^ j L. . V * • • ••• •. M 15 10 - 10 0.0 t/s Fig. 2. Glucose-6-phosphate (G6P) response to the external glucose impulse (G). The extracellular impulse is depicted with a dashed line, the experimental data for intracellular glucose-6-phosphate are given by symbols and the Lin-log model prediction by a continuous curve. Table 1. Evaluated elasticities of the fluxes derived from Lin-log kinetics. Linlog rate r / mmol L 1 s 1 Elasticities E/ s -1 A1 = 1.35106 £pl\r = 0.6817 Ep's = _1 736 G6P pts '-PEP ' : -0.0782 lin log r A2 = 0.40706 eGLC = -0.4269 Epgi F6P -2.85709 A3 = 1.48104 eF% = 0.043449 Epfk -0.87217 Epfk -0.19324 -- -0.274505 f = -0.1278905 jtofié^^l+egJ-.i^^jj A4 = -1.2104 E r - A [ I + ./.,f I s '"{gap0) + t"AD XnaD") ADP CdP0)) A7 = 1.34993 £oir = 0.60028 , gappep , '-NAD ' 0.3639603 E gappep_ ADP ~ 0.29365 A„ _ 0.491559 £p& _ 0.25725 Epyk - G6P 1.90026 cpyk -C5P _ -1.608228 ■■ -0.01727 zpykD _ -1.38766 ipyk _ -0.2027 A, 1.12469 . pepck PEP _ -0.61872 I _ 0.68556 PBP PEP FBP PEP In Fig.2. are given the experimental data for the extracellular glucose step impulse from glucose starvation to a saturation and the intracellular glucose 6-phosphate (G6P) response during first 12 seconds. Scatter of the experimental data is significant due to complexity and interference effects involved in on-line intracellular assay. The result implies initially very high activity of phosphotransferase system (PTS) leading to fast establishment of intracellular glucose equilibrium at average concentration of 1 mmol L-1. Prior to the impulse, for "negative" time the intracellular concentration is practically zero, and after the impulse saturation is established during the first second. PTS is activated by extracellular glucose, yielding £p£LC = 0.6817 s-1. The negative feedback mechanism for PTS regulation is due to the role of G6P which is revealed by the high negative elasticity £p£66P = -1.736 s-1. The result £pGsEP = -0.0782 s-1 shows negligible effect of phosphoenolpyruvate (PEP) on PTS, which could be explained that PEP needed for PTS is readily available by reactions from the network metabolites, supported by the initial high concentration of PEP 1.1 mmol L-1 prior to perturbation by the impulse, Fig. 5. Responses of F6P, the experimental and the lin-log model, are depicted in Fig. 3. There is a time delay of 0.5 s. after the perturbation during the first 2 s. the concentration shoots up to the maximum level of 1.4 mmol L-1, which afterwards oscillatory relaxes to average 1.4 mmol L-1. The influx to F6P pool, depicted as r2 in Fig. 1., is regulated through negative feedback of the external glucose and F6P. The dominant effect is due to F6P implied by the estimated corresponding elasticities ep(giC = -0.4269 s-1 and £pf6p = -2.85709 s-1. The oscillatory behavior inferred from the scatter of the experimental data is also predicted by the lin-log model. 1.4 1.2 1.0 0.8 0.6 0.4 0.2 • • • • * . • . V * • • / / • * 6 t/s 10 12 Fig. 3. Fructose-6-phosphate response to the glucose impulse. The experimental data are depicted by symbols and the Lin-log model prediction by a curve. The response of fructose1,6-biphosphae (FBP), presented in Fig. 4., shows qualitatively different behavior. Initial concentration is at maximum level 2.7 mmol L-1 is decreasing to a lower 2.3 mmol L-1. The time delay in the perturbation is more pronounced, and is about 2.5 s. Based on the FAST analysis is concluded that the key control is exerted by PEP on phosphofructokinase (PFK). However, the Lin-log model implies distributed negative feedback effects by F6P, FBP, PEP, ATP, and ADP, with the corresponding elasticities: £PpF)P = 0.043449 s-1, ePFF)P = -0.87217 s-1, ePpfkEP = -0.19324 s-1, £pfP = -0.27450F s-1, and ePpD)P = -0.1278905 s-1. The reverse flux is positively regulated by FBP with eFBFPPe = 1.06718 s-1. Fig. 4. Fructosebiphosphat response to the glucose impulse. The experimental data are depicted by symbols and the Lin-log model prediction by a curve. The immediate response to the impulse by PEP is shown in Fig. 5. Decrease from the initial PEP concentration 1.1 mmol L-1 to the minimum of 0.1 mmol L-1 occurs during PTS, and afterwards relaxes through oscillatory response to a previous level. Qualitatively and numerically the Lin-log model predicts the same behavior. The in-flux to PEP pull, denoted as r7 in Fig. 1, is positively activated by GAP, NAD and ADP with correspon- , gappep 'NAD ding elasticities: egpppep = 0.60028 s-0.3639603 s-1, and efppep = 0.29365 s-1. The second influx to PEP is revealed through negative enzyme activity Fig. 5. Phosphoenolepyruvate response to the glucose impulse. The experimental data are depicted by symbols and the Lin-log model prediction by a curve. A23 = -1.12469 of the flux from oxaloacetate OAA to PEP. It is due to the modeling assumption of a single nonreversible flux toward TCA cycle. However, due to the external glucose perturbation the sudden needed for PEP to facilitate PTS reverses the flux, leading to the negative estimation of the enzyme activity. The negative value should be interpreted as a product of the negative stoic-hiometric coefficient and the positive activity. The flux is positively activated by OAA and negatively regulated by PEP with the estimates for elasticities e pepck = 0.68556 s-1, e= -0.61872 s-1. However, the model limitations are illustrated by inadequate predictions of concentration of C5P pool, Fig. 6. The model fails to predict the oscillatory behaviour and the errors between the experimental data and the model show systemic error. One of the possible reasons for the negative results is due to the large amplitudes which contradict the fundamental property of lin-log model being limited to relatively small variations. In other words, Lin-log kinetics is applicable for modeling kinetic responses to limited perturbations around steady states leading to prediction of a behavior which is represented as a combination of exponential decay patterns. For prediction of the observed oscillatory response essential nonlinear kinetics and/or transport delays possibly need to be accounted in a model. 2.0 1.8 1.6 1.4 1.2 1.0 0.8 •• • ** * • . * / * • • • • • • • • • « • • * - s • • 6 t/s 10 Fig. 6. C5P pool response to the glucose impulse. The experimental data are depicted by symbols and the Lin-log model prediction by a curve. Comparison of the prediction errors by Lin-log and Michaelis-Menten model916 leads to conclusion that the both models have large errors, average relative error is in order of about 25%. Most of the experimental data dispersions and deviations from the model predictions are due to sampling and analytical errors during the rapid on-line intracellular assay. However, the main advantage of the method is the numerical simplicity of Lin-log and direct evaluation of elasticities which provides a significant advantage over models based on nonlinear Michael-Meneten kinetics. 4. Conclusions Presented is a method of the Lin-log modeling for a complex metabolic network. The models are expressed by the enzyme activities and elasticities of the metabolic fluxes. The balances are transformed to a set of linear equations which are solved by the singular value decomposition. In order to reduce the sensitivity of estimates of the model parameters to experimental and numerical errors applied is pseudo inversion with a cut-off eigenvalue threshold at the level of 0.01. The proposed procedure is illustrated by the Ling-log modeling of E. coli central metabolism transients under large perturbation during a shift from glucose starvation to glucose saturation by a step impulse. Although the level of perturbation is large, the Lin-log modeling captured the main features of the transients, especially for the key glycolytic metabolites. However, the model failed to predict response of the relatively "distant to glycolysis" metabolites, for example for the C5P pool. The model errors are greatly affected by dispersion of the experimental data due to errors in sampling and on-line rapid intracellu-lar composition measurements. The main advantages of Lin-log models are in simplicity of the modeling procedure and direct evaluation of enzyme activities and flux elasticities. Based on the estimated elasticities are determined the key flux regulation metabolites. The presented methodology may be potentially applied for rational planning of restructuring of metabolic networks by genetic engineering. 5. References 1. J. J. Heijnen, Biotechnol. Bioeng. 2005, 91(5), 534-545. 2. P. Mendes, D.B. Kell, Bioinformatics 1998, 14, 869-883. 3. I. E. Nikerel, W.A. van Gulik, J.J. Heijnen, Simulation News Europe, 2007, 17(1), 19-26. 4. M.A. Savageau, J. Theor. Biol. 1969, 25, 365-369. 5. J. Nielsen, Biochem. J. 1997, 58, 125-132. 6. L. Onsager, Phys. Rev. 1931, 37, 405-426. 7. D. Degenring, C. Fromel, G. Dikata, R. Takors, J. Process Control 2004, 14, 729-745. 8. D. Degenring. Erstellung und Validierung mechanistischer Modelle für mikrobiellen Stoffwechsel zur Auswertung von Substrat-Puls-Experimenten. Dissertation. University of Rostock, 2004. 9. S. Ceric, Z. Kurtanjek, Chem. Biochem. Eng. Q. 2006, 20(3), 243-253. 10. H. Link, D. Weuster-Botz, Metab. Eng. 2007, 9, 433-441. 11. M. T. A. P. Kresnowati, W. A. van Winden, J.J. Heijnen, Metab. Eng. 2005, 7, 142-153. 12. J. J. Heijnen, W.M. van Gulik, H. Shimizu, G. Stephanopou-los, Metab. Eng. 2004, 6, 391-400. 13. L. Wu, W. Wang, W. A. van Winden, W. M. van Gulik, J. J. Heijnen, Eur.J.Biochem. 2004, 271, 3348-3359. 14. D. Visser, J. J. Heijnen, Metab. Eng. 2003 J, 164-176. 15. Wolfram Research Matematica, 2008, v. 6.1. 16. A. Tusek, Z. Kurtanjek (I. Troch, F. Breitenecker, Eds), Proceedings of 6th Vienna International Conference on Mathe- matical Modelling, ARGESIM Publishing House, Vienna, 2009, 1178-1184. 17. E. O. Voit, J. Almeida, Bioinformatics, 2004, 20(11), 16701681. Povzetek Za analizo kompleksne regulacije biokemijskih reakcij kot bistvene lastnosti metabolizma se uporabljajo matematični modeli ki opisujejo dinamiko metabolnih poti. V raziskavah analize regulacije centralnega metabolizma Escherichia co-li smo uspešno uporabili aproksimativni linearno-logaritmični (Lin-log) kinetični model, ki omogoča direktno določitev elastičnosti hitrosti reakcije, ki je ključni parameter pri analizi metabolne regulacije (metabolic control analysis MCA). Rezultati so predstavljeni v obliki vrednosti encimskih aktivnosti in elastičnosti hitrosti. S pomočjo teh dveh parametrov je bil omogočen vpogled v regulacijo metabolnih fluksov. Primerjava rezultatov simulacije z Lin-log in Michaelis-Men-ten modeloma sta pokazala, da so odstopanja istega reda velikosti.