
# The effect of model rescaling and normalization on sensitivity analysis on an example of a MAPK pathway model

*EPJ Nonlinear Biomedical Physics*
**volume 4**, Article number: 3 (2016)

## Abstract

### Background

The description of intracellular processes based on chemical reaction kinetics has become a standard approach in the last decades, and parameter estimation poses several challenges. Sensitivity analysis is a powerful tool in model development that can aid model calibration in various ways. Results can for example be used to simplify the model by elimination or fixation of parameters that have a negligible influence on relevant model outputs. However, models are usually subject to rescaling and normalization to reference experiments, which changes the variance of the output. Thus, the results of the sensitivity analysis may change depending on the choice of these rescaling factors and reference experiments. Although it might intuitively be clear, this fact has not been addressed in the literature so far.

### Methods

In this study we investigate the effect of model rescaling and additional normalization to a reference experiment on the outcome of two different sensitivity analyses. Results are exemplified on a model for the MAPK pathway module in PC-12 cell lines. For this purpose we apply local sensitivity analysis and a global variance-based method based on Sobol sensitivity coefficients, and compare the results for differently scaled and normalized model versions.

### Results

Results indicate that both sensitivity analyses are invariant under simple rescaling of variables and parameters with constant factors, provided that sensitivity coefficients are normalized and that the parameter space is appropriately chosen for Sobol’s method. By contrast, normalization to a reference experiment that also depends on parameters has a large impact on the results of any sensitivity analysis, and in particular complicates the interpretation.

### Conclusion

This work shows that, in order to perform sensitivity analysis, it is necessary to take into account the dependency on parameters of the reference condition when working with normalized model versions.

## Background

Ordinary differential equations are the most commonly used mathematical approach to describe the dynamics of intracellular signaling pathways. They are often based on chemical reaction kinetics, and standard ways to describe reaction rates exist. Several models have been proposed for different signaling pathways, and analysis methods have been developed and applied for further model investigation. Many of these models are scaled and normalized to a predefined reference condition. Appropriate rescaling can simplify the model and, for example, remove non-identifiable parameters (see e.g. [1]). Normalization is often needed to compare models to normalized experimental data. Experimental data generally have to be normalized whenever measurements do not yield absolute amounts, but only values that are assumed to be proportional to these amounts. This is often the case, for example for Western blot or FACS data. Western blotting is a technique to quantify protein amounts and their activity states. Detection works via the quantification of light signals from antibodies that specifically bind to the protein under study. These light signals have to be normalized in a two-step procedure in order to enable a comparison between different replicates. In the first step, signals are normalized to a loading control in order to minimize artifacts that are due to different loading amounts. In the second step, since raw signals depend on the properties of the antibodies and on the particular membranes and chemicals in use, they are usually additionally normalized to a reference condition (for examples see [2, 3]). This second normalization is required for a comparison of data from different replicates. Since Western blotting is increasingly used for quantitative analysis, several studies address experimental protocols, testing of linear ranges, and proper normalization procedures [4, 5].

In this work we use the term rescaling whenever model parameters and variables are multiplied by constant factors, which is often done to obtain dimensionless models. In contrast, normalization refers to conditioning data on a reference experiment, although both terms are often used synonymously elsewhere. From a modeling point of view, rescaling and normalization must sometimes be treated differently, since the reference experiment usually also depends on parameters and, in particular, is itself subject to variance.

Although normalized models are ubiquitous, the effect of normalization on model calibration and analysis has not been well investigated so far and is poorly understood. From a modeling point of view, the effect of normalization in a statistical framework for state estimation has been investigated [5–7]. Results indicate that normalization might also have a crucial effect on parameter estimation.

Here we consider the effect of rescaling and normalization on sensitivity analysis. Sensitivity analysis is one of the most important tools in model development and can, for example, be used for model reduction, calibration, validation, robustness analysis or the design of experiments. This type of model analysis is widely applicable in various scientific fields such as engineering, physics, economics and the social sciences. Some instructive examples from different applications are given in [8]. Various mathematical definitions of sensitivity functions exist, with different methods for their computation. However, the common basic idea is to quantify the variation of an output of a mathematical model due to variations of some input quantity, such as a model parameter or an initial condition. In some cases, the output of interest is a time-invariant function of the input, for example the steady state of the system. For dynamical systems, described for example by ODE models, the output of interest is generally the whole trajectory of the state variable, which makes the analysis more challenging. An introduction to sensitivity theory for continuous- and discrete-time dynamical systems is provided in [9], with a particular focus on linear systems. This book is mainly of interest for control engineering applications, since it defines sensitivity functions in the time and frequency domains and investigates optimal control systems. Zi et al. [10] and Kim et al. [11] instead provide good reviews of the application of different sensitivity analysis methods in systems biology, including advice on toolboxes and implementations and on the role of sensitivity analysis in model development. Further applications of global sensitivity methods to different biologically inspired example models are described in [12] and [13], the former with a special focus on computational efficiency. Kent et al. [13] define a sensitivity-based robustness measure, which is evaluated on five different models, including a model for the steady states in the MAPK signaling module.

In this study, we investigate the impact of rescaling and normalization to a reference experiment on sensitivity analysis using two examples: a toy model describing a simple reversible reaction, which serves to illustrate the results, and, as a real-world case study, a model of the MAPK signaling pathway module in PC-12 cell lines. For the latter we use a model that was calibrated to Western blot data from an experimental study by Santos et al. [14]. Parameter estimation was done via a sampling-based Bayesian approach. For this study we use the maximum-a-posteriori (MAP) estimator as a point estimate. We compare results of local and global, variance-based sensitivity analysis methods on three model versions. The native model version A describes the dynamics of the activities of proteins that take part in the signaling cascade. In model B all variables are rescaled to their respective total protein amounts, which are assumed to be conserved. Thus variables represent fractions of total amounts. Finally, in order to compare this model to Western blot data, the model output was additionally normalized to a reference condition, which defines model variant C. This last model version was the one used in [15] for parameter estimation. We decided to use the Sobol sensitivity analysis method [16, 17], since it is one of the most general methods and, unlike other methods, does not rely on monotone or even linear input/output relationships [10]. Moreover, Sobol sensitivity indices have been shown to correlate highly with other sensitivity measures, such as indices from Extended Fourier Amplitude Sensitivity Tests (FAST) and Partial Rank Correlation Coefficients (PRCC), indicating a kind of robustness of these measures [18]. They have furthermore recently been pointed out as advantageous in other respects in connection with pharmacology models [19].

The paper is structured as follows. We start by deriving general results for the effect of rescaling and normalization on local and global sensitivity analysis. These are compared and discussed for all three model versions of the toy model, for which some effects can be illustrated by analytic calculations. Then we introduce the ODE modeling approach for the MAPK signaling pathway module and discuss numerically obtained results for this case study. Details of the sensitivity methods can be found in the Methods section.

## Results and discussion

### The effect of rescaling and normalization on sensitivity analysis

#### Local sensitivity analysis

Local sensitivity analysis investigates the influence of a parameter $p$ on a model output $y_{i}(t,p)$ around a reference parameter set $p^{*}$. Sensitivity coefficients are formally defined as first-order partial derivatives of the model output $y_{i}$ with respect to the parameter $p_{j}$,

$$\frac{\partial y_{i}(t,p)}{\partial p_{j}}\bigg|_{p=p^{*}}.$$

In our analysis we consider the normalized sensitivity coefficients,

$$s_{ij}(t) = \frac{p_{j}^{*}}{y_{i}(t,p^{*})}\,\frac{\partial y_{i}(t,p)}{\partial p_{j}}\bigg|_{p=p^{*}}.$$

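The coefficients $s_{ij}(t)$ are invariant under rescaling of model parameters and variables, which can easily be verified. We consider two models 1 and 2 that differ only in their scales. For simplicity, model 1 has a single output $y(t,p)$ that depends on a single parameter $p$. Both are rescaled in model version 2, i.e. $\tilde p=bp$ and $\tilde y(t,\tilde p)=ay(t,p)$. The normalized sensitivity $\tilde s(t)$ of model 2 about a reference parameter value $\tilde p^{*}=bp^{*}$ reads

$$\tilde s(t) = \frac{\tilde p^{*}}{\tilde y(t,\tilde p^{*})}\,\frac{\partial \tilde y(t,\tilde p)}{\partial \tilde p}\bigg|_{\tilde p=\tilde p^{*}} = \frac{bp^{*}}{ay(t,p^{*})}\cdot\frac{a}{b}\,\frac{\partial y(t,p)}{\partial p}\bigg|_{p=p^{*}} = \frac{p^{*}}{y(t,p^{*})}\,\frac{\partial y(t,p)}{\partial p}\bigg|_{p=p^{*}},$$

which equals the normalized sensitivity coefficient $s(t)$ of model version 1.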

Normalization of the model output to a (parameter-dependent) reference experiment at time $t^{*}$,

$$y^{\prime}(t,p) = \frac{y(t,p)}{y(t^{*},p)},$$

leads to the following normalized sensitivities $s^{\prime}(t)$:

$$s^{\prime}(t) = \frac{p^{*}}{y^{\prime}(t,p^{*})}\,\frac{\partial y^{\prime}(t,p)}{\partial p}\bigg|_{p=p^{*}} = s(t) - s(t^{*}).$$

Thus, the local sensitivity coefficients $s^{\prime}(t)$ are shifted by the respective local sensitivity $s(t^{*})$ of the reference experiment. Hence all sensitivity coefficients become zero at the reference condition, i.e. $s^{\prime}_{ij}(t^{*})=0$.

In summary, normalized local sensitivity coefficients are invariant under rescaling of model variables and parameters. Additional normalization to a parameter dependent reference experiment shifts the sensitivity courses by the local sensitivity coefficient of the reference experiment. Hence the sensitivity coefficient becomes zero at the reference experiment. Positive values $s^{\prime }_{ij}(t)$ indicate that the relative change of the respective concentration exceeds that of the reference experiment, while negative values indicate that the relative change in the reference experiment is larger. In order to interpret these results in terms of the total concentrations, one has to take the sensitivity value of the reference experiment into account.

#### Variance-based global sensitivity analysis

Variance-based sensitivity analysis decomposes the variance of the output $Y$ due to variations in the input parameters into contributions from different inputs. Here we use the Sobol sensitivity analysis method (Sobol method for short), which can be applied to any non-linear differential equation model. Generally, this method decomposes the variance of each output into a sum of $2^{k}-1$ terms, $k$ denoting the number of influential parameters, that describe the contribution of each possible parameter subgroup to the total variance. A more detailed mathematical description of this method is provided in [16, 17] and is recapitulated in the Methods section. The computational effort can be reduced drastically by considering only the so-called first order and total effect sensitivity indices $S_{i}$ and $S_{T_{i}}$, $i=1,\ldots,k$. The first order index $S_{i}$ quantifies the contribution of variations in parameter $P_{i}$ alone to the total output variance, while $S_{T_{i}}$ is the overall effect of parameter $P_{i}$, including its interactions with all possible combinations of the other parameters. Thus, $S_{T_{i}}\geq S_{i}$, and the difference quantifies the interaction of parameter $P_{i}$ with the other model parameters. Furthermore, $S_{T_{i}},S_{i}\in[0,1]$, and $S_{T_{i}}=0$ implies that $P_{i}$ has no effect at all on the output, while $S_{T_{i}}=1$ indicates that the output variance can be assigned completely to the variance in the factor $P_{i}$.
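As a small worked illustration of these definitions (not taken from the case studies), assume the hypothetical output $Y=P_{1}P_{2}$ with independent $P_{1},P_{2}\sim U[0,1]$. Both indices then follow directly from the moments of the uniform distribution:

$$\operatorname{Var}(Y)=\mathbb{E}(P_{1}^{2})\,\mathbb{E}(P_{2}^{2})-\left(\mathbb{E}(P_{1})\,\mathbb{E}(P_{2})\right)^{2}=\frac{1}{9}-\frac{1}{16}=\frac{7}{144},\qquad \operatorname{Var}_{P_{1}}\left(\mathbb{E}(Y|P_{1})\right)=\operatorname{Var}\left(\frac{P_{1}}{2}\right)=\frac{1}{48},$$

$$S_{1}=S_{2}=\frac{1/48}{7/144}=\frac{3}{7},\qquad S_{T_{1}}=S_{T_{2}}=1-S_{2}=\frac{4}{7}.$$

The difference $S_{T_{i}}-S_{i}=1/7$ is the share of the output variance that is caused by the interaction of the two parameters, so $S_{T_{i}}\geq S_{i}$ holds with strict inequality in this example.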

In order to show the effect of rescaling on these two sensitivity indices, we use the variance-based definition, which reads for $S_{i}$

$$S_{i} = \frac{\operatorname{Var}_{P_{i}}\left(\mathbb{E}_{\sim P_{i}}(Y|P_{i})\right)}{\operatorname{Var}(Y)}.$$

In this approach, parameters $P$ and outputs $Y$ are random variables, and $Y$ is a short notation for a single output variable $y_{j}(t,p)$ at a particular time point $t$. $\mathbb{E}_{\sim P_{i}}(Y|P_{i}=p_{i}^{*})$ denotes the expectation value of $Y$ when varying all parameters except $P_{i}$ ($\sim P_{i}:=P\setminus P_{i}$), which is fixed to a value $p_{i}^{*}$. $\operatorname{Var}_{P_{i}}$ is the variance of this expected value when varying $P_{i}$ in a predefined range. $S_{i}$ is $\operatorname{Var}_{P_{i}}$ normalized to the total variance $\operatorname{Var}(Y)$ of $Y$. The Sobol method assumes that all parameters $P_{i}$ are independent and uniformly distributed random variables, $P_{i}\sim U[p_{i}^{l},p_{i}^{u}]$. If upper and lower bounds are appropriately chosen, $S_{i}$ is invariant under rescaling of parameters and variables. To show this, we consider again the first-order sensitivity index of the output $\tilde Y=aY$ of a rescaled model version with parameters $\tilde P = bP$:

$$\tilde S_{i} = \frac{\operatorname{Var}_{\tilde P_{i}}\left(\mathbb{E}_{\sim \tilde P_{i}}(\tilde Y|\tilde P_{i})\right)}{\operatorname{Var}(\tilde Y)} = \frac{a^{2}\operatorname{Var}_{P_{i}}\left(\mathbb{E}_{\sim P_{i}}(Y|P_{i})\right)}{a^{2}\operatorname{Var}(Y)} = S_{i}.$$

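From these calculations we see that $\tilde S_{i} = S_{i}$ if we choose $\tilde P_{i}\sim U[bp_{i}^{l},bp_{i}^{u}]$. Invariance of the total effect index can be shown accordingly. Using

$$S_{T_{i}} = 1 - \frac{\operatorname{Var}_{\sim P_{i}}\left(\mathbb{E}_{P_{i}}(Y|\sim P_{i})\right)}{\operatorname{Var}(Y)}$$

we get

$$\tilde S_{T_{i}} = 1 - \frac{a^{2}\operatorname{Var}_{\sim P_{i}}\left(\mathbb{E}_{P_{i}}(Y|\sim P_{i})\right)}{a^{2}\operatorname{Var}(Y)} = S_{T_{i}}.$$

The change of the Sobol indices caused by normalization to a parameter-dependent reference experiment is generally more difficult to assess. Using $Y^{\prime}=Y/Y^{r}$, where $Y^{r}$ denotes the reference experiment, and the formal definition of $S_{i}$, we get

$$S^{\prime}_{i} = \frac{\operatorname{Var}_{P_{i}}\left(\mathbb{E}_{\sim P_{i}}\left(\frac{Y}{Y^{r}}\Big|P_{i}\right)\right)}{\operatorname{Var}\left(\frac{Y}{Y^{r}}\right)}.$$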

Thus, $S^{\prime }_{i}$ and $S^{\prime }_{Ti}$ contain expectation values and variances of ratios of random variables, which cannot be resolved further in the general case. Hence, such a normalization to a reference experiment might change the outcome of this sensitivity analysis completely. Furthermore, since ratio distributions can be difficult to handle and, in particular, their moments might not even be defined [20, 21], normalization to a reference experiment can considerably complicate this kind of sensitivity analysis. At the very least, convergence has to be checked carefully for a Monte Carlo implementation of the method.

### Case study I: sensitivity analysis for a simple reversible reaction

For illustration purposes, we first consider the effect of rescaling and normalization on a simple reversible reaction,

which is described via mass action kinetics and can be solved analytically. State variables of model A correspond to absolute protein concentrations. For this model version and initial condition $x^{A}(0)={x^{A}_{0}}$ we get

Model version B is obtained via rescaling the state variable to ${x^{A}_{0}}$ , i.e. $x^{B}(t) = {x^{A}(t)}/{{x^{A}_{0}}}$ ,

with parameters $k_{1}^{B}=k_{1}/x^{A}_{0}$ and $k_{2}^{B}=k_{2}$. For model version C we consider the case of additional normalization of the output variable to the state of the system at a reference time point $t^{*}=1$, i.e. $y^{C}(t)=x^{B}(t)/x^{B}(t^{*})$.
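A form of the three model versions that is consistent with the relations stated in this section (steady state $k_{1}/k_{2}$, $k_{1}^{B}=k_{1}/x_{0}^{A}$, $k_{2}^{B}=k_{2}$) can be sketched as follows. The species name $X$ and the reading of the reaction as zero-order production with first-order removal are assumptions made for this illustration:

$$\emptyset \underset{k_{2}}{\overset{k_{1}}{\rightleftharpoons}} X,\qquad \dot x^{A}=k_{1}-k_{2}x^{A},\qquad x^{A}(t)=\frac{k_{1}}{k_{2}}+\left(x_{0}^{A}-\frac{k_{1}}{k_{2}}\right)e^{-k_{2}t},$$

$$\dot x^{B}=k_{1}^{B}-k_{2}^{B}x^{B},\qquad x^{B}(t)=\frac{k_{1}^{B}}{k_{2}^{B}}+\left(1-\frac{k_{1}^{B}}{k_{2}^{B}}\right)e^{-k_{2}^{B}t},$$

$$y^{C}(t)=\frac{x^{B}(t)}{x^{B}(1)}.$$

Dividing the equation for $x^{A}$ by $x_{0}^{A}$ gives the equation for $x^{B}$ with $x^{B}(0)=1$, which is where the rescaled rate constant $k_{1}^{B}=k_{1}/x_{0}^{A}$ comes from.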

#### Case study I: local sensitivity analysis

Normalized local sensitivities $s_{k_{1}}(t)$ of model outputs with respect to the parameter $k_{1}$ can be calculated analytically for all three model variants,

It can easily be seen that $s^{A}_{k_{1}}=s^{B}_{{k_{1}^{B}}}$ . Moreover, $s^{C}_{{k_{1}^{B}}}$ is obtained via shifting $s^{B}_{{k_{1}^{B}}}$ by the sensitivity $- s^{B}_{{k_{1}^{B}}}(1)$ of the reference experiment. Figure 1 illustrates these results. We also remark that ${k_{2}^{B}}$ remains the only influential parameter for model C in case of ${x_{0}^{A}}=0$ .
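Both statements can be checked numerically. The following is a minimal sketch, assuming closed-form solutions of the form $x(t)=k_{1}/k_{2}+(x_{0}-k_{1}/k_{2})e^{-k_{2}t}$ for the toy model (an assumption that is consistent with the steady state $k_{1}/k_{2}$), hypothetical parameter values, and central finite differences in place of the analytic derivatives.

```python
import math

def x_A(t, k1, k2, x0):
    """Assumed closed-form solution of model A (absolute concentration)."""
    return k1 / k2 + (x0 - k1 / k2) * math.exp(-k2 * t)

def x_B(t, k1B, k2B):
    """Model B: state rescaled by x0, so that x_B(0) = 1."""
    return k1B / k2B + (1.0 - k1B / k2B) * math.exp(-k2B * t)

def y_C(t, k1B, k2B, t_ref=1.0):
    """Model C: model B normalized to its own value at the reference time."""
    return x_B(t, k1B, k2B) / x_B(t_ref, k1B, k2B)

def norm_sens(f, p, rel_step=1e-6):
    """Normalized sensitivity (p / f) * df/dp by central finite differences."""
    h = rel_step * p
    return p / f(p) * (f(p + h) - f(p - h)) / (2.0 * h)

k1, k2, x0 = 1.5, 1.2, 0.5      # hypothetical values, chosen for illustration
k1B = k1 / x0                    # rescaled rate constant of model B

rows = []
for t in (0.25, 1.0, 3.0):
    sA = norm_sens(lambda p: x_A(t, p, k2, x0), k1)
    sB = norm_sens(lambda p: x_B(t, p, k2), k1B)
    sC = norm_sens(lambda p: y_C(t, p, k2), k1B)
    sB_ref = norm_sens(lambda p: x_B(1.0, p, k2), k1B)
    rows.append((t, sA, sB, sC, sB_ref))
    print(f"t={t}: sA={sA:.6f} sB={sB:.6f} sC={sC:.6f} sB-sB(1)={sB - sB_ref:.6f}")
```

In this sketch the columns `sA` and `sB` agree, and `sC` coincides with `sB - sB(1)`, so it vanishes at the reference time point.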

#### Case study I: global sensitivity analysis

In order to illustrate the effect of rescaling and normalization on the outcome of the global sensitivity analysis, we consider the case of uniformly distributed parameters $k_{1},k_{2}\sim U(1,2)$. In the following, we focus our analysis on steady state sensitivity, i.e. $y^{A}=\bar y^{A}=k_{1}/k_{2}$. The resulting probability density $f_{Y^{A}}\left(y^{A}\right)$ of this output can be derived by geometrical arguments and reads

with variance

The first order Sobol indices

will be calculated analytically in the following. We start with $s^{A}_{k_{1}}$. The measure $\mathbb{E}_{k_{2}}(Y^{A}|K_{1}=k_{1}^{*})$ denotes the expected value of $Y^{A}$ for a fixed value $k_{1}^{*}$ and is obtained via a density transformation. Setting $y^{A}=g(k_{2})=k_{1}^{*}/k_{2}$, which is strictly monotonically decreasing, we get

The expectation value $\mathbb {E}_{k_{2}}(Y^{A}|K_{1}=k_{1}^{*})$ and $\operatorname {Var}_{k_{1}}\left (\mathbb {E}_{k_{2}} (Y^{A}|K_{1}=k_{1}^{*}) \right)$ can be derived from this density via

and

which gives $s_{k_{1}}^{A}\approx 0.4675$ .

The index $s_{k_{2}}^{A}$ is obtained in the same way: Setting $g(k_{1})={k_{1}}/{k_{2}^{*}}$ , we get

which gives

and

Thus, we obtain $s_{k_{2}}^{A}\approx 0.5135$ . The interaction effects can be extracted via

which leads to total effect indices

This analysis is visualized in Fig. 2. Overall, it shows that varying the parameter $k_{2}$ has a slightly higher impact on the output variance than variations of the parameter $k_{1}$, though both contributions are of the same order of magnitude. Furthermore, the interaction effect of both parameters is small, and the total effect indices are not much larger than the respective first order indices.

Sobol sensitivity indices are the same for model version B, provided that ${k_{1}^{B}}={k_{1}}/{{x_{0}^{A}}}$ is sampled from ${k_{1}^{B}}\sim U\left (1/{x_{0}^{A}},2/{x_{0}^{A}}\right)$ .

The steady state output of model version C reads

Again ${k_{1}^{B}}$ is sampled from ${k_{1}^{B}}\sim U\left (1/{x_{0}^{A}},2/{x_{0}^{A}}\right)$ . Since a completely analytical treatment analogous to model version A is difficult here, Sobol indices were calculated via Monte Carlo simulations, as illustrated in Fig. 3. Interestingly, the Sobol indices of the normalized model version C are very different from model versions A and B, namely

Figure 3 shows that the shape of the density $f_{Y^{C}}\left(y^{C}\right)$ is different from $f_{Y^{A}}\left(y^{A}\right)$ (bottom left). Furthermore, as can be seen from the Sobol indices, most of the variance of $Y^{C}$ is attributed to variations in the parameter $k_{2}^{B}$, while the parameter $k_{1}^{B}$ has only a marginal influence. The interaction effect between both parameters is also not very large. These results are reflected in the panels on the right-hand side: the remaining variance in $Y^{C}$ when fixing $k_{1}^{B}$ at a certain value (top right) is much higher than the respective variance when fixing $k_{2}^{B}$ (bottom right), and this is true for all possible values of $k_{1}^{B}$ and $k_{2}^{B}$. Moreover, while the mean value $\mathbb{E}_{Y^{C}|k_{1}^{B,*}}\left(Y^{C}\right)$ hardly changes as a function of $k_{1}^{B,*}$, $\mathbb{E}_{Y^{C}|k_{2}^{B,*}}\left(Y^{C}\right)$ varies strongly as a function of $k_{2}^{B,*}$, resulting in a small first order Sobol index $s_{k_{1}^{B}}^{C}$ and a large Sobol index $s_{k_{2}^{B}}^{C}$.
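A Monte Carlo estimate of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions, not the implementation used for Fig. 3: it uses a two-matrix sampling design with a Saltelli-type estimator for the first order indices and a Jansen-type estimator for the total effects, a hypothetical initial value `X0 = 0.1` (the results for model C depend on this choice), and the assumed closed-form steady state $y^{C}=\bar x^{B}/x^{B}(1)$ with $x^{B}(t)=k_{1}^{B}/k_{2}^{B}+(1-k_{1}^{B}/k_{2}^{B})e^{-k_{2}^{B}t}$.

```python
import math
import random

def sobol_indices(f, bounds, n=40_000, seed=0):
    """Monte Carlo estimates of first order and total effect Sobol indices."""
    rng = random.Random(seed)
    k = len(bounds)

    def draw():
        # One parameter vector with independent uniform components
        return [lo + (hi - lo) * rng.random() for lo, hi in bounds]

    A = [draw() for _ in range(n)]
    B = [draw() for _ in range(n)]
    fA = [f(*row) for row in A]
    fB = [f(*row) for row in B]

    pooled = fA + fB
    mean = sum(pooled) / (2 * n)
    var = sum((v - mean) ** 2 for v in pooled) / (2 * n - 1)

    first, total = [], []
    for i in range(k):
        # AB_i: rows of A with column i replaced by the values from B
        fAB = [f(*(a[:i] + [b[i]] + a[i + 1:])) for a, b in zip(A, B)]
        v_i = sum((yb - mean) * (yab - ya)
                  for yb, yab, ya in zip(fB, fAB, fA)) / n
        vt_i = sum((ya - yab) ** 2 for ya, yab in zip(fA, fAB)) / (2 * n)
        first.append(v_i / var)
        total.append(vt_i / var)
    return first, total

X0 = 0.1  # hypothetical initial concentration x_0^A (assumption)

def y_A(k1, k2):
    return k1 / k2                       # steady state of model A

def y_B(k1B, k2B):
    return k1B / k2B                     # steady state of model B

def y_C(k1B, k2B):
    ss = k1B / k2B                       # steady state of model B ...
    return ss / (ss + (1.0 - ss) * math.exp(-k2B))   # ... divided by x_B(1)

first_A, total_A = sobol_indices(y_A, [(1.0, 2.0), (1.0, 2.0)])
first_B, total_B = sobol_indices(y_B, [(1.0 / X0, 2.0 / X0), (1.0, 2.0)])
first_C, total_C = sobol_indices(y_C, [(1.0 / X0, 2.0 / X0), (1.0, 2.0)])

for name, s, st in (("A", first_A, total_A), ("B", first_B, total_B),
                    ("C", first_C, total_C)):
    print(name, [round(v, 3) for v in s], [round(v, 3) for v in st])
```

Because all three calls use the same seed, models A and B are evaluated on the same underlying draws, which makes the invariance under rescaling visible up to rounding. Repeating the calls with increasing `n` is a simple way to check convergence of the estimates for the normalized model.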

Overall, results on this simple toy model illustrate the effect of rescaling and normalization on sensitivity analysis.

### Case study II: a model for the MAPK module in PC-12 cell lines

The mitogen-activated protein kinase (MAPK) cascade is a conserved signaling module that is part of various signaling pathways. It is a three-tiered phosphorylation cascade, which involves the proteins Raf, MEK and ERK. Raf is activated by Ras upon stimulation, which then triggers the double phosphorylation of MEK. Phosphorylated MEK in turn phosphorylates and thereby activates ERK, which also requires double phosphorylation to become fully active. ERK has many substrates that regulate different cellular fates. The MAPK pathway is a well-investigated signaling module from both an experimental and a modeling point of view [22–24]. It can show a rich variety of behaviors such as oscillations, ultrasensitivity, or bistability and has been investigated in different contexts.

Specificity in the response of the MAPK module to different ligands, which ensures a reliable processing of signals, is achieved through different courses in ERK activity, which in turn regulate ERK substrate activation. In particular, the MAPK module involves several feedback regulations, which are important to shape the ERK response. Most importantly, ERK interacts with Raf via different mechanisms in a context-dependent manner. This has been exemplified in a study with PC-12 cell lines, in which the MAPK signaling pathway was investigated upon stimulation with epidermal growth factor (EGF) and nerve growth factor (NGF) [14]. PC-12 cells show a transient ERK activity after stimulation with EGF, and cells start to proliferate. In contrast, ERK activity is sustained for at least one hour after stimulation with NGF, and NGF triggers differentiation.

Here we analyze a model that has been calibrated to experimental data from Santos et al. [14] and involves a context-dependent feedback term from ERK to Raf. Details of the modeling process are described in [15]. In this model, mass action kinetics is used to describe the phosphorylation and dephosphorylation reactions. Feedback from ERK to Raf is described in a non-linear way. Assuming mass conservation for the total amounts of proteins in the cascade,

allows us to eliminate the variables Raf, MEK and ERK. The resulting model has four state variables, which correspond to pRaf, ppMEK, pERK and ppERK. For our sensitivity analysis, we focus on the response of the system to stimulation with NGF in the control case, which allows us to simplify the model in [15] accordingly. The model structure is shown in Fig. 4 a.

Model variant A is an unnormalized version, whose state variables correspond to the actual amounts of these four proteins. The ODE model corresponding to model version A is shown in Fig. 4 b.

Reaction rate constants are denoted by $k^{(+/-)}_{i}$ and $\tilde k^{(+/-)}_{i}$. The input $u(t)$ mimics transient Ras activation upon stimulation and is described by a sigmoidally decreasing function. The positive feedback from ERK to Raf is described by a Hill-type function. The Hill coefficients were set to $m=5$ and $M=3$.

For this model version we define the outputs

where $\tilde p$ denotes the vector of parameters of the system.

This model was calibrated to Western blot data, which provide light intensities that are scaled to the signal of the respective total protein. Hence the measured signals are proportional to the fraction of phosphorylated protein concentrations relative to the total protein amounts,

The factors of proportionality $\alpha_{i}$ account for differences in binding affinities of the antibodies and variations in membranes. Since we only have measurements for pRaf, ppMEK and ppERK, but not for the intermediate product pERK, we set without loss of generality $\alpha_{3}=1$. The transformed system reads

Compared to model version A, also some of the rate constants had to be rescaled, in particular,

This defines model B, whose variables correspond to quantities proportional to the fractions of phosphorylated proteins. Hence the output variables of model B are

where $p$ denotes the vector of rescaled parameters, obtained from $\tilde{p}$.

The scaling factors $\alpha_{i}$ are unknown and can be very different for different proteins and different replicates. In order to eliminate these factors and to enable a comparison between different experimental replicates, data and model are normalized to a reference experiment. Here we adapted our choice of reference experiment to the data in Santos et al. [14], where the signals at $t=5$ min were set to one individually for each protein. Hence measurements were compared to the third set of model outputs

which define model C. The model parameters, which we use here for our analysis, correspond to the MAP estimate in [15] and are listed in Table 1.
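The way in which this normalization removes the unknown scaling factors can be illustrated with a short sketch. The time grid, the phosphorylated fractions and the two factors below are hypothetical values chosen for illustration; only the reference time point $t=5$ min is taken from the text.

```python
def normalize_to_reference(times, signal, t_ref):
    """Divide a time course by its own value at the reference time point."""
    ref = signal[times.index(t_ref)]
    return [v / ref for v in signal]

times = [0, 5, 15, 30, 60]                   # minutes (hypothetical grid)
fraction = [0.02, 0.80, 0.65, 0.60, 0.55]    # hypothetical phospho-fractions

# Two replicates of the same protein with different, unknown factors alpha
replicate_1 = [3.7 * v for v in fraction]
replicate_2 = [0.4 * v for v in fraction]

norm_1 = normalize_to_reference(times, replicate_1, t_ref=5)
norm_2 = normalize_to_reference(times, replicate_2, t_ref=5)

for t, a, b in zip(times, norm_1, norm_2):
    print(f"t={t:2d} min: replicate 1 -> {a:.4f}, replicate 2 -> {b:.4f}")
```

Both normalized time courses coincide and equal one at $t=5$ min. The price is that the reference value, and with it every normalized output, depends on the model parameters.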

#### Case study II: local sensitivity analysis

Figure 5 shows the normalized local sensitivity coefficients $s_{ij}(t)$ for model variants A and B, which have been calculated via the direct differential method, as explained in the Methods section. Since the system is monotone (i.e. all feedback circuits are positive and the system has a monotone flow [25]) and hence the courses are similar for all three components, the results are shown for ppERK only. The phosphorylation rates $k_{1}^{+},k_{2}^{+},k_{3}^{+}$ and $k_{4}^{+}$ all have a positive effect, which is transient in the case of $k_{1}^{+}$, whereas for $k_{2}^{+},k_{3}^{+},k_{4}^{+}$ a transient phase is followed by a second increase towards $t=60$ min. This reflects the fact that the system shows a quasi-bistable behavior, meaning that the system is monostable but able to maintain a state different from zero for a very long time upon a transient signal and returns to its steady state at zero only at a later time point (for more details on this phenomenon see [15]). Similarly, all dephosphorylation rates have negative sensitivity coefficients, which become most influential at later time points, since they inversely regulate the duration of the sustained response. As expected, the rate constant $k_{Fp}$ and the threshold parameter $g$ in the non-linear feedback term have positive and negative influences, respectively, that increase over time.
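The idea behind the direct differential method can be sketched on a much smaller system than the MAPK model. The example below assumes that the method amounts to integrating the ODE together with its forward sensitivity equations, and it uses the hypothetical toy equation $\dot x=k_{1}-k_{2}x$ with illustrative parameter values and a hand-written fourth-order Runge-Kutta scheme.

```python
import math

def rhs(state, k1, k2):
    """Toy ODE dx/dt = k1 - k2*x, augmented with its sensitivity equations."""
    x, s1, s2 = state                 # s1 = dx/dk1, s2 = dx/dk2
    return (k1 - k2 * x,              # original ODE
            1.0 - k2 * s1,            # d/dt s1 = df/dk1 + (df/dx) * s1
            -x - k2 * s2)             # d/dt s2 = df/dk2 + (df/dx) * s2

def rk4(state, t_end, n_steps, k1, k2):
    """Classical Runge-Kutta integration of the augmented system."""
    h = t_end / n_steps
    for _ in range(n_steps):
        a = rhs(state, k1, k2)
        b = rhs(tuple(s + 0.5 * h * d for s, d in zip(state, a)), k1, k2)
        c = rhs(tuple(s + 0.5 * h * d for s, d in zip(state, b)), k1, k2)
        d = rhs(tuple(s + h * e for s, e in zip(state, c)), k1, k2)
        state = tuple(s + h / 6.0 * (p + 2 * q + 2 * r + w)
                      for s, p, q, r, w in zip(state, a, b, c, d))
    return state

k1, k2, x0, t_end = 1.5, 1.2, 0.5, 2.0     # hypothetical values
# The initial condition does not depend on k1 or k2, so sensitivities start at 0
x, dx_dk1, dx_dk2 = rk4((x0, 0.0, 0.0), t_end, 400, k1, k2)

# Normalized local sensitivity coefficients at t_end
s_k1 = k1 / x * dx_dk1
s_k2 = k2 / x * dx_dk2
print(f"x={x:.6f}  s_k1={s_k1:.6f}  s_k2={s_k2:.6f}")
```

For a model with $n$ states and $m$ parameters, the augmented system has $n(m+1)$ equations, which is why this approach yields the full time courses of all sensitivity coefficients in a single integration.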

Overall, the results of this local sensitivity analysis are plausible given the model structure. The results in particular show that the early transient behavior of the cascade is mainly determined by the phosphorylation rates. Moreover, the time point at which trajectories return to their steady state is very sensitive to changes in most parameters, $k_{1}^{+}$ and $K$ being the only exceptions.

Figure 6 shows the respective results for model version C with $t^{*}=5$ min. Indeed, the courses are equivalent to those of model versions A and B, but shifted by $-s(t^{*})$. This shift causes sign changes for all sensitivity coefficients. For example, in the first panel of the second row in Fig. 6, where $i=3$ and $j=5$, referring to $p_{j} = k_{1}^{-}$, $s^{\prime }_{ij}(t)>0$ for $t<t^{*}$ and $s^{\prime }_{ij}(t)<0$ for $t>t^{*}$. This means that the relative change in $y(t,p)$ when varying $p$ around $p^{*}$ is larger than the relative change in the reference value $y(t^{*},p)$ for $t<t^{*}$, and vice versa for $t>t^{*}$. This generally renders the interpretation of the sensitivity coefficients more difficult. In particular, it is not possible to extract the effect on the unnormalized concentrations from these courses without additionally taking the value $s_{ij}(t^{*})$ itself into account.

#### Case study II: Variance-based global sensitivity analysis

Figure 7 shows the results of Sobol’s sensitivity analysis for model variants A and B. We have chosen $P_{i} \sim U[\!0.1p_{i}^{*},10p_{i}^{*}]$ . First order indices are depicted in the first row, total effect indices are shown in the second row, for pRaf (left), ppMEK (center) and ppERK (right). It can be seen that first order indices decrease over time for all variables, while total effect indices increase at the same time, indicating that interaction effects between parameters gain importance over the duration of the signal response.

Looking at the first order indices, the number of influential factors increases from Raf to MEK, which is expected, since the signal mainly propagates in this direction. Consequently, pRaf is mainly influenced by the rate constants of its own dephosphorylation, followed by its phosphorylation rate $k_{1}^{+}$, which is more dominant at the beginning of the response, where it determines the speed of Raf activation. Raf activity is furthermore weakly dependent on the feedback strength $k_{Fp}$. In addition, the phosphorylation and dephosphorylation rates of MEK come into play in the first order Sobol sensitivities of MEK. Finally, the course of ppERK is influenced by a mixture of rate constants of Raf, MEK and ERK phosphorylation and dephosphorylation.

The values of the sums of total effect indices, which are shown in the second row, are much larger than the first order indices, indicating that interaction effects have an important impact on the overall variance of the model outputs. Compared to the first order effects, some new parameters appear, such as the parameter *K*, which had little effect as first order indices. This indicates that *K* influences the model output mainly via interactions with other parameters. While Raf is still mainly influenced by its dephosphorylation rate, the number of influential parameters increases from Raf to MEK, and ERK activity is regulated by a mixture of various total effect indices.

Since pERK is a hidden variable that is not observed, its phosphorylation and dephosphorylation rates $k_{3}^{+}$ and $k_{3}^{-}$ have only a marginal influence on model outputs. This is probably also due to the fact that pERK is an intermediate product between inactive non-phosphorylated and fully active, double phosphorylated ERK, which often has a buffering role and makes the overall system less sensitive to changes in e.g. total protein concentrations.

As outlined above, Sobol sensitivity indices are generally different for model versions B and C. The analysis results for model version C with reference time point $t^{*}=5$ min are shown in Fig. 8. As expected, the picture is completely different from Fig. 7, showing that normalization of the variables has a large impact on sensitivity analysis. Most strikingly, normalization seems to cause large interaction effects between almost all parameters in this case study. Furthermore, all three observables are now influenced by many more parameters than before.

It can further be seen that the first order indices rapidly decrease over time for all three observables, and are nearly zero at $t=60$ min after stimulation. The first order sensitivities for Raf and MEK are almost indistinguishable, which probably results from the fact that both components also have very similar time courses after normalization. Similar to model version A, all three components are highly dominated by the dephosphorylation rate of Raf. In contrast to model version A, the feedback parameters $k_{Fp}$, $g$, $M$ and $K$ become more prominent in the course of the first order indices, especially for Raf and MEK. This is true for the first order and the total effect indices, and presumably results from the fact that the feedback regulates to a certain extent the time and the height of the maxima of all components, and therefore causes variance in the experiment used for normalization. The first order indices are all rather small for ERK over the entire time course.

The sums of total order effects are much larger than in model variant A for all time points and all components, showing that normalization indeed increases interaction effects among parameters. Moreover, all three components show a well balanced mixture of total order effects of all model parameters, suggesting that all normalized components are highly interconnected. Interpretation of Sobol indices of model version C and their meaning for the biological system is generally difficult.

## Conclusions

We have demonstrated that normalized sensitivity coefficients and Sobol indices are invariant under simple rescaling of model variables and parameters. This is, however, different for a normalization to a reference experiment, whose value depends itself on model parameters. Such a normalization may change the results of both local and global sensitivity analysis completely. This has to be taken into account when working with relative data that are normalized to a reference experiment and models that are normalized in the same way to reproduce these relative data. Interpretation of the sensitivity coefficients or Sobol indices can be very difficult in this case. In particular, it is generally not possible to extract any information about respective changes of the unnormalized model trajectories. A sensitivity coefficient near zero, for example, just indicates that the relative changes of the reference experiment value and the respective considered model output are of the same order of magnitude. Thus, parameters that have a large impact on model outputs and appear to be important in an unnormalized model version, might have small sensitivity values in a normalized model version, and vice versa.
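
The statement about near-zero coefficients can be made explicit with a short calculation. As a sketch, assume a single positive model output $y(t,p)$ that is normalized to its own value at the reference time point $t^{*}$, written here as $\tilde{y}(t,p)=y(t,p)/y(t^{*},p)$, and assume that normalized sensitivity coefficients are understood as relative sensitivities of the form $(p_{j}/y)\,\partial y/\partial p_{j}$. The symbol $\tilde{y}$ and this particular form are introduced for illustration only. Taking the logarithmic derivative of the ratio gives

$$\frac{p_{j}}{\tilde{y}(t,p)}\frac{\partial \tilde{y}(t,p)}{\partial p_{j}}=\frac{p_{j}}{y(t,p)}\frac{\partial y(t,p)}{\partial p_{j}}-\frac{p_{j}}{y(t^{*},p)}\frac{\partial y(t^{*},p)}{\partial p_{j}}.$$

Under these assumptions the coefficient of the normalized output is the difference of two relative sensitivities. It vanishes whenever the output and the reference value respond to $p_{j}$ with the same relative change, no matter how large each individual response is.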

Dealing with relative data and corresponding normalized models generally poses a challenge from the modeling point of view. As shown in this work, it renders the interpretation of sensitivity analysis and its meaning for the biological system a difficult task. Furthermore, related to this issue, it complicates model inference and in particular parameter estimation. Parameter estimation is often formulated as an optimization problem with an objective function that comprises a comparison of the relative experimental data with the respective model predictions. Evaluation of this objective function requires two simulations for each experimental value: one of the reference experiment, whose value is used for normalization, and one of the actual experiment. Hence, although the objective function is independent of any scaling factors, factors of proportionality *α* have to be chosen for the individual simulations, which often causes numerical problems when they are not chosen properly. We also encountered such numerical instabilities in our sensitivity analyses, which requires some care in the choice of these factors.

In conclusion, the calibration and analysis of normalized models is challenging, and proper normalization methods and their impact on the analysis results remain an issue for further studies.

## Methods

### Local sensitivity analysis and the direct differential method

Local sensitivity analysis investigates the influence of a parameter *p* on a model output $y_{i}(t,p)$ around a reference parameter set $p^{*}$. Sensitivity coefficients are formally defined as first-order partial derivatives of the model output $y_{i}(t,p)$ with respect to the parameter $p_{j}$,

$$S_{ij}(t)=\left.\frac{\partial y_{i}(t,p)}{\partial p_{j}}\right|_{p=p^{*}}.$$

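If not analytically available, $S_{ij}(t)$ can in the simplest case be approximated via finite differences, i.e.

$$S_{ij}(t)\approx \frac{y_{i}(t,p^{*}+\Delta p_{j})-y_{i}(t,p^{*})}{\Delta p_{j}},$$

with $\Delta p_{j}$ being a vector with zero entries except for component $p_{j}$ (in the denominator, $\Delta p_{j}$ denotes the size of this nonzero entry). This approximation works quite well in many settings, but robustness should be tested by varying the step size $\Delta p_{j}$. The direct differential method instead solves a differential equation for the sensitivity coefficients. To this end, we consider the time evolution of $S_{ij}(t)$ for a model of the form $\dot{y}=f(y,p)$,

$$\frac{d}{dt}S_{ij}(t)=\frac{d}{dt}\frac{\partial y_{i}(t,p)}{\partial p_{j}}=\frac{\partial}{\partial p_{j}}\frac{dy_{i}(t,p)}{dt}=\frac{\partial}{\partial p_{j}}f_{i}(y(t,p),p)=\sum_{k}\frac{\partial f_{i}}{\partial y_{k}}\frac{\partial y_{k}}{\partial p_{j}}+\frac{\partial f_{i}}{\partial p_{j}}.$$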

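This can be summarized as

$$\dot{S}_{ij}(t)=\sum_{k}\frac{\partial f_{i}}{\partial y_{k}}\,S_{kj}(t)+\frac{\partial f_{i}}{\partial p_{j}}$$

or, in a more compact form,

$$\dot{S}_{j}(t)=J_{f}\,S_{j}(t)+\frac{\partial f}{\partial p_{j}},$$

where $S_{j}$ collects the coefficients $S_{ij}$ of all outputs with respect to $p_{j}$. This differential equation system for $S_{j}$ can be solved numerically. It involves the Jacobian matrix $J_{f}=\partial f/\partial y$ of the system, which has to be defined and implemented. The direct differential method does not rely on the choice of an appropriate $\Delta p$, but can be very time consuming, especially for larger systems.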
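
Both approaches can be compared on a problem with a known answer. The following minimal sketch uses a hypothetical one-parameter toy model, $\dot{y}=-p\,y$ with $y(0)=1$, which is not the MAPK model of this study. Its exact sensitivity is $\partial y/\partial p=-t\,e^{-pt}$. All function names, the fixed-step Runge-Kutta integrator and the step sizes are assumptions made for illustration.

```python
import math

def rk4(rhs, state, t_end, n_steps):
    """Integrate d(state)/dt = rhs(state) from t = 0 to t_end with classical RK4."""
    h = t_end / n_steps
    for _ in range(n_steps):
        k1 = rhs(state)
        k2 = rhs([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = rhs([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = rhs([s + h * k for s, k in zip(state, k3)])
        state = [s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

def simulate(p, t_end=2.0, n_steps=200):
    """Hypothetical toy model dy/dt = -p*y with y(0) = 1; returns y(t_end)."""
    return rk4(lambda s: [-p * s[0]], [1.0], t_end, n_steps)[0]

def sensitivity_fd(p, dp, t_end=2.0):
    """Forward finite-difference approximation of dy/dp at time t_end."""
    return (simulate(p + dp, t_end) - simulate(p, t_end)) / dp

def sensitivity_direct(p, t_end=2.0, n_steps=200):
    """Direct differential method: integrate y and S = dy/dp together.
    Here df/dy = -p and df/dp = -y, so dS/dt = -p*S - y with S(0) = 0."""
    rhs = lambda s: [-p * s[0], -p * s[1] - s[0]]
    return rk4(rhs, [1.0, 0.0], t_end, n_steps)[1]

p_ref, t_end = 0.5, 2.0
exact = -t_end * math.exp(-p_ref * t_end)  # analytic dy/dp for the toy model
direct = sensitivity_direct(p_ref, t_end)
fd_by_step = {dp: sensitivity_fd(p_ref, dp, t_end) for dp in (1e-1, 1e-3, 1e-5)}

print("exact:", exact, "direct method:", direct)
for dp, s in fd_by_step.items():
    print("step size", dp, "finite difference:", s)
```

In this sketch the finite-difference estimate changes visibly with the step size, whereas the direct method needs no step size but doubles the number of differential equations for a single parameter.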

### Variance-based sensitivity analysis

The main idea of variance-based sensitivity analysis methods is to decompose the variance of a model outcome according to the input factors. Variance-based methods are global methods, since they exploit the impact of parameters within a whole range of values. Moreover, in contrast to local sensitivity factors, they allow for the investigation of interaction effects between groups of parameters. Sobol indices are sensitivity measures that are based on average partial variances.

To simplify notation, we follow the formulation in [16, 17], where *Y* denotes a particular scalar model output (i.e. an output variable at a particular time point) and *X* is the set of model parameters. Similar to a Bayesian framework, the components of $X\in {\mathbb R}^{n}$ are random variables, and *Y* is considered to be a function of these input parameters,

$$Y=f(X)=f(X_{1},\ldots,X_{n}),$$

with expectation value $\mathbb {E}(Y)$ and variance Var(*Y*). Importantly, all $X_{i}$ are assumed to be independent for the following procedure, and hence can be drawn independently from their marginal distributions.

In order to investigate the contribution of each factor $X_{i}$ to the total variance Var(*Y*), we average over the conditional variances, i.e. the variances of *Y* that result when the factor $X_{i}$ is fixed to a value $x_{i}^{*}$,

$$\mathbb{E}_{X_{i}}\left(\operatorname{Var}_{\sim X_{i}}(Y|X_{i}=x_{i}^{*})\right).$$

Here, the expectation is taken with respect to the values that $X_{i}$ can take, and $\operatorname {Var}_{\sim X_{i}}(Y|X_{i}=x_{i}^{*})$ is the conditional variance of *Y* after fixing $X_{i}$ to a value $x_{i}^{*}$. According to the law of total variance, Var(*Y*) can be decomposed as

$$\operatorname{Var}(Y)=\mathbb{E}_{X_{i}}\left(\operatorname{Var}_{\sim X_{i}}(Y|X_{i})\right)+\operatorname{Var}_{X_{i}}\left(\mathbb{E}_{\sim X_{i}}(Y|X_{i})\right). \tag{41}$$

Figure 9 illustrates this decomposition. The two figures on top correspond to the density plots on the right hand sides of Figs. 2 and 3. In Fig. 2 the densities $f_{Y^{A}|k_{1}^{*}}$ and $f_{Y^{A}|k_{2}^{*}}$ are analytically accessible, as described in the text, while in Fig. 3 these quantities are analyzed via Monte Carlo sampling. The summary statistics from these conditional densities that are exploited in the Sobol analysis are presented in the figures on the bottom in Fig. 9. Here, the terms $\operatorname {Var}_{\sim X_{i}}(Y|X_{i}=x_{i}^{*})$ and $\mathbb {E}_{\sim X_{i}}(Y|X_{i}=x_{i}^{*})$ are the variance and expectation value of *Y* within a slice $X_{i}=x_{i}^{*}$, and Var(*Y*) is given as the sum of the expectation of the variances and the variance of the expectations over all slices. This figure shows that if *Y* and $X_{i}$ are highly correlated, then the first term in Eq. (41) is small, and the second term is large. Hence it is reasonable to define the first-order sensitivity index of $X_{i}$ on *Y* as

$$S_{i}=\frac{\operatorname{Var}_{X_{i}}\left(\mathbb{E}_{\sim X_{i}}(Y|X_{i})\right)}{\operatorname{Var}(Y)}. \tag{42}$$

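The idea of conditional variance can be extended by conditioning on two or more factors, i.e.

$$\frac{\operatorname{Var}_{ij}^{\,c}}{\operatorname{Var}(Y)}=\frac{\operatorname{Var}_{X_{i}X_{j}}\left(\mathbb{E}_{\sim X_{i}X_{j}}(Y|X_{i},X_{j})\right)}{\operatorname{Var}(Y)}=S_{i}+S_{j}+S_{ij},$$

where $\operatorname {Var}_{ij}^{\,c}$ measures the joint effect of $X_{i}$ and $X_{j}$ on the output *Y*. $S_{ij}$ is called the second-order index. Higher-order indices can be derived accordingly.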
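
Returning to the first-order index, the decomposition of Var(*Y*) into an expectation of variances and a variance of expectations over slices can be checked numerically by brute force. The sketch below assumes a hypothetical toy model $Y=4X_{1}+X_{2}$ with independent inputs drawn from *U*(0,1), which is not part of this study. For this model the exact first-order indices are 16/17 and 1/17. The function name `slice_decomposition`, the number of slices and the sample size are illustrative choices.

```python
import random

def slice_decomposition(xs, ys, n_slices=50):
    """Sort samples into equal-width slices of X_i in [0, 1) and return
    (weighted mean of within-slice variances, weighted variance of slice means)."""
    slices = [[] for _ in range(n_slices)]
    for x, y in zip(xs, ys):
        slices[min(int(x * n_slices), n_slices - 1)].append(y)
    n = len(ys)
    mean_y = sum(ys) / n
    exp_var, var_exp = 0.0, 0.0
    for s in slices:
        if not s:
            continue  # skip empty slices
        weight = len(s) / n
        m = sum(s) / len(s)
        exp_var += weight * sum((y - m) ** 2 for y in s) / len(s)
        var_exp += weight * (m - mean_y) ** 2
    return exp_var, var_exp

rng = random.Random(0)
N = 100_000
x1 = [rng.random() for _ in range(N)]
x2 = [rng.random() for _ in range(N)]
y = [4.0 * a + b for a, b in zip(x1, x2)]  # hypothetical toy model

mean_y = sum(y) / N
var_y = sum((v - mean_y) ** 2 for v in y) / N
ev1, ve1 = slice_decomposition(x1, y)  # slices of X_1
ev2, ve2 = slice_decomposition(x2, y)  # slices of X_2
S1, S2 = ve1 / var_y, ve2 / var_y

print("S1 estimate:", S1, "S2 estimate:", S2)
print("residual of the decomposition for X_1:", ev1 + ve1 - var_y)
```

For the input that is strongly correlated with the output, the within-slice variances are small and the variance of the slice means is large, as described for Fig. 9. This brute-force approach needs many samples per slice and is meant only as an illustration of the definition.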

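In fact, it was shown that the total variance can be decomposed into effects of different orders ([16] and references therein),

$$\sum_{i}S_{i}+\sum_{i}\sum_{j>i}S_{ij}+\ldots+S_{12\ldots n}=1.$$

This motivates the definition of the total effect term for each parameter as

$$S_{T_{i}}=S_{i}+\sum_{j\neq i}S_{ij}+\ldots+S_{12\ldots n},$$

which contains all terms of any order that include $X_{i}$.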

The Russian mathematician I.M. Sobol proposed a straightforward Monte Carlo-based estimation procedure for the first and total order sensitivity indices for the special case that the $X_{i}$ are sampled from the standard uniform distribution *U*(0,1).

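Assuming *f* is square-integrable, we consider an expansion of *f* into a sum of terms of increasing dimension,

$$f(X)=f_{0}+\sum_{i}f_{i}(X_{i})+\sum_{i}\sum_{j>i}f_{ij}(X_{i},X_{j})+\ldots+f_{12\ldots n}(X_{1},\ldots,X_{n}), \tag{46}$$

which is unique if each term has zero mean, since all terms are orthogonal in pairs in this case,

$$\int f_{\Omega_{i}}(X_{\Omega_{i}})\,f_{\Omega_{j}}(X_{\Omega_{j}})\,dX=0\quad\text{for }\Omega_{i}\neq\Omega_{j},$$

with $\Omega_{i}$ and $\Omega_{j}$ denoting subsets of the index set {1,…,*n*}.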

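In particular, we can identify

$$f_{0}=\mathbb{E}(Y),\qquad f_{i}(X_{i})=\mathbb{E}_{\sim X_{i}}(Y|X_{i})-\mathbb{E}(Y),$$

and, comparing with Eq. (42), we see that

$$\operatorname{Var}_{i}:=\operatorname{Var}(f_{i}(X_{i}))=\operatorname{Var}_{X_{i}}\left(\mathbb{E}_{\sim X_{i}}(Y|X_{i})\right)=S_{i}\operatorname{Var}(Y).$$

Moreover,

$$f_{ij}(X_{i},X_{j})=\mathbb{E}_{\sim X_{i}X_{j}}(Y|X_{i},X_{j})-f_{i}(X_{i})-f_{j}(X_{j})-\mathbb{E}(Y),\qquad \operatorname{Var}_{ij}:=\operatorname{Var}(f_{ij}(X_{i},X_{j})).$$

$\operatorname{Var}_{ij}$ is called the second-order effect. Using the expansion in Eq. (46) to calculate the variance Var(*Y*), and exploiting that the means of the individual summands vanish and that the terms are orthogonal, we get the following decomposition

$$\operatorname{Var}(Y)=\sum_{i}\operatorname{Var}_{i}+\sum_{i}\sum_{j>i}\operatorname{Var}_{ij}+\ldots+\operatorname{Var}_{12\ldots n},$$

which, upon division by Var(*Y*), leads to

$$\sum_{i}S_{i}+\sum_{i}\sum_{j>i}S_{ij}+\ldots+S_{12\ldots n}=1.$$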
This decomposition motivates the definition of the total effect index $S_{T_{i}}$ of a component $X_{i}$ as the total effect of $X_{i}$ on *Y*, which is the sum of all sensitivity indices containing $X_{i}$. $S_{T_{i}}=0$ implies that $X_{i}$ does not influence *Y* at all, and hence $X_{i}$ could for instance be set to a fixed value for further analysis.
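
A worked example may help to fix ideas. As an illustration that is not taken from the model of this study, assume $Y=X_{1}X_{2}$ with independent $X_{1},X_{2}\sim U(0,1)$. The terms of the expansion and the resulting indices are then

$$\begin{aligned}
f_{0}&=\tfrac{1}{4},\qquad f_{i}(X_{i})=\tfrac{1}{2}\left(X_{i}-\tfrac{1}{2}\right),\qquad f_{12}(X_{1},X_{2})=\left(X_{1}-\tfrac{1}{2}\right)\left(X_{2}-\tfrac{1}{2}\right),\\
\operatorname{Var}(f_{i})&=\tfrac{1}{4}\cdot\tfrac{1}{12}=\tfrac{3}{144},\qquad \operatorname{Var}(f_{12})=\tfrac{1}{12}\cdot\tfrac{1}{12}=\tfrac{1}{144},\qquad \operatorname{Var}(Y)=\tfrac{3+3+1}{144}=\tfrac{7}{144},\\
S_{1}&=S_{2}=\tfrac{3}{7},\qquad S_{12}=\tfrac{1}{7},\qquad S_{T_{1}}=S_{T_{2}}=\tfrac{3}{7}+\tfrac{1}{7}=\tfrac{4}{7}.
\end{aligned}$$

The first-order and second-order indices add up to one, whereas the total effect indices add up to 8/7, because the interaction term is counted once for each parameter involved. A sum of total effect indices that clearly exceeds one therefore points to interaction effects.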

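Notably, $S_{T_{i}}$ can be calculated as efficiently as the first order indices. To this end, we again exploit the law of total variance in the following way

$$\operatorname{Var}(Y)=\mathbb{E}_{\sim X_{i}}\left(\operatorname{Var}_{X_{i}}(Y|X_{\sim i})\right)+\operatorname{Var}_{\sim X_{i}}\left(\mathbb{E}_{X_{i}}(Y|X_{\sim i})\right),$$

which leads via division by Var(*Y*) to

$$S_{T_{i}}=\frac{\mathbb{E}_{\sim X_{i}}\left(\operatorname{Var}_{X_{i}}(Y|X_{\sim i})\right)}{\operatorname{Var}(Y)}=1-\frac{\operatorname{Var}_{\sim X_{i}}\left(\mathbb{E}_{X_{i}}(Y|X_{\sim i})\right)}{\operatorname{Var}(Y)},$$

since $\mathbb {E}_{\sim X_{i}}(\operatorname {Var}_{X_{i}}(Y|X_{\sim i}))$ is the average variance that remains after fixing all variables but $X_{i}$.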

The numerical procedure that is used to estimate $S_{i}$ and $S_{T_{i}}$ is described in [16] and in [26] (a more recent study with a focus on implementation is [27]) and uses Monte Carlo integration to evaluate the integrals.
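
A minimal sketch of such a Monte Carlo procedure is given below. It uses two independent sample matrices `A` and `B` and the mixed matrices in which column *i* of `A` is replaced by column *i* of `B`, together with the first-order estimator of Saltelli et al. and the total-effect estimator of Jansen as summarized in [26]. Whether exactly these estimators were used for the results of this study is not stated here, so this choice is an assumption. The function name `sobol_indices`, the plain pseudo-random sampling (instead of quasi-random sequences) and the toy model $Y=X_{1}X_{2}$ with inputs from *U*(0,1) are likewise illustrative. For this toy model the exact values are $S_{i}=3/7$ and $S_{T_{i}}=4/7$.

```python
import random

def sobol_indices(f, n_params, n_samples, seed=0):
    """Estimate first-order and total effect indices for independent U(0,1) inputs.
    f takes a list of n_params floats and returns a scalar."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    fA = [f(row) for row in A]
    fB = [f(row) for row in B]

    # Total variance estimated from all 2*N base evaluations
    all_y = fA + fB
    mean = sum(all_y) / len(all_y)
    var = sum((v - mean) ** 2 for v in all_y) / len(all_y)

    S, ST = [], []
    for i in range(n_params):
        # Rows of A with the entry of parameter i taken from B
        fAB = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Var_{X_i}(E(Y|X_i)) ~ mean of f(B) * (f(AB_i) - f(A))
        v_i = sum(yb * (yab - ya) for ya, yb, yab in zip(fA, fB, fAB)) / n_samples
        # E_{~X_i}(Var_{X_i}(Y|X_~i)) ~ mean of (f(A) - f(AB_i))^2 / 2
        e_i = sum((ya - yab) ** 2 for ya, yab in zip(fA, fAB)) / (2 * n_samples)
        S.append(v_i / var)
        ST.append(e_i / var)
    return S, ST

def product_model(x):
    """Hypothetical toy model with a pure interaction term."""
    return x[0] * x[1]

S, ST = sobol_indices(product_model, n_params=2, n_samples=50_000)
print("first-order estimates:", S)
print("total effect estimates:", ST)
```

The cost is $N(n+2)$ model evaluations for *n* parameters and *N* samples, and the same evaluations serve both sets of indices, which is why the total effect indices come at no extra cost compared to the first order indices.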

## References

- 1
Segel LA. Modeling Dynamic Phenomena in Molecular and Cellular Biology. Cambridge: Cambridge University Press; 1984.

- 2
Möller Y, Siegemund M, Beyes S, Herr R, Lecis D, Delia D, Kontermann R, Brummer T, Pfizenmaier K, Olayioye M. EGFR-targeted TRAIL and a Smac mimetic synergize to overcome apoptosis resistance in KRAS mutant colorectal cancer cells. PLoS ONE. 2014; 9(9):e107165.

- 3
Zinöcker S, Vaage J. Rat mesenchymal stromal cells inhibit T cell proliferation but not cytokine production through inducible nitric oxide synthase. Front Immunol. 2012; 3(62):1–13.

- 4
Taylor SC, Posch A. The design of a quantitative Western blot experiment. BioMed Res Int. 2014; 2014:8. Article ID 361590.

- 5
Degasperi A, Birtwistle MR, Volinsky N, Rauch J, Kolch W, Kholodenko BN. Evaluating strategies to normalize biological replicates of Western Blot data. PLoS ONE. 2014; 9(1):e87293.

- 6
Kreutz C, Rodriguez MMB, Maiwald T, Seidl M, Blum HE, Mohr L, Timmer J. An error model for protein quantification. Bioinformatics. 2007; 23(20):2747–53.

- 7
Thomaseth C, Radde N. Normalization of Western blot data affects the statistics of estimators. 2016. Submitted to FOSBE.

- 8
Saltelli A, Tarantola S, Campolongo F, Ratto M. Sensitivity Analysis in Practice: a Guide to Assessing Scientific Models. Chichester: John Wiley & Sons; 2004.

- 9
Frank PM. Introduction to System Sensitivity Theory. New York: Academic Press Inc; 1978.

- 10
Zi Z. Sensitivity analysis approaches applied to systems biology models. IET Syst Biol. 2011; 5(6):336–46.

- 11
Kim KA, Spencer SL, Albeck JG, Burke JM, Sorger PK, Gaudet S, Kim DH. Systematic calibration of a cell signaling network model. BMC Bioinf. 2010; 11(202):1–14.

- 12
Kiparissides A, Kucherenko SS, Mantalaris A, Pistikopoulos EN. Global sensitivity analysis challenges in biological systems modeling. Ind Eng Chem Res. 2009; 48(15):7168–80.

- 13
Kent E, Neumann S, Kummer U, Mendes P. What can we learn from global sensitivity analysis of biochemical systems? PLoS ONE. 2013; 8(11):e79244.

- 14
Santos SDM, Verveer PJ, Bastiaens PIH. Growth factor-induced MAPK network topology shapes Erk response determining PC-12 cell fate. Nat Cell Biol. 2007; 9(3):324–30.

- 15
Jensch A, Thomaseth C, Radde N. Sampling-based Bayesian approaches reveal the importance of quasi-bistable behavior in cellular decision making processes on the example of the MAPK signaling pathway in PC-12 cell lines. 2016. BMC Systems Biology (under review).

- 16
Saltelli A, Ratto M, Andres T, Campolongo F, Cariboni J, Gatelli D, Saisana M, Tarantola S. Global Sensitivity Analysis: The Primer. Hoboken, NJ: John Wiley & Sons; 2008.

- 17
Sobol IM. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math Comput Simul. 2001; 55:271–80.

- 18
Zheng Y, Rundell A. Comparative study of parameter sensitivity analyses of the TCR-activated ERK-MAPK signalling pathway. IEE Proc Syst Biol. 2006; 153(4):201–11.

- 19
Zhang XY, Trame MN, Lesko LJ, Schmidt S. Sobol sensitivity analysis: a tool to guide the development and evaluation of systems pharmacology models. CPT Pharmacometrics Syst Pharmacol. 2015; 4:69–79.

- 20
Hinkley DV. On the ratio of two correlated normal random variables. Biometrika. 1969; 56(3):635–9.

- 21
Marsaglia G. Ratios of normal variables. J Stat Softw. 2006; 16(4):1–10.

- 22
Kholodenko B. Negative feedback and ultrasensitivity can bring about oscillations in the mitogen-activated protein kinase cascade. Eur J Biochem. 2000; 267(6):1583–88.

- 23
Kolch W. Coordinating ERK/MAPK signalling through scaffolds and inhibitors. Nat Rev Mol Cell Biol. 2005; 6:827–37.

- 24
Kolch W, Calder M, Gilbert D. When kinases meet mathematics: the systems biology of MAPK signalling. FEBS Lett. 2005; 579(8):1891–95.

- 25
Gouzé JL. Positive and negative circuits in dynamical systems. J Biol Syst. 1998; 6(11):11–15.

- 26
Saltelli A, Annoni P, Azzini I, Campolongo F, Ratto M, Tarantola S. Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Comp Phys Comm. 2010; 181:259–70.

- 27
Bilal N. Implementation of Sobol’s method of global sensitivity analysis to a compressor simulation model. In: 22nd Int. Compressor Eng. Conf. Purdue, USA: Purdue e-Pubs; 2014.

## Acknowledgements

This work was supported by the German Federal Ministry of Education and Research (BMBF) within the e:Bio-Innovationswettbewerb Systembiologie project PREDICT (grant number FKZ0316186A) and the German Research Foundation (DFG) within the Cluster of Excellence in Simulation Technology (EXC 310/1) at the University of Stuttgart.

## Additional information

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

JK, CT and NR designed the study. CT, AJ and NR developed and provided the modeling framework. JK implemented the sensitivity analysis methods and created the results with the help of all other authors. All authors wrote and approved the manuscript.

## Rights and permissions

licensee Springer on behalf of EPJ. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Cite this article

Kirch, J., Thomaseth, C., Jensch, A. *et al.* The effect of model rescaling and normalization on sensitivity analysis on an example of a MAPK pathway model. *EPJ Nonlinear Biomed Phys* **4**, 3 (2016). doi:10.1140/epjnbp/s40366-016-0030-z

### Keywords

- Global sensitivity analysis
- Sobol indices
- Model rescaling
- MAPK signaling pathway