Comment on the Leading Edge Primer “Physical and Mathematical Modeling in Experimental Papers” by Wolfram Möbius and Liedewij Laan in Cell from December 17, 2015.

DOI: http://dx.doi.org/10.1016/j.cell.2015.12.006

# Why should we model?

All living entities have to obey nature’s rules encoded in the laws of physics and chemistry. The language of those laws is mathematics. The ancient Greek philosopher Pythagoras of Samos already postulated that mathematics is an integral and universal part of all matter. As such, mathematics can help us **to describe a biological system** mechanistically and to cope with its complexity. Recent technical advances in the modern biosciences allow us to obtain **quantitative data** even on the -omics (e.g., genome- or proteome-wide) scale. Mathematical modeling enables systematic analysis of such data sets to harness valuable information.

A model might be needed (…) to interpret data or to confirm or reject a hypothesis via generating predictions.

**Model-based hypothesis testing can facilitate answering biological questions.** Mathematical modeling should not be applied *per se* but only to solve problems. As Michael P. Brenner puts it: “If the model does not tell you something new, it needs to go.” The purposes of a mathematical model could be to:

- Test hypotheses on biological mechanisms represented by the model structure
- Specify unknown parameters such as initial concentrations or kinetic rates
- Explore different spatial and temporal scales of biological systems

# What is a model?

A model is typically represented by a **set of differential equations**. Numerical solutions to these equations, given the model parameters, are referred to as simulations and serve to predict certain variables. While humans were reported not to be fat and fit at the same time (http://ti.me/NMUsGr), a huge mathematical model can still fit the experimental data very well.
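
As a minimal sketch of what such a simulation looks like in practice, consider a one-variable model of exponential decay, dx/dt = -k·x. The rate constant, initial concentration, and time window below are hypothetical values chosen purely for illustration:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical one-variable model: exponential decay dx/dt = -k * x
def decay(t, x, k):
    return -k * x

k = 0.5                       # assumed rate constant (1/h)
x0 = [10.0]                   # assumed initial concentration (a.u.)
t_span = (0.0, 10.0)          # assumed time window (h)
t_eval = np.linspace(*t_span, 50)

# Numerically integrate the ODE; this is the "simulation"
sol = solve_ivp(decay, t_span, x0, args=(k,), t_eval=t_eval)

# For this simple model the analytical solution x(t) = x0 * exp(-k t)
# is known, so we can check the numerical result against it
analytical = x0[0] * np.exp(-k * t_eval)
print(np.allclose(sol.y[0], analytical, rtol=1e-2))
```

For realistic biological models no closed-form solution exists, which is exactly why numerical simulation is needed; the same `solve_ivp` call scales to systems of coupled equations.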

A model needs to be as simple as possible and as complex as necessary.

The problem with huge models is that they become less informative with increasing size (i.e., number of parameters), which led to the famous quote by John von Neumann: “With four parameters, I can fit an elephant (…)”, meaning that models with many parameters can fit almost anything and are therefore arbitrary.

**The number of model parameters, and thus the complexity of the model, depends on both the experimental data and the scope of the model.** I do not agree with the authors that the complexity of the model (i.e., the number of variables) should not exceed the complexity of the data (i.e., the number of observables). We have seen that mathematical modeling can reveal information that is not accessible experimentally. In our case, we learned a great deal about the dynamics of the transcription factor STAT5 in the nucleus of BaF3-EpoR cells even though we could only measure cytoplasmic lysates (Boehm, Adlung *et al*., **J. Proteome Res.**, 2014, 13 (12)).

The number of parameters should nevertheless be reasonable; otherwise the model is not informative. There are statistical measures to identify the most informative model structure. The idea is to find a “null model” with a minimal number of *n* parameters that describes the experimental data as well as an alternative model with *n+1* parameters does. If the extended model describes the data significantly better, the null model is rejected.
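
One common way to make this comparison concrete is a likelihood-ratio test between nested models. The sketch below (with entirely hypothetical data: a linear trend plus Gaussian noise of known error) compares a one-parameter null model (a constant) against a two-parameter alternative (a line):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)

# Hypothetical data set: a linear trend plus Gaussian measurement noise
t = np.linspace(0, 10, 30)
y = 2.0 + 0.8 * t + rng.normal(0, 1.0, t.size)
sigma = 1.0  # assumed known measurement error

# Null model (n = 1 parameter): a constant; its least-squares fit is the mean
chi2_null = np.sum((y - y.mean()) ** 2) / sigma**2

# Alternative model (n + 1 = 2 parameters): a straight line
slope, intercept = np.polyfit(t, y, 1)
chi2_alt = np.sum((y - (intercept + slope * t)) ** 2) / sigma**2

# Likelihood-ratio statistic; one extra parameter -> one degree of freedom
lr = chi2_null - chi2_alt
p = chi2.sf(lr, df=1)
print(f"LR = {lr:.1f}, p = {p:.3g}")  # a small p-value rejects the null model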

Like experiments, models are often created in an iterative process (…).

# What do we need?

George E. P. Box taught us that “essentially, all models are wrong, but some are useful.” In my eyes, the same holds true for experiments. **For reliable analysis of experimental data and model simulations alike, we need proper controls.** Modeling requires positive and negative controls, too. Certainly, a model works if it describes the experimental data and is cross-validated by additional experiments that verify predictions of the thus calibrated model. The question is to what extent the model predictions are reliable. Uncertainty analysis is very technical, but there is simple advice by David R. Nelson: “Do not stop when you see agreement between data and model.” Instead, play around with your model and test it systematically to see how trustworthy your prior assumptions are. Of course, all these procedures demand **a diverse range of skills**:

- understanding of the biological system
- expertise in experimental methods
- ability to conceptualize and scrutinize an idea
- tools for mathematical analysis
- programming experience
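
Nelson’s advice to not stop at agreement can be acted on quite directly: resample the data and refit the model to see how much the parameter estimates wobble. The following bootstrap sketch does this for the decay-rate parameter of a simple exponential model; all numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical decay measurements with additive noise
t = np.linspace(0.1, 5, 25)
k_true = 0.5
y = 10.0 * np.exp(-k_true * t) + rng.normal(0, 0.3, t.size)

def fit_k(t, y):
    """Estimate the decay rate from a log-linear least-squares fit."""
    mask = y > 0  # guard against non-positive values before taking the log
    slope, _ = np.polyfit(t[mask], np.log(y[mask]), 1)
    return -slope

# Bootstrap: refit on resampled data points to gauge parameter uncertainty
estimates = [
    fit_k(t[idx], y[idx])
    for idx in (rng.integers(0, t.size, t.size) for _ in range(500))
]
lo, hi = np.percentile(estimates, [2.5, 97.5])
print(f"k_hat = {fit_k(t, y):.2f}, 95% bootstrap interval [{lo:.2f}, {hi:.2f}]")
```

A wide interval, or one that shifts when individual data points are left out, is a warning sign that agreement between data and model is not the end of the story.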

This can only be achieved by broadly trained people with a background in both experimental and theoretical disciplines.

Education programs need to be adapted accordingly to provide the basis for a **multidisciplinary curriculum**. Likewise, collaborations between experts in the individual fields can yield benefits. A precondition is the ability to communicate results to non-experts. The limiting step then becomes our curiosity to discover something new. As Steve Jobs said: “Stay foolish.”