Models and Truth: Prediction, Inference, and Narrative [Systems thinking & modelling series]

Scott Fortmann-Roe and Gene Bellinger, 14 Aug 2017

This is part 35 of a series of articles featuring the book Beyond Connecting the Dots, Modeling for Meaningful Results.

The three distinctions just presented (deterministic vs. stochastic, mechanistic vs. statistical, aggregated vs. disaggregated) can be used to classify models. We can even use them to classify the models we have discussed in this interactive learning environment (ILE). Most of our models would be classified as deterministic (random chance is generally not explicitly incorporated in these models), mechanistic (we generally assume mechanisms rather than estimating dependencies from data), and highly aggregated (the agent-based models are an exception).
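To make the first of these distinctions concrete, here is a minimal sketch contrasting a deterministic growth model with a stochastic one. The function names, growth rate, and noise level are hypothetical choices made for illustration; they are not taken from the models in the ILE.

```python
import random

def deterministic_growth(population, rate, steps):
    """Deterministic: the same inputs always give the same trajectory."""
    trajectory = [population]
    for _ in range(steps):
        population = population * (1 + rate)
        trajectory.append(population)
    return trajectory

def stochastic_growth(population, rate, noise, steps, rng):
    """Stochastic: the growth rate is perturbed by random noise each step."""
    trajectory = [population]
    for _ in range(steps):
        population = population * (1 + rate + rng.gauss(0, noise))
        trajectory.append(population)
    return trajectory

# Two deterministic runs with identical inputs
det_a = deterministic_growth(100.0, 0.05, 10)
det_b = deterministic_growth(100.0, 0.05, 10)

# Two stochastic runs that differ only in their random draws
sto_a = stochastic_growth(100.0, 0.05, 0.02, 10, random.Random(1))
sto_b = stochastic_growth(100.0, 0.05, 0.02, 10, random.Random(2))

print("Deterministic runs identical:", det_a == det_b)
print("Stochastic runs identical:", sto_a == sto_b)
```

Running the deterministic model twice gives the same trajectory, while the stochastic runs differ whenever their random draws differ. Neither behavior makes a model better or worse in itself.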

There are many nuances to these broad distinctions (e.g., the type of statistical techniques used for a statistical model). Many other distinctions can be made between model implementations, such as the programming language or software used to implement the model. These distinctions and technical choices are important when constructing a model. What is of key importance, however, is the utility of the model for fulfilling a specific goal.

Technical details matter (they can affect maintainability and other factors), but they are secondary to the adequacy of a model in fulfilling its main purpose. It would make as little sense to say a model was fundamentally bad because it was written in a relatively ancient programming language such as Fortran as it would to say a model was fundamentally bad because it was, for instance, deterministic. Let’s look back at Box’s quote at the beginning of this chapter. We know all models are wrong; what we should really care about is their utility in meeting a specific task.

So rather than using the aforementioned technical classifications to discuss models, we think it is more useful to base our discussions on a model’s driving purpose. This allows us to leave behind relatively mundane technical and implementation details and focus on what we really care about. Although there are many different reasons for building models, they all boil down to three broad purposes, displayed in Figure 1: prediction, inference, and narrative.

Prediction: Models used for prediction are the most straightforward. They attempt to forecast an outcome given information about variables that may influence that outcome. A weather forecast is an example of a model used for prediction. Likewise, when you apply for a credit card, the bank runs a predictive model to estimate the risk that you will default. Life insurance companies use a model that predicts how long an applicant is expected to live, and the results determine the premium charged. All these models take in data (the current temperature for the weather forecast, the amount of money in your bank account for your risk of default, your age for the life insurance application) and apply various forms of analysis to generate a prediction of the outcome.
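The "data in, prediction out" pattern can be sketched in a few lines. The example below fits a straight line by ordinary least squares to a made-up temperature history and then forecasts the next day's temperature. The data, the function names `fit_line` and `predict`, and the choice of a linear model are all assumptions made for illustration; real forecasting models are far more elaborate.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Apply the fitted line to a new input."""
    return intercept + slope * x

# Made-up history: today's temperature and the following day's (deg C)
today = [12.0, 15.0, 18.0, 21.0, 24.0]
next_day = [13.0, 15.5, 18.5, 20.5, 23.5]

slope, intercept = fit_line(today, next_day)
forecast = predict(slope, intercept, 20.0)  # today's reading is 20 deg C
print(f"Forecast for tomorrow: {forecast:.1f} deg C")
```

What matters for a predictive model of this kind is how close the forecast comes to the eventual outcome, not whether the fitted line describes a real mechanism.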

Inference: Models used for inference are most common in academic research. Often, academic research questions distill down to this simple template: “Does X affect Y?” These are inferential questions¹. As an example, a researcher may make a hypothesis statement such as, “The wealthier a high-school student’s family, the higher the student’s test scores will be”. The researcher may then build a model to test the validity of this hypothesis. The model’s results will generally be phrased in terms of a p value indicating the statistical significance of the evidence in support of the hypothesis.
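As a rough illustration of how a p value can arise, the sketch below runs a one-sided permutation test on made-up test scores for two groups of students. A permutation test is only one of many ways to obtain a p value and is not necessarily the technique a given researcher would use. The scores, group labels, and function names are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_shuffles, rng):
    """Estimate how often randomly relabelled groups differ in mean
    by at least as much as the observed groups do (one-sided)."""
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # break any link between group and score
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles

# Made-up test scores, for illustration only
wealthier = [78, 85, 90, 88, 92, 81]
less_wealthy = [70, 65, 80, 72, 68, 75]

observed, p_value = permutation_p_value(
    wealthier, less_wealthy, 10_000, random.Random(0)
)
print(f"Observed difference in means: {observed:.2f}")
print(f"Estimated p value: {p_value:.4f}")
```

A small p value says that a difference this large would rarely appear if family wealth were unrelated to scores. It does not by itself show that wealth causes the difference.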

Narrative: Models are often used to tell a persuasive story. When the Obama administration wanted to persuade lawmakers and the public to support its economic stimulus, it famously published the graph shown in Figure 2. A great deal of complex modeling and mathematics surely went into constructing this figure. However, its core purpose was to tell the nation a story: Things are going to be bad, but the recovery plan will make them less so. Such stories are at the heart of narrative models. We will return to this figure later and discuss why it is not really a predictive model even though it generates predictions.

Figure 2. The Obama administration’s predictions for the effects of the recovery plan.

All models can be classified in terms of these three primary purposes. We will see how useful it is to discuss modeling projects in this manner².

Exercise 4-5

Classify each of these modeling tasks as primarily prediction, inference, or narrative tasks:

A model to determine the average ocean temperature in 2020.
A model to determine whether deforestation affects temperatures.
A model to determine whether a company should supply a credit card to a specific applicant.
A model to help students understand the risks of global climate change.
A model to convince your manager to green-light your new initiative.
A model to assess whether nutrition has an effect on infant mortality.


Next edition: Models and Truth: The Strange Case of Inference.

Article sources: Beyond Connecting the Dots, Insight Maker. Reproduced by permission.

Header image source: Beyond Connecting the Dots.

Notes:

1. Predictions are also inferential results, but we prefer to discuss prediction and more hypothesis-testing types of inference separately. This distinction makes our understanding of modeling clearer.
2. And we strongly recommend doing so. It is important to clearly define the purpose at the start of a project. The techniques used and data required depend significantly on the model’s overall purpose. In particular, decide at the outset whether your primary goal is to use the model for prediction or for narrative. Many modeling projects attempt to do both, only to find themselves with a model that does neither.
