Models and Truth: Introduction [Systems thinking & modelling series]

Scott Fortmann-Roe and Gene Bellinger, 7 Aug 2017

This is part 34 of a series of articles featuring the book Beyond Connecting the Dots: Modeling for Meaningful Results.

“All models are wrong, but some are useful” – George E.P. Box

A model is a tool designed to reflect reality. A model is never a perfect mirror of reality, but often models can still be useful even with their imperfections. In this chapter, we will take a journey to explore different types of models and the distinctions commonly used to classify and understand them. We will consider several approaches to modeling that are quite different from the ones we have introduced throughout this book. These will help you understand the richer ecosystem of modeling tools and techniques and how the ones we have learned fit within this ecosystem.

The ultimate destination of this journey will be a clear understanding of the fundamental principles and approaches used to construct models. We will make many detours before arriving at this destination. In the end we will be able to divide models into two overarching categories based on their purposes and the techniques used to construct them. By mastering this divide, and how the work we and others do fits into it, we will obtain a rich perspective and understanding of the relationship between models and truth. We will also have a renewed appreciation for the strength and power of the techniques introduced in this book for tackling a wide swath of modeling problems.

Before we get there, however, let’s introduce some of the terminology commonly used to describe models. We’ll begin by taking a step back to discuss different kinds of models. Modeling is a wide-ranging field with many distinctions made by modelers and mathematicians. Three of these distinctions are presented below:

Deterministic versus Stochastic Models

There are two polar opposite views of the world. The Deterministic view says the fate of the universe is governed by strictly predictable laws of physics. In this view, the universe acts as if it were a giant machine; if its current state is known (down to each individual atomic particle), its future states through the rest of time are predetermined. The opposite (Stochastic) view is that the universe is ruled by chance and randomness. Random quantum mechanical fluctuations merge and amplify, leading to an infinite range of diverging possibilities.

Which of these two views holds more of the truth? We certainly do not know, and it is possible that physicists will never cease exploring this question. Albert Einstein had a viewpoint of special interest, however. He was a strong partisan of the more deterministic view, famously remarking, “God does not play dice with the world.”

When creating a model of a system, we must choose how we treat chance. Do we build our model deterministically, such that each time we run it we obtain the same results? Or do we instead incorporate elements of uncertainty so that each time the model is run we may see a different trajectory of outcomes?
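The choice can be made concrete with a minimal Python sketch. It is not taken from the book: the growth model, the function names, and all the numbers are assumptions chosen only for illustration. The deterministic function returns the same trajectory on every run, while the stochastic one perturbs the growth rate with a random draw at each step, so different random streams give different trajectories.

```python
import random


def deterministic_growth(initial, rate, steps):
    """Same inputs always produce the same trajectory."""
    population = [initial]
    for _ in range(steps):
        population.append(population[-1] * (1 + rate))
    return population


def stochastic_growth(initial, rate, noise, steps, rng):
    """The growth rate is perturbed by a Gaussian draw at every step."""
    population = [initial]
    for _ in range(steps):
        realized_rate = rate + rng.gauss(0, noise)
        # Clamp at zero so the population can never go negative.
        population.append(max(0.0, population[-1] * (1 + realized_rate)))
    return population


# Illustrative values only (assumptions, not data from the text).
run_a = deterministic_growth(100.0, 0.05, 10)
run_b = deterministic_growth(100.0, 0.05, 10)
run_c = stochastic_growth(100.0, 0.05, 0.02, 10, random.Random(1))
run_d = stochastic_growth(100.0, 0.05, 0.02, 10, random.Random(2))

print("deterministic runs identical:", run_a == run_b)
print("stochastic runs identical:", run_c == run_d)
```

Passing the random number generator in explicitly is a design choice: a fixed seed makes a stochastic run repeatable when you need to debug it, while different seeds let you explore the spread of possible outcomes.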

Mechanistic versus Statistical Models¹

When beginning to build a model of a system, there are many questions you should ask, two of which are:

Do I know (or have a hypothesis of) the mechanisms that drive the system?
Do I have data that describe the observed behavior of the system?

If the answer to the first question is yes, you can build a mechanistic model that replicates your understanding of (or hypothesis about) the true mechanisms in the system. If, on the other hand, the answer to the second question is yes, you can use statistical algorithms such as regressions to create a model of the system based purely on the data.

If neither question is answered affirmatively…well, in that case there isn’t much of anything you can build.
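As a rough illustration of the two routes, the following Python sketch builds both kinds of model for the same made-up population. Everything in it is an assumption for the sake of the example: the birth and death rates, the observed counts, and the function names are hypothetical. The mechanistic model projects forward from hypothesized births and deaths, while the statistical model fits a straight line to the observations by ordinary least squares and knows nothing about why the population changes.

```python
def mechanistic_model(population, birth_rate, death_rate, years):
    """Project forward from hypothesized mechanisms (births and deaths)."""
    trajectory = [population]
    for _ in range(years):
        births = birth_rate * population
        deaths = death_rate * population
        population = population + births - deaths
        trajectory.append(population)
    return trajectory


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up observations for illustration only.
years = [0, 1, 2, 3, 4]
observed = [100, 104, 109, 113, 118]

# Statistical route: learn the trend purely from the data.
intercept, slope = fit_line(years, observed)
statistical_forecast = intercept + slope * 5

# Mechanistic route: assumed rates stand in for expert knowledge.
mechanistic_forecast = mechanistic_model(100, 0.10, 0.06, 5)[-1]

print("statistical forecast for year 5:", statistical_forecast)
print("mechanistic forecast for year 5:", mechanistic_forecast)
```

The two forecasts may land close together over a short horizon, but they rest on different evidence: the regression is only as good as the data it was fitted to, and the mechanistic projection is only as good as the assumed rates.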

Exercise 4-1
A credit card company has hired you to build a model to predict defaults of new applicants. They give you a data set containing information on one million of their previous customers along with whether or not those customers ultimately defaulted. Would it be better to build a mechanistic or statistical model for this data?

Exercise 4-2
You have been commissioned to build a model of population growth for a herd of zebras in Namibia. You have some data on the historical size of the zebra population, but this data is limited. You also have access to more than a dozen experts who have studied zebras their whole lives and have an intimate understanding of the animals' behavior. Would it be better to build a mechanistic or statistical model for this data?

Aggregated versus Disaggregated Models

When building a model, the issue of scale becomes very important. Imagine we are concerned about the effects of global climate change on water resources. We may wish to examine whether there will be sufficient water supplies given a rise in future temperatures.

At what scale do we build this model? The range of possible scales is wide:

At the most aggregate, we could estimate total worldwide water demands and supplies into the future.
Maybe that is too coarse a scale, however, as clearly having excess water in Norway has little direct impact on the situation in Egypt. We could instead create a finer resolution model that separately looked at water demand and consumption in each country.
Even that may still be too coarse. Maybe we should make our model more granular to look at specific cities or population clusters around the globe.
At the extreme disaggregated level, we might even want to model individual people—all 7 billion of them—and their needs and movements around the world.

There is no simple answer to this question of optimal scale. The best choice is highly context-sensitive and depends on the needs of the specific modeler and application.
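To see why the level of aggregation can change the conclusion, consider the small Python sketch below. The two regions and their supply and demand figures are invented for illustration and do not describe any real place. At the aggregate level the model reports a comfortable surplus; broken out by region, it shows that one region is in deficit.

```python
# Hypothetical regions with made-up annual water supply and demand
# (arbitrary units); these are assumptions, not real data.
regions = {
    "wet_region": {"supply": 900.0, "demand": 300.0},
    "dry_region": {"supply": 100.0, "demand": 400.0},
}


def aggregate_balance(regions):
    """Aggregated view: one worldwide supply-minus-demand figure."""
    total_supply = sum(r["supply"] for r in regions.values())
    total_demand = sum(r["demand"] for r in regions.values())
    return total_supply - total_demand


def regional_balances(regions):
    """Disaggregated view: supply minus demand for each region."""
    return {name: r["supply"] - r["demand"] for name, r in regions.items()}


def regions_in_deficit(regions):
    """Names of regions whose demand exceeds their supply."""
    balances = regional_balances(regions)
    return sorted(name for name, balance in balances.items() if balance < 0)


print("aggregate balance:", aggregate_balance(regions))
print("regional balances:", regional_balances(regions))
print("regions in deficit:", regions_in_deficit(regions))
```

The finer-grained model reveals more, but it also needs more data and more parameters, which is part of why the right scale depends on the question being asked.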

Exercise 4-3
You have been hired to build a model of world population growth. What is an appropriate level of aggregation/disaggregation for this model? Does your answer change if you vary the time scale? What would be the differences between a model designed to work 10 years in the future, one designed to work for 100 years, and one designed to work for 1,000 years?

Exercise 4-4
Your company builds rulers. You have been asked to develop a model of global demand for rulers. What is an appropriate level of aggregation/disaggregation for this model?

Next edition: Models and Truth: Prediction, Inference, and Narrative.

Article sources: Beyond Connecting the Dots, Insight Maker. Reproduced by permission.

Header image source: Beyond Connecting the Dots.

Notes:

1. This relates more broadly to the contrasting research approaches of induction and deduction. Induction starts with data and observations, which are analyzed to create a broader theory (similar to a statistical approach to modeling). Deduction starts with a theory and finishes with the collection of data to confirm the theory (similar to a more mechanistic approach to modeling). It is easy to confuse the meanings of induction and deduction; even great minds have done so. While Sir Arthur Conan Doyle’s character Sherlock Holmes attributes his impressive powers to “deduction”, he is actually using induction. Treating what we are calling “statistical” models here as a form of induction, we can also refer to them as “phenomenological” or “empirical” models.
