This is part 46 of a series of articles featuring the book Beyond Connecting the Dots, Modeling for Meaningful Results.
When tackling modeling projects such as our hamster-population model, there are two basic overarching project management approaches. The first is founded on detailed planning and preparation. Tackling the hamster model using this approach might look something like the following sequential phases:
- Research: Find and obtain relevant literature on Aquatic Hamsters. Read peer-reviewed publications. Locate hamster experts and interview them. Identify key mechanisms affecting hamster population growth. Some mechanisms may require further study. For example, if human expansion and urbanization affect the hamster habitat area, you may need to study the forces influencing urbanization. These may require additional literature searches and expert interviews.
- Design: Once you have completed your background research on the hamsters, start to design the model. Create causal loop diagrams and develop stock and flow diagrams. Break your hamster population model into different sectors. You will have the hamster-specific sector, which includes sub-sectors for each of the life-stages these endangered hamsters go through. You will also need sectors for other parts of the model that affect the hamster population growth: an urbanization sector with its own model, a climate sector with a climate model, and so on. Write out equations for all these sectors and resurvey experts you have contacted to review the overall model design and the specific equations. There will probably be several cycles of iteration and model expansion during this stage as additional key areas to include are identified.
- Construction: Now that you have completed a model design and received a seal of approval from experts in the field, you are ready to start building the model itself. Decide what modeling software package (or programming environment) you will use. Implement the equations as they were specified in the design phase.
- Wrapping Things Up: Go through the confidence-building steps from the previous chapter. Develop tests for your model to ensure it works correctly. Create model documentation. Show that the model demonstrates expected behavior and obtain final approval from experts.
This approach to building a model is a linear process in which you move sequentially from stage to stage. In the project management field, this is the classic “waterfall” approach, where you proceed phase by phase through the project. You plan out the whole project ahead of time, estimating how long each phase will take and identifying dependencies between phases. This form of project management is well suited to certain kinds of projects, such as constructing a building, and can work well if done expertly.
In our opinion, however, this approach to tackling a project is quite poorly suited to the task of building a model. There are several reasons for this.
First, each model is inherently unique[1]. You may have developed a dozen different population models in your career, but when it comes to developing a model for a new species or location, you will inevitably run into situations and problems you have never encountered before. The quantity and quality of data will differ from the cases before. Or the biology of the animal you are modeling will be different. Or the model goals and constraints will be different, and so on. Given these differences, rigid project management techniques such as the waterfall approach do not generally provide the predictability that is needed.
Second, when building a model you will find that many of your assumptions may simply be wrong. This can happen with every aspect of model construction: the data you thought you had will turn out to be non-existent, the equations provided to you by experts end up not working, and the model code you write will invariably have a bug or two that needs to be identified and squashed. Because of this, you will continually need to adjust and adapt your model as you learn more about the system and about which information you can and cannot rely upon.
Techniques based on sequential, long-term planning are poorly suited to such a high likelihood of error and such a frequent need for readjustment. What good is a great plan if the assumptions it is based on are substantially wrong?
Take, for instance, the data you use to build your model. It is not uncommon for a collaborator to come to you and say that they have X, Y, and Z data series for you to use in your model (where these might represent environmental conditions or other important model inputs). When you check the data, however, you may find that X does not in fact exist (the collaborator was confused), that Y has large gaps that make it effectively useless for your needs, and that Z was collected in such a way that it actually measures something completely different from what the collaborator thought.
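A quick check of this kind can be automated on the day the data arrives. The following is only a minimal sketch in Python: the series names, the sample values, and the 20% missing-value threshold are all hypothetical and chosen purely for illustration.

```python
import math

def check_series(name, values, max_missing_fraction=0.2):
    """Return a list of problems found in one input series (an empty list means none found)."""
    if not values:
        return [f"{name}: series is missing or empty"]
    # Count entries recorded as None or NaN.
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    problems = []
    fraction = missing / len(values)
    if fraction > max_missing_fraction:
        problems.append(f"{name}: {fraction:.0%} of values are missing")
    return problems

# Hypothetical stand-ins for the X, Y and Z series promised by a collaborator.
series = {
    "X": [],                                  # never actually existed
    "Y": [4.1, None, None, None, 3.9, None],  # large gaps
    "Z": [1.0, 1.2, 1.1, 1.3],                # complete, but see the note below
}

report = {name: check_series(name, values) for name, values in series.items()}
for name, problems in report.items():
    print(name, "->", problems or "no problems found")
```

Note that Z passes this check. An automated test can flag a missing or gappy series, but it cannot tell you that a series measures the wrong thing; that only surfaces when you try to use the data in a running model and discuss the results with the people who collected it.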
Take, as another instance, the equations in a model. Imagine you consult an expert on Aquatic Hamsters and she provides an equation governing the survival of hamsters during their first year of life. This equation was developed as part of a scientific study where the hamsters were grown in indoor swimming pools at her university’s Aquatic Hamster Research Facility. When you apply this equation in your model, however, you discover that the way hamsters behave when living in an indoor swimming pool is very different from their behavior in the wild. Because of this, the equation you have is simply not accurate for the hamsters living in the wild.
Errors like these two examples are very common. If you had proceeded with the classic waterfall approach to modeling, you might not realize until the very end of the modeling process that you cannot rely on the data or equations you were planning to use. By that point it is much too late to go back and correct your model.
Iteration: Failing Fast and Failing Often
Because of this, we advocate an alternative approach to building models. We support jumping right into the model construction process as early as possible. As we showed you in the Red example from Chapter 4, we think it is important to get a simulation model up and running as quickly as possible. You should never want to be more than a few steps away from a simulating model[2].
When beginning a modeling project we recommend building the simplest model possible to get going. We call this the Minimum Viable Model[3], and it is the model that contains just enough to minimally represent the system and nothing more. For the hamster model, this Minimum Viable Model might contain just a single stock representing the hamster population and a couple of flows modifying the population. Nothing more.
You don’t have to worry about your equations being right or your model being an accurate predictor in the Minimum Viable Model; you just want to get something up and running as soon as possible.
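To make this concrete, here is a minimal sketch of what such a Minimum Viable Model could look like in plain Python: one stock (the population) and two flows (births and deaths), stepped forward with simple Euler integration. The function name, the starting population, and the rates are placeholder assumptions, not values for any real species.

```python
def simulate(initial_population, birth_rate, death_rate, years, dt=0.25):
    """Euler integration of one stock (population) with two flows (births, deaths).

    Rates are per hamster per year; dt is the time step in years.
    """
    population = initial_population
    history = [population]
    steps = int(round(years / dt))
    for _ in range(steps):
        births = birth_rate * population   # inflow to the stock
        deaths = death_rate * population   # outflow from the stock
        population += (births - deaths) * dt
        history.append(population)
    return history

# Placeholder numbers: the point is to have something that runs, not to be right.
history = simulate(initial_population=100.0, birth_rate=0.30, death_rate=0.25, years=10)
print(f"start: {history[0]:.1f}, after 10 years: {history[-1]:.1f}")
```

Nothing here is accurate yet, and that is fine. What matters is that there is now a running simulation whose output can be shown to other people and criticized.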
Once you have the Minimum Viable Model you can ask people to review it and begin to incorporate their feedback. So get your friend’s thoughts on the minimal hamster model, talk to experts, study the model’s forecasts, and see what works and what does not. Then iterate on the model: make a change here, add a new component there. If you get feedback that no one trusts the model because it does not contain some key mechanism, add that mechanism to the model[4]. Steadily adjust and refine the model based on the actual results of the model and the feedback you receive.
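As an illustration of one such iteration, suppose reviewers object that a one-stock, two-flow population model lets the hamsters grow without limit. The sketch below adds a single optional mechanism in response: a habitat capacity that suppresses births as the population grows. The crowding formula, the capacity of 1000, and the rates are hypothetical choices made only to show the shape of the change.

```python
def simulate(initial_population, birth_rate, death_rate, years,
             dt=0.25, habitat_capacity=None):
    """One stock, two flows, plus an optional habitat limit on births."""
    population = initial_population
    history = [population]
    for _ in range(int(round(years / dt))):
        births = birth_rate * population
        if habitat_capacity is not None:
            # Mechanism added after feedback: crowding reduces births
            # as the population approaches the habitat capacity.
            births *= max(0.0, 1.0 - population / habitat_capacity)
        deaths = death_rate * population
        population += (births - deaths) * dt
        history.append(population)
    return history

# Compare the model before and after the change, using the same placeholder rates.
baseline = simulate(100.0, 0.30, 0.25, years=50)
revised = simulate(100.0, 0.30, 0.25, years=50, habitat_capacity=1000.0)
print(f"baseline after 50 years: {baseline[-1]:.0f}")
print(f"revised after 50 years:  {revised[-1]:.0f}")
```

Because the new mechanism is optional, the earlier behavior is still available for comparison, which makes it easy to show reviewers exactly what their requested change did to the results.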
This feedback will be more useful when you have a concrete model that is simulating than it would be if you were just running abstract ideas by people. By putting your stake in the ground with a model that simulates, you allow others to critique and engage with the model, providing you with valuable information about what works and what does not. If you do not come with a concrete model, you run the risk of receiving very vague, unactionable feedback.
What is best about this approach of rapid iteration is that it allows you to identify failures quickly. If a data source is no good, you find that out immediately as you try to integrate it rather than spending days, weeks, or months planning your model with the assumption that it’s really there or you can really use it. Rapid iteration – failing fast and failing often – is a key goal in the model development process. It can be argued that your successes in life are directly proportional to the number of failures and wrong turns you take: the more things you try, the more times you will both succeed and fail. We believe the same is often true in modeling. By speeding up the process of identifying and iterating past failures, this agile approach to modeling will often result in higher quality models completed more quickly than approaches that rely on extensive planning.
Next edition: The Process of Modeling: Model Boundaries.
Header image source: Beyond Connecting the Dots.
[1] Lots of “cookie cutter” models out there are designed to model a certain class of problems. Without custom work, however, these models are of dubious validity and may serve more to “check a box” that a model has been built than to be a useful decision-making tool.
[2] This is a common theme in agile approaches to project management. You never want to be far from a working product. For instance, in the popular Scrum approach to managing software projects, the key unit of collective work is “the sprint”. A sprint is a relatively brief amount of time (in the scope of the entire project) to complete a set group of product features. At the end of the sprint, the features must be completed and the software working or they are cut. The goal is always to be close to a working program, just as you should always be close to a working model.
[3] This idea is adapted from Eric Ries’s excellent book The Lean Startup (Ries, E. (2011). The Lean Startup. New York: Crown Business.). In it he advocates an approach to developing start-up companies and businesses that is focused on rapid development and innovation. Ries supports developing a “Minimum Viable Product” for the company as quickly as possible and iterating on the feedback received for this initial product.
[4] But the key is to wait until you get this feedback. It’s easy on your own or with a group of people to make a list of dozens of mechanisms that a model must contain to be realistic. Once you have implemented those mechanisms in your model you might find out that no one actually cared about them. It is better to start small and then augment the model when there is a demand for some additional mechanism than to spend a long time implementing a very complex model only to find out much of that work was unnecessary.