How can non-scientists judge the scientific claims they are confronted with? It may take a high level of skill to understand a science itself, but William J. Sutherland, David Spiegelhalter and Mark A. Burgman advise, in a comment in the journal Nature, that non-scientists can more readily develop what they call “interpretive scientific skills”:
“The essential skills are to be able to intelligently interrogate experts and advisers, and to understand the quality, limitations and biases of evidence.”
To help with this, they have put forward a list of twenty tips. They recommend that the tips “should be part of the education of civil servants, politicians, policy advisers and journalists – and anyone else who may have to interact with science or scientists”.
The twenty tips are:
- Differences and chance cause variation. The real world varies unpredictably.
- No measurement is exact. Practically all measurements have some error.
- Bias is rife. Experimental design or measuring devices may produce atypical results in a given direction.
- Bigger is usually better for sample size. The average taken from a large number of observations will usually be more informative than the average taken from a smaller number of observations.
- Correlation does not imply causation. It is tempting to assume that one pattern causes another. However, the correlation might be coincidental, or it might be a result of both patterns being caused by a third factor – a ‘confounding’ or ‘lurking’ variable.
- Regression to the mean can mislead. Extreme patterns in data are likely to be, at least in part, anomalies attributable to chance or error.
- Extrapolating beyond the data is risky. Patterns found within a given range do not necessarily apply outside that range.
- Beware the base-rate fallacy. The ability of an imperfect test to identify a condition depends upon the likelihood of that condition occurring (the base rate).
- Controls are important. A control group is dealt with in exactly the same way as the experimental group, except that the treatment is not applied. Without a control, it is difficult to determine whether a given treatment really had an effect.
- Randomization avoids bias. Experiments should, wherever possible, allocate individuals or groups to interventions randomly.
- Seek replication, not pseudoreplication. Results consistent across many studies, replicated on independent populations, are more likely to be solid.
- Scientists are human. Scientists have a vested interest in promoting their work, often for status and further research funding, although sometimes for direct financial gain. This can lead to selective reporting of results and occasionally, exaggeration.
- Significance is significant. Expressed as P, statistical significance is a measure of how likely a result is to occur by chance.
- Separate no effect from non-significance. The lack of a statistically significant result (say a P-value > 0.05) does not mean that there was no underlying effect: it means that no effect was detected.
- Effect size matters. Small responses are less likely to be detected. A study with many replicates might result in a statistically significant result but have a small effect size (and so, perhaps, be unimportant).
- Study relevance limits generalizations. The relevance of a study depends on how much the conditions under which it is done resemble the conditions of the issue under consideration.
- Feelings influence risk perception. Broadly, risk can be thought of as the likelihood of an event occurring in some time frame, multiplied by the consequences should the event occur. People’s risk perception is influenced disproportionately by many things.
- Dependencies change the risks. It is possible to calculate the consequences of individual events, such as an extreme tide, heavy rainfall and key workers being absent. However, if the events are interrelated, then the probability of their co-occurrence is much higher than might be expected.
- Data can be dredged or cherry picked. Evidence can be arranged to support one point of view.
- Extreme measurements may mislead. Any collation of measures will show variability owing to differences in innate ability, plus sampling, plus bias, plus measurement error. However, the resulting variation is typically interpreted only as differences in innate ability, ignoring the other sources.
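The base-rate tip is easier to grasp with numbers. The following is a minimal Python sketch, not something taken from the Nature comment: the function name, the 99% accuracy figure and the two base rates are all hypothetical, chosen only for illustration. It applies Bayes' rule to ask how likely it is that a positive test result is a true positive.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Probability that a positive result is a true positive (Bayes' rule).

    base_rate:   fraction of the population that has the condition
    sensitivity: probability the test is positive when the condition is present
    specificity: probability the test is negative when the condition is absent
    """
    true_positives = base_rate * sensitivity
    false_positives = (1.0 - base_rate) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical numbers: the same test, applied to a rare and a common condition.
rare = positive_predictive_value(base_rate=0.01, sensitivity=0.99, specificity=0.99)
common = positive_predictive_value(base_rate=0.30, sensitivity=0.99, specificity=0.99)

print(f"Rare condition (1% base rate):    {rare:.2f}")    # about 0.50
print(f"Common condition (30% base rate): {common:.2f}")  # about 0.98
```

With these assumed figures, a test that is right 99% of the time still gives a positive result that is only about as reliable as a coin toss when the condition affects 1 person in 100, because false positives among the many healthy people are as numerous as true positives among the few affected.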