The Victorian Police recently made the news after it emerged that its officers had faked 258,000 breath tests in order to meet KPI targets. The propensity of KPIs to distort behaviour is well known, yet managers continue to impose them and to be surprised at the resulting distortion of behaviour and gaming of the system.
For instance, in 2004, 10 top-level targets applying to the Health Department in England were translated into some 300 lower-level targets for the various public sector health-delivery organizations for which that department was responsible, and six top-level targets applying to the Education Department were translated into 90 “conditions”.
Bevan and Hood have analysed the outcomes of this social experiment extensively, with quite damning results. They found that ‘synecdoche’, or treating the measurement of the part as indicative of the performance of the whole, generally led to significant problems outside the areas measured by the KPI targets, along with extensive gaming of the targets themselves. The underlying difficulty is that it is almost always impossible to find good measures for every aspect of a system that is deemed important. This leads to four domains that interact with KPIs:
- Monitored: A domain has a metric that tightly and accurately correlates to reality
- Flagfall: A domain has a metric that can indicate the need for further investigation, but by itself is inaccurate and incomplete
- Foggy: Outcomes in this domain matter, but no usable metrics exist to monitor them
- Residual: Outcomes in this domain are deemed to be irrelevant
Bevan and Hood found that the behaviour within the monitored domain was highly responsive to KPIs, but gaming at the threshold of the monitored domain was extensive. The classic example is the use of arbitrary time targets for health care. In the diagram below of ambulance response times, can you guess the threshold for acceptable performance? (Hint: it’s 8 minutes)
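One rough way to look for this kind of threshold gaming in a set of recorded times is to compare how many observations fall just under the target with how many fall just over it. The sketch below is illustrative only: the function name `heaping_ratio`, the half-minute window and the sample values are all assumptions made for the example, not data or methods from Bevan and Hood.

```python
from typing import Sequence


def heaping_ratio(times_min: Sequence[float],
                  threshold: float = 8.0,
                  width: float = 0.5) -> float:
    """Ratio of counts just below the threshold to counts just above it.

    Hypothetical helper: a ratio far above 1 suggests that recorded times
    are bunching just under the target, which is worth investigating.
    """
    below = sum(1 for t in times_min if threshold - width <= t < threshold)
    above = sum(1 for t in times_min if threshold <= t < threshold + width)
    if above == 0:
        # Nothing recorded just over the target: infinite ratio if anything
        # sits just under it, otherwise no evidence either way.
        return float("inf") if below else 1.0
    return below / above


# Made-up response times in minutes, for illustration only (not real data).
sample = [6.2, 7.1, 7.6, 7.7, 7.8, 7.9, 7.9, 7.9, 8.1, 8.6, 9.4, 10.2]
ratio = heaping_ratio(sample)
print(f"below/above ratio around the 8-minute target: {ratio:.1f}")
```

A high ratio is not proof of gaming, since genuine effort to meet the target also shifts times below it; in the terms used above it is closer to a flagfall metric than a monitored one.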
Actors constrained by KPI targets tend to fall in four classes:
- Saints exhibit such a high standard of public service ethos that they will voluntarily disclose shortcomings to central authorities, as with St George’s Healthcare in the UK, which twice drew attention to its own failures in its heart and lung transplantation programme, and voluntarily stopped performing these operations.
- Honest triers do not voluntarily draw attention to their failures, but do not attempt to spin or fiddle data in their favour, as with the Bristol Royal Infirmary which did not attempt to conceal evidence of very high mortality in its paediatric cardiac surgery unit. The subsequent inquiry acknowledged in its report that this was “not an account of bad people … [nor] of people who did not care, nor of people who wilfully harmed patients”.
- Reactive gamers broadly share the goals of central controllers, but will game the target system given reasons and opportunities to do so. During the 2005 British general election, a voter pointed out to Tony Blair that a target for GPs to see their patients within 48 hours had led most clinics simply to stop booking appointments more than 48 hours in advance.
- Rational maniacs do not share the goals of central controllers and aim to manipulate data to conceal their operations, such as the GP who killed at least 215 of his patients between 1975 and 1998, but was able to stop killing when he felt he was under suspicion.
The major problem with KPIs is that if they are accepted as valid, achievement of the KPI supersedes the nominal goals of the system. The NAPLAN educational benchmark in Australia has well and truly jumped the shark in this regard, with local and international evidence that low-performing students are being targeted for exclusion from testing in order to boost local benchmarks.
Bevan and Hood note that in Soviet Russia, the most notable previous attempt to improve performance via KPIs, “[since] all bodies responsible for supervising enterprises were interested in the same success indicators, the supervisors, rather than acting to check, connived at, or even encouraged, gaming”.
It’s a sobering reminder that KPIs should be used with great caution.