This article is part 7 of a series reviewing selected papers from Altmetric’s list of the top 100 most-discussed journal articles of 2018.
Data analysis in research can seem mechanical and unimaginative, so it’s easy to overlook the fact that results may depend on the chosen analytic strategy. And while researchers may be aware of this in principle, its practical implications are rarely appreciated.
But what if scientific results are highly contingent on subjective decisions made at the analysis stage? The consequences could include results fraught with unrecognized uncertainty, research findings that are less trustworthy than they first appear, or even entirely different results.
To address this gap in our understanding of the implications of analytic decisions, the authors of an August 2018 paper1 examined the impact of the analytic choices made by 29 teams that analyzed the same data set to answer the same research question.
Analytic choices varied widely across the teams, and so, consequently, did their results: 20 teams found a statistically significant positive effect, while nine did not observe a significant relationship. These findings show how researchers can differ in their analytic approaches, and how results can vary with those choices. The authors warn that:
The observed results from analyzing a complex data set can be highly contingent on justifiable, but subjective, analytic decisions. Uncertainty in interpreting research results is therefore not just a function of statistical power or the use of questionable research practices; it is also a function of the many reasonable decisions that researchers must make in order to conduct the research.
The authors conclude with this advice:
The best defense against subjectivity in science is to expose it. Transparency in data, methods, and process gives the rest of the community opportunity to see the decisions, question them, offer alternatives, and test these alternatives in further research.
They’re clearly practicing what they preach, with the paper receiving Association for Psychological Science badges for Open Data and Open Materials.
Twenty-nine teams involving 61 analysts used the same data set to address the same research question: whether soccer referees are more likely to give red cards to dark-skin-toned players than to light-skin-toned players. Analytic approaches varied widely across the teams, and the estimated effect sizes ranged from 0.89 to 2.93 (Mdn = 1.31) in odds-ratio units. Twenty teams (69%) found a statistically significant positive effect, and 9 teams (31%) did not observe a significant relationship. Overall, the 29 different analyses used 21 unique combinations of covariates. Neither analysts’ prior beliefs about the effect of interest nor their level of expertise readily explained the variation in the outcomes of the analyses. Peer ratings of the quality of the analyses also did not account for the variability. These findings suggest that significant variation in the results of analyses of complex data may be difficult to avoid, even by experts with honest intentions. Crowdsourcing data analysis, a strategy in which numerous research teams are recruited to simultaneously investigate the same research question, makes transparent how defensible, yet subjective, analytic choices influence research results.
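As a toy illustration of how a defensible analytic choice can swing an odds ratio, the sketch below uses entirely invented counts (not the paper’s data) and contrasts two reasonable choices: pooling all leagues versus stratifying by league, a crude stand-in for including a league covariate.

```python
# Toy illustration with invented counts -- NOT the paper's data.
# Two defensible analytic choices applied to the same synthetic data set
# can yield odds ratios on opposite sides of 1 (a Simpson's-paradox setup).

def odds_ratio(table):
    """Odds ratio from a 2x2 table [[a, b], [c, d]]:
    rows = dark- vs light-skin-toned players,
    cols = (red cards, no red cards)."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Hypothetical per-league counts; league is a covariate an analyst
# may or may not decide to adjust for.
league_a = [[30, 170], [50, 450]]   # dark vs light, league A
league_b = [[10, 990], [3, 447]]    # dark vs light, league B

# Choice 1: pool the leagues and ignore the covariate.
pooled = [[30 + 10, 170 + 990], [50 + 3, 450 + 447]]
print(f"pooled OR:     {odds_ratio(pooled):.2f}")

# Choice 2: stratify by league and average the per-league odds ratios
# (a simplification of proper covariate adjustment).
strat = (odds_ratio(league_a) + odds_ratio(league_b)) / 2
print(f"stratified OR: {strat:.2f}")
```

With these made-up numbers, each league on its own shows an odds ratio above 1, while the pooled analysis shows one below 1: both computations are defensible, yet they point in opposite directions, which is exactly the kind of divergence the 29 teams exhibited.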
- Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F., Awtrey, E., … & Carlsson, R. (2018). Many analysts, one data set: Making transparent how variations in analytic choices affect results. Advances in Methods and Practices in Psychological Science, 1(3), 337-356. ↩
Also published on Medium.