Research Shows That Published Versions Of Papers In Costly Academic Titles Add Almost Nothing To The Freely-Available Preprints They Are Based On

from the all-that-glitters-is-not-gold dept

The open access movement believes that academic publications should be freely available to all, not least because most of the research is paid for by the public purse. Open access supporters see the high cost of many academic journals, whose subscriptions often run into thousands of dollars per year, as unsustainable for cash-strapped libraries, and unaffordable for researchers in emerging economies. The high profit margins of leading academic publishers — typically 30-40% — seem even more outrageous when you take into account the fact that publishers get almost everything done for free. They don’t pay the authors of the papers they publish, and rely on the unpaid efforts of public-spirited academics to carry out crucial editorial functions like choosing and reviewing submissions.

Academic publishers justify their high prices and fat profit margins by claiming that they “add value” as papers progress through the publication process. Although many have wondered whether that is really true — does a bit of sub-editing and design really justify the ever-rising subscription costs? — hard evidence has been lacking that could be used to challenge the publishers’ narrative. A paper from researchers at the University of California and Los Alamos National Laboratory is particularly relevant here. It appeared first on arXiv.org in 2016 (pdf), but has only just been “officially” published (paywall). It does something really obvious but also extremely valuable: it takes around 12,000 academic papers as they were originally released in their preprint form, and compares them in detail with the final versions that appeared in the professional journals, sometimes years later, as the paper’s own history demonstrates. The results are unequivocal:

We apply five different similarity measures to individual extracted sections from the articles’ full text contents and analyze their results. We have shown that, within the boundaries of our corpus, there are no significant differences in aggregate between pre-prints and their corresponding final published versions. In addition, the vast majority of pre-prints (90%-95%) are published by the open access pre-print service first and later by a commercial publisher.

That is, for the papers considered, which were taken from the arXiv.org preprint repository, and compared with the final versions that appeared, mostly in journals published by Elsevier, there were rarely any important additions. That applies to titles, abstracts and the main body of the articles. The five metrics applied looked at letter-by-letter changes between the two versions, as well as more subtle semantic differences. All five agreed that the publishers made almost no changes to the initial preprint, which nearly always appeared before the published version, minimizing the possibility that the preprint merely reflected the edited version.
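To give a flavor of what a letter-by-letter comparison like this involves, here is a minimal sketch, not the authors’ actual pipeline, using Python’s standard `difflib` to score character-level similarity between a hypothetical preprint abstract and its published counterpart (both texts are invented for illustration):

```python
# A toy version of one letter-level similarity measure: a ratio near 1.0
# means the published text is nearly identical to the preprint.
import difflib

def char_similarity(preprint: str, published: str) -> float:
    """Return a 0..1 character-level similarity ratio between two texts."""
    return difflib.SequenceMatcher(None, preprint, published).ratio()

# Hypothetical example texts (not from the study's corpus).
preprint_abstract = "We study the thermal conductivity of graphene."
published_abstract = "We study the thermal conductivity of graphene samples."

score = char_similarity(preprint_abstract, published_abstract)
print(f"similarity: {score:.2f}")
```

The study’s actual metrics are more sophisticated (including semantic measures), but the principle is the same: aggregate such scores over thousands of preprint/published pairs and see whether the published versions differ meaningfully.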

The authors of the paper point out a number of ways in which their research could be improved and extended. For example, the reference section of papers before and after editing was not compared, so it is possible that academic publishers add more value in this section; the researchers plan to investigate this aspect. Similarly, since the arXiv.org papers are heavily slanted towards physics, mathematics, statistics, and computer science, further work will look at articles from other fields, such as economics and biology.

Such caveats aside, this is an important result that has not received the attention it deserves. It provides hard evidence of something that many have long felt: that academic publishers add almost nothing during the process of disseminating research in their high-profile products. The implications are that libraries should not be paying for expensive subscriptions to academic journals, but simply providing access to the equivalent preprints, which offer almost identical texts free of charge, and that researchers should concentrate on preprints, and forget about journals. Of course, that means that academic institutions must do the same when it comes to evaluating the publications of scholars applying for posts.

If it was felt that more user-friendly formats were needed than the somewhat austere preprints, it would be enough for funding organizations to pay third-party design companies to take the preprint texts as-is, and simply reformat them in a more attractive way. Given the relatively straightforward skills required, the costs of doing so would be far less than paying high page charges, which is the main model used to fund so-called “gold” open access journals, as opposed to the “green” open access based on preprints freely available from repositories.

In theory, gold open access offers “better” quality texts than green open access, which supposedly justifies the higher cost of the former. What the research shows is that when it comes to academic publishing, as in many other spheres, all that glitters is not gold: humble preprints turn out to be almost identical to the articles later published in big-name journals, but available sooner, and much more cheaply.

Follow me @glynmoody on Twitter or identi.ca, and +glynmoody on Google+



24 Comments
Peter (profile) says:

Are there any models where libraries or funding agencies ...

… sponsor open-access platforms? In the bioinformatics/systems biology arena, it used to be common practice for industry and funding organizations to sponsor personnel, technology development and platforms. Their reasoning would apply to open access in the same way: the community needs certain tools and services, and sponsoring open systems was considered to be cheaper than licensing closed systems.

The prerequisite is, of course, to accept that open source is not free, but requires an up-front investment of some of the money that will later be saved on paid subscriptions.

Monica says:

Re: Are there any models where libraries or funding agencies ...

The short answer to your question is yes. Many research libraries pay to support Cornell University Library, who maintain the arXiv.org repository. Many more also provide their researchers with open-access repository services using open source tools such as DSpace and Fedora. And a growing number of libraries offer publishing services, enabling the publication of open access journals. Universities also support organizations like the Center for Open Science, which provides a range of platforms for preprints and data, by way of institutional memberships.

In short, university libraries are paying for both the subscription journals and the services and platforms that enable open-access publishing and data sharing. While some have suggested that we cancel subscriptions and channel those funds to more support for OA and open source platforms, to my knowledge nobody has done this in any large-scale way.

kmo12345 (profile) says:

Somewhat misleading

I am a physicist and have a couple issues with this conclusion. I think the general claim that the editors of journals add very little to the published work is probably correct. However, the claim that there are no significant changes between the pre-print version and the published version is rubbish.

My most recent paper, which was just accepted for publication in Physical Review B, was sent out to two referees. Referee A did not have a whole lot to say but pointed out that an explanation we had provided for a method was unclear to non-experts. Referee B was perhaps overly thorough but actually pointed out a few instances where specific word choices could lead to incorrect conclusions. He or she also found a couple of minor stylistic errors that had slipped through our editing.

While the total number of words changed was probably under 5% (maybe even closer to 1 or 2%), the revised manuscript is certainly better than the pre-print version.

The next step is for the journal to copy-edit the manuscript. This step usually consists of changing British English to American English and spelling out some abbreviations or abbreviating other words but can sometimes uncover typos that made it through peer editing. In any case, I agree that this is less useful than the peer review.

It is certainly ridiculous that the public has to triple pay for research (they pay me to do it, they pay for me to access and submit to journals, and they have to pay if they want to access the research). However, I have yet to see an alternative to the current peer review process that is facilitated by the journals.

In many cases, I have encountered papers on the arXiv which are completely incorrect. The papers in question have not been published and likely wouldn’t be published without significant changes. I myself have manuscripts on the arXiv that contain small errors which have been fixed in the published versions. Depending on the journal, we can usually replace the pre-print version with the published version after some length of time (6 months, I believe), but this is not always done.

Anonymous Coward says:

Re: Somewhat misleading

As the peer review process is provided by academia, academics could come together to manage that process for themselves. It is not as if the peer reviewers are employed by the journals, or paid to carry out that work. Indeed, a wiki provides a basic system for publication and peer review.

The outstanding problem is the use of publication in prestigious journals as a measure of academic ability when academics seek new posts.

Prashanth (profile) says:

Re: Somewhat misleading

Indeed, there is a lot of work in sifting through what people submit to whittle down to work that is worthy of publication (though that is not to say that there aren’t problems in how those determinations are made, nor that there aren’t problems in using that to justify the obscenely high prices of journals, because there are).

Yes, I know I'm commenting anonymously says:

It is not the publishers’ improvement of paper quality that is at issue, but the fact that they publish important journals: writers get more funding points from their bosses for publishing in prestigious journals. These journals happen to be owned by the big publishers, and this is where the publishers have their value.

The way forward is to have a transparent system for ranking the importance of papers that does not depend on the chosen journal. That way, an open access repository can hold a lot of papers without the importance of the few critical articles being watered down.

Annonymouse (profile) says:

Really, 95% of the tools and resources are already there.
All that is needed is just two things.
First, as already pointed out, is an open platform equivalent to a wiki that has the logistics hammered out and is middleman-proof.
Second is to beat the various admins and funding bodies about the head and shoulders until they get it into their syphilis-addled heads to stop looking at the colour of the covers and actually do their jobs.

Anonymous Coward says:

Re: Re:

All that is needed is just two things.
First, as already pointed out, is an open platform equivalent to a wiki that has the logistics hammered out and is middleman-proof.

What prevents a wiki from being set up for this purpose? Which new software features are required, and has anyone written or requested them?

Would Reddit or StackOverflow-style software be better suited? They have voting, comments etc.

Anonymous Coward says:

Isn’t the main value of the editorial process determining WHICH papers to publish? It’s not surprising that comparing earlier and final versions of the papers chosen doesn’t show much of a difference.

What would be more interesting is a study comparing the papers published to those NOT published by some metrics of quality. In other words, measure if the journals are performing a valuable “gatekeeper” function or not.

Anonymous Coward says:

One of the ironies of this...

…is that a single deep-pocketed investor could end this entire farce in a single day. Drop a billion dollars on a foundation ($900M in endowment, $100M in operating capital) and go full open access with all research. It would be an enormous service to humanity and it would crush the Elseviers of the world out of existence (good riddance to them).

There are people who could do this without even blinking. And while there are numerous other worthy causes, making all academic knowledge free would serve those too — maybe not today, but certainly in the future.

Anonymous Coward says:

Re: One of the ironies of this...

Even if all future research were published open access, the Elseviers of the world would linger, because they hold the copyrights on a large number of foundational and important research papers in many subject areas. Those companies would have to be bought out to free all the existing papers and truly stop their bloodsucking on academic research.
