
Open access to scholarly knowledge in the digital era (chapter 5.4): Toward linked open data for Latin America

This article is chapter 5.4 in section 5 of a series of articles summarising the book Reassembling Scholarly Communications: Histories, Infrastructures, and Global Politics of Open Access.

In the fourth chapter of the infrastructures and platforms section, Arianna Becerril-García and Eduardo Aguado-López detail how infrastructural improvements could make Latin American research cultures more discoverable and better integrated within broader global databases.

Scholarly communication has seized the opportunity to broaden inclusion through the use of information technologies. Open access holds out the promise of a global scientific dialogue. Globalization has indeed become the ultimate goal in scientific practice, with a journal’s presence in mainstream databases such as Web of Science (WoS) and Scopus equated with global visibility.

However, Latin America, like many other developing regions, has historically faced a lack of visibility and recognition for the science that it generates. This is mainly due to the scarce presence of Latin American journals in the aforementioned mainstream databases. Indeed, as shown in Figure 1, only 276 Latin American journals are indexed by WoS and 795 by Scopus, whereas the Latin American platform Redalyc indexes 1,111.

Figure 1. Latin American journals indexed by Redalyc, Scopus, and WoS (source: Becerril-García and Aguado-López, 2020).

Latin American scholarly journals are led, owned, and financed by academic institutions, and are available to everyone. This system is neither formalized nor made explicit, but was already operational before the term “open access” was even coined. This Latin American ecosystem is composed of several layers, including university presses, platforms such as CLACSO, Redalyc, SciELO, and Latindex, and services such as interoperability, search engine optimization, metrics, usage tracking, and XML typesetting under the JATS standard.

Latin America has relied upon open access as its path to inclusion in a more participatory worldwide scholarly system, but Becerril-García and Aguado-López warn that the inertial dependencies of traditional legitimation circuits remain, and that commercial open-access strategies from the Global North threaten to rupture the Latin American nonprofit OA ecosystem.

Hence, openness is not enough: systems of research assessment have to be modified, and more effective methods found for communicating the knowledge generated in different regions, disciplinary fields, and languages.

Technology for visibility, discoverability, and internationalization

Becerril-García and Aguado-López contend that technological innovations can contribute to a more integrated knowledge ecosystem, including semantic technologies, artificial intelligence techniques, ontological engineering, natural language processing, machine learning, and other advancements.

Interoperability is an important area in which technological developments have already been applied. What if interoperability principles could be applied to scholarly communication in terms of the interchange of research results across geographical regions, disciplines, or even languages?

The OAI-PMH data model provides a basic semantic level for understanding the nature of described resources, but only at an identification level. XML allows the full texts of scholarly resources to be structured and opens up future machine-reading possibilities. SciELO began adopting XML in 2012, and Redalyc followed in 2015. Currently, 90 percent of journals indexed by Redalyc publish their content in JATS XML.
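To illustrate what "only at an identification level" can look like in practice, the following is a minimal sketch in Python that reads a single OAI-PMH record in the common oai_dc (Dublin Core) format. The record itself, including its identifier and field values, is invented for illustration and does not come from Redalyc or SciELO; only the namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical oai_dc record; the identifier and values are invented.
RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:article/123</identifier>
    <datestamp>2020-01-15</datestamp>
  </header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>Un estudio de ejemplo</dc:title>
      <dc:creator>Pérez, Ana</dc:creator>
      <dc:creator>Gómez, Luis</dc:creator>
      <dc:language>es</dc:language>
    </oai_dc:dc>
  </metadata>
</record>
"""

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
}


def parse_oai_dc(xml_text):
    """Return the record identifier and its flat Dublin Core fields."""
    root = ET.fromstring(xml_text)
    identifier = root.findtext("oai:header/oai:identifier", namespaces=NS)
    fields = {}
    dc = root.find("oai:metadata/oai_dc:dc", NS)
    for element in dc:
        # Tags look like "{namespace}title"; keep only the local name.
        name = element.tag.split("}", 1)[1]
        fields.setdefault(name, []).append(element.text)
    return {"identifier": identifier, "fields": fields}


record = parse_oai_dc(RECORD)
print(record["identifier"])
print(record["fields"])
```

The result is a flat set of descriptive fields: authors are plain strings rather than identified entities, and nothing of the article's internal structure (sections, references, full text) is available to a machine.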

While XML in journals carries great potential, information could be disseminated at a deeper and more relational level of granularity. Becerril-García and Aguado-López advise that scholarly information resources need to move from a machine-readable to a machine-comprehensible paradigm.

For instance, the information content of scholarly outputs could be represented as connections between informational elements, where the structure, formed by nodes and connections, expresses knowledge. That form of structuring, though, goes far beyond the capabilities of XML, whose data model is a tree. Becerril-García and Aguado-López argue that a far better data model for knowledge representation is a graph, as provided by the Resource Description Framework (RDF).
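The difference between a tree and a graph can be sketched with a handful of subject-predicate-object statements, the basic shape of RDF data. The example below is a simplified illustration in plain Python rather than a real RDF library, and every identifier in it (the `ex:` names) is hypothetical.

```python
from collections import defaultdict

# (subject, predicate, object) statements; all "ex:" names are hypothetical.
TRIPLES = [
    ("ex:article1", "ex:hasAuthor", "ex:author/perez"),
    ("ex:article2", "ex:hasAuthor", "ex:author/perez"),
    ("ex:article1", "ex:cites", "ex:article2"),
    ("ex:author/perez", "ex:affiliatedWith", "ex:org/university-a"),
]


def build_index(triples):
    """Index edges by the node they leave and by the node they enter."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for subject, predicate, obj in triples:
        outgoing[subject].append((predicate, obj))
        incoming[obj].append((predicate, subject))
    return outgoing, incoming


def is_tree_shaped(triples):
    """In a tree every node has at most one parent (one incoming edge)."""
    _, incoming = build_index(triples)
    return all(len(edges) <= 1 for edges in incoming.values())


outgoing, incoming = build_index(TRIPLES)
print(incoming["ex:author/perez"])  # the author node is reached from two articles
print(is_tree_shaped(TRIPLES))
```

Here the author node is shared by two articles, and one article also points at the other. A tree-shaped document would have to duplicate the author inside each article, whereas in a graph the shared node is itself the link between them.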

Leveraging semantic technologies to achieve a global research dialogue

The “HowOpenIsIt?®” Open Access Spectrum guide provides a scale for machine readability of OA content that includes a notion of semantics that has not yet been achieved by Latin American journals. The web as it currently exists, in the form of hypertext, has minimal structuring and semantics. Semantics, however, has great potential to enable scholarly resources to join the so-called Web of Data.

Becerril-García and Aguado-López have applied semantic technologies to structured scholarly resources and created a semantic model for selective knowledge discovery dubbed “OntoOAI”. This model allows users to explore and browse information following relations at different levels, which adds value for discoverability purposes.

OntoOAI’s application verified the feasibility of using semantic technologies to achieve selective knowledge discovery, while also showing some limitations of using OAI-PMH data for this purpose, such as the lack of URIs and of full-text structuring. Full-text structuring would enable a journal article or other scholarly resource to be broken down into pieces that each form a node in a graph, with the relations among them represented as edges, and the whole expressed in an ontology. RDF based on JATS could also be used for that task (Figure 2). Indeed, if Latin American scholarly resources overcome the lack of URIs and of RDF availability, all of this information could become part of the Linked Open Data (LOD) Cloud.

Figure 2. Knowledge representation of a journal article (RDF derived from JATS XML) based on the representation of the Linked Open Data Cloud (source: Becerril-García and Aguado-López, 2020).
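As a rough sketch of how RDF might be derived from JATS XML, the Python example below breaks a tiny JATS-like article into nodes and edges and prints them as N-Triples. It is an illustration under stated assumptions, not the authors' method: the article content, the DOI, and the `https://example.org/` URI base are invented, and the use of Dublin Core Terms properties (`title`, `creator`, `hasPart`) is one possible vocabulary choice.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified JATS article; the DOI is invented.
JATS = """\
<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.0000/example.123</article-id>
      <title-group><article-title>Un estudio de ejemplo</article-title></title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Pérez</surname><given-names>Ana</given-names></name>
        </contrib>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="s1"><title>Introducción</title><p>...</p></sec>
    <sec id="s2"><title>Métodos</title><p>...</p></sec>
  </body>
</article>
"""

BASE = "https://example.org/"      # assumed URI base, for illustration only
DCT = "http://purl.org/dc/terms/"  # Dublin Core Terms namespace


def jats_to_triples(xml_text):
    """Turn article metadata and sections into (subject, predicate, object)."""
    root = ET.fromstring(xml_text)
    doi = root.findtext(".//article-id[@pub-id-type='doi']")
    article = BASE + "article/" + doi
    triples = [(article, DCT + "title", root.findtext(".//article-title"))]
    for contrib in root.iterfind(".//contrib[@contrib-type='author']"):
        surname = contrib.findtext("name/surname")
        given = contrib.findtext("name/given-names")
        triples.append((article, DCT + "creator", f"{given} {surname}"))
    for sec in root.iterfind("body/sec"):
        # Each section becomes its own node, addressable by a URI.
        section = f"{article}#{sec.get('id')}"
        triples.append((article, DCT + "hasPart", section))
        triples.append((section, DCT + "title", sec.findtext("title")))
    return triples


def to_ntriples(triples):
    """Serialize as N-Triples; objects starting with http(s) are treated as URIs."""
    lines = []
    for subject, predicate, obj in triples:
        if obj.startswith(("http://", "https://")):
            rendered = f"<{obj}>"
        else:
            escaped = (obj.replace("\\", "\\\\")
                          .replace('"', '\\"')
                          .replace("\n", "\\n"))
            rendered = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


print(to_ntriples(jats_to_triples(JATS)))
```

In this sketch the author is still a plain text value. Giving authors, institutions, and cited works their own URIs is what would allow them to be linked to other data sources.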

This would mean that every piece of information published by scholarly journals in Latin America could be linked to all data provided by all other LOD sources (Figure 3). If our systems of scholarly communication carried such semantic markup, we could query, extract, infer, and retrieve information in such a way that published knowledge per se could achieve visibility, discoverability, and internationalization. Thus, traditional circuits of scholarly communication, the ones legitimated by current research assessment strategies, could be left behind. Information could speak for itself, to the benefit of global science communication.
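The kind of query this would make possible can be sketched as follows. The Python example merges statements from a journal with statements from an external linked-data source and answers a two-step question by matching patterns containing variables, which is the core idea behind query languages such as SPARQL. It is a toy matcher rather than a real query engine, and all identifiers (`ex:`, `lod:`) and data are hypothetical.

```python
# Statements published by a journal (hypothetical identifiers).
JOURNAL = [
    ("ex:article1", "dct:creator", "ex:author/perez"),
    ("ex:article1", "dct:subject", "lod:topic/open-access"),
]

# Statements from an external linked-data source (also hypothetical).
EXTERNAL = [
    ("lod:topic/open-access", "rdfs:label", "Open access"),
    ("lod:article/xyz", "dct:subject", "lod:topic/open-access"),
]


def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it against its earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(graph, patterns):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            for triple in graph:
                result = match(pattern, triple, solution)
                if result is not None:
                    extended.append(result)
        solutions = extended
    return solutions


graph = JOURNAL + EXTERNAL
results = query(graph, [
    ("ex:article1", "dct:subject", "?topic"),  # what is article1 about?
    ("?other", "dct:subject", "?topic"),       # what else shares that topic?
])
related = {r["?other"] for r in results} - {"ex:article1"}
print(related)
```

Because both sources use the same identifier for the topic, the article from the external source is found without either dataset knowing about the other in advance.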

In concluding their chapter, Becerril-García and Aguado-López acknowledge that many will see their technological solution as overly optimistic. After all, most difficult problems have social, rather than technological, answers. Yet they affirm their belief in the potentially liberatory powers of information technologies.

Figure 3. Journal articles as part of the Linked Open Data Cloud (source: Becerril-García and Aguado-López, 2020).

Next part (chapter 5.5): The pasts, presents, and futures of SciELO.

Article source: This article is an edited summary of Chapter 20 [1] of the book Reassembling scholarly communications: Histories, infrastructures, and global politics of Open Access [2], which has been published by MIT Press under a CC BY 4.0 Creative Commons license.

Acknowledgements: This summary was drafted by Wordtune Read with further corrections and edits by Bruce Boyes.

Article license: This article is published under a CC BY 4.0 Creative Commons license.

References:

  1. Becerril-García, A., & Aguado-López, E. (2020). Toward Linked Open Data for Latin America. In Eve, M. P., & Gray, J. (Eds.) Reassembling scholarly communications: Histories, infrastructures, and global politics of Open Access. MIT Press.
  2. Eve, M. P., & Gray, J. (Eds.) (2020). Reassembling scholarly communications: Histories, infrastructures, and global politics of Open Access. MIT Press.

Bruce Boyes

Bruce Boyes (www.bruceboyes.info) is a knowledge management (KM), environmental management, and education professional with over 30 years of experience in Australia and China. His work has received high-level acclaim and been recognised through a number of significant awards. He is currently a PhD candidate in the Knowledge, Technology and Innovation Group at Wageningen University and Research, and holds a Master of Environmental Management with Distinction. He is also the editor, lead writer, and a director of the award-winning RealKM Magazine (www.realkm.com), and teaches in the Beijing Foreign Studies University (BFSU) Certified High-school Program (CHP).
