Data governance practices for using RAG in generative AI-powered information systems supporting knowledge management (KM)

This article is part of an ongoing series looking at AI in KM, and KM in AI.

Retrieval-augmented generation (RAG)1 improves large language model (LLM) outputs by retrieving external information before a response is generated. Unlike standalone LLMs, which rely on static training data, RAG pulls relevant text from databases, uploaded documents, or web sources. Because of this, RAG is increasingly used in the generative AI-powered information technology systems that support knowledge management (KM).
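The retrieve-then-generate flow described above can be illustrated with a minimal sketch. It is not taken from the study or from any particular RAG product: the function names, the sample documents, and the simple word-overlap similarity are all assumptions made for illustration, and a production system would typically use embeddings and a vector store instead. The sketch stops at building the augmented prompt, because the call to an LLM depends on the provider.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble the augmented prompt that would be sent to an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base entries, invented for this example.
DOCS = [
    "Leave requests must be submitted through the HR portal.",
    "The data retention policy requires deleting records after seven years.",
    "The cafeteria opens at eight in the morning.",
]

if __name__ == "__main__":
    print(build_prompt("How long is the data retention period?", DOCS, k=1))
```

The point of the sketch is that the quality of the answer depends on what sits in the document collection, which is why the governance of that collection matters.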

A recent RealKM Magazine article reported on research looking at how LLM outputs can be improved even further through GraphRAG2, which also incorporates knowledge graphs.

A new study3 published in the Proceedings of the 59th Hawaii International Conference on System Sciences examines another important dimension of RAG use: the data-related risks it introduces, including issues of quality, trust, and compliance. To address these risks, study authors Tilman Friedrich, Karl Akbari, and Daniel Fürstenau set out to identify which data governance practices contribute to the success of generative AI-powered information systems using RAG. They broadly define data governance as the specification of rights, responsibilities, and controls over data assets.

Because of the newness of enterprise RAG applications and the limited availability of research, Friedrich and colleagues adopt a qualitative multiple case study design, following the eight-step method for building theories from case study research described in Kathleen M. Eisenhardt’s seminal 1989 article4 in The Academy of Management Review.

Findings

The study findings are summarised in Figure 1 below. Friedrich and colleagues advise that their findings suggest a strong link between data governance practices and information system success across the three quality dimensions of system, knowledge, and service quality, via six mechanisms:

  1. Amplifying data discoverability during data curation.
  2. Formalising quality assurance for knowledge content.
  3. Defining metadata to enable additional source integration.
  4. Enabling data enrichment for optimised retrieval.
  5. Providing guidance for handling sensitive data.
  6. Enforcing and monitoring data protection at the system level.
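To make some of these mechanisms more concrete, the following minimal sketch shows one way that metadata, quality review status, and sensitivity labels could act as a gate before content enters a RAG index. It is an illustration only and does not reproduce anything from the study: the `Record` fields, the sensitivity labels, and the function names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical sensitivity labels, ordered from least to most restricted.
SENSITIVITY_ORDER = ["public", "internal", "confidential"]


@dataclass
class Record:
    """A knowledge item plus the governance metadata assumed in this sketch."""
    doc_id: str
    text: str
    owner: str = ""          # accountable data owner
    sensitivity: str = ""    # expected to be one of SENSITIVITY_ORDER
    reviewed: bool = False   # has the content passed quality review?


def governance_issues(record: Record, max_sensitivity: str = "internal") -> list[str]:
    """Return the reasons a record should be kept out of the retrieval index."""
    issues = []
    if not record.owner:
        issues.append("missing owner")
    if record.sensitivity not in SENSITIVITY_ORDER:
        issues.append("missing or unknown sensitivity label")
    elif SENSITIVITY_ORDER.index(record.sensitivity) > SENSITIVITY_ORDER.index(max_sensitivity):
        issues.append("too sensitive for this index")
    if not record.reviewed:
        issues.append("not quality-reviewed")
    return issues


def split_for_indexing(records: list[Record]) -> tuple[list[Record], dict[str, list[str]]]:
    """Separate records that may be indexed from those that need attention."""
    accepted = []
    rejected = {}
    for record in records:
        issues = governance_issues(record)
        if issues:
            rejected[record.doc_id] = issues
        else:
            accepted.append(record)
    return accepted, rejected
```

Returning the reasons for rejection, rather than silently dropping records, gives content owners something to act on and provides a simple audit trail for monitoring.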

The adoption of these practices is driven by the interaction between dynamically evolving technology-induced requirements (particularly those stemming from the reliance on semi-structured data) and contextual data-related challenges. While the practices enhance knowledge, system, and service quality, they also introduce strategic and operational trade-offs, notably because they are resource-intensive and because data protection constrains data integration.

Given the importance of high-quality proprietary unstructured data for generative AI value creation, governance becomes a strategic enabler, enhancing content richness, retrieval performance, and compliance.

Friedrich and colleagues also report that two findings diverge from expectations. Firstly, no practices were observed to align or structure data at source for improved pipeline integration, indicating that organizations prefer post-hoc transformation through technical architectures rather than proactive data design. Secondly, while procedural mechanisms (e.g., checks, classifications) dominate, relational mechanisms such as training or awareness-building are almost entirely absent. This under-representation suggests that relational mechanisms remain overlooked or under-reported in enterprise generative AI initiatives.

Figure 1. Overview of data governance practices, their drivers, contribution to KMS success and trade-offs (source: Friedrich et al., 2026).

Header image source: Turing Commons, CC BY-SA 4.0.

References:

  1. Wikipedia, CC BY-SA 4.0.
  2. Naganawa, H., Hirata, E., & Yamada, A. (2025). Implementing a Knowledge Management System with GraphRAG: A Physical Internet Example. Electronics, 14(24), 4948.
  3. Friedrich, T., Akbari, K., & Fürstenau, D. (2026). Data Governance Practices for Generative AI Powered Organizational Knowledge Management Systems Using Retrieval Augmented Generation. Proceedings of the 59th Hawaii International Conference on System Sciences.
  4. Eisenhardt, K. M. (1989). Building Theories from Case Study Research. Academy of Management Review, 14(4), 532-550.

Bruce Boyes

Bruce Boyes is editor, lead writer, and a director of RealKM Magazine and winner of the International Knowledge Management Award 2025 (Individual Category). He is an experienced knowledge manager, environmental manager, project manager, communicator, and educator, and holds a Master of Environmental Management with Distinction and a Certificate of Technology (Electronics). His many career highlights include: establishing RealKM Magazine as an award-winning resource with more than 2,500 articles and 5 million reader views, leading the knowledge management (KM) community KM and Sustainable Development Goals (SDGs) initiative, using agile approaches to oversee the on time and under budget implementation of an award-winning $77.4 million recovery program for one of Australia's iconic river systems, leading a knowledge strategy process for Australia’s 56 natural resource management (NRM) regional organisations, pioneering collaborative learning and governance approaches to empower communities to sustainably manage landscapes and catchments in the face of complexity, being one of the first to join a new landmark aviation complexity initiative, initiating and teaching two new knowledge management subjects at Shanxi University in China, and writing numerous notable environmental strategies, reports, and other works.
