
Advances & challenges in foundation agents: Section 1.3.2 – Core concepts and notations in the agent loop
This article is Chapter 1, Section 1.3.2 of a series of articles featuring Liu and colleagues’ book Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems.
Liu and colleagues’ agent framework operates at three conceptual levels: society, environment, and agent. The agent itself is decomposed into three main subsystems: perception, cognition, and action. Within cognition, the key submodules are memory, world model, emotional state, goals, reward, learning, and reasoning (with “planning” and “decision-making” treated as special actions produced by reasoning). Attention is handled primarily within perception and cognition. Before the formal loop is presented, the symbols used are summarized in Table 1.2.
Table 1.2: Notation summary for the revised agent framework, highlighting separate learning and reasoning functions within the overall cognition process (source: Liu et al., 2025).
Using the notation in Table 1.2, Liu and colleagues’ proposed agent loop is presented below.
Figure 1.2 illustrates Liu and colleagues’ agent framework, showing the core concepts and the types of information and control flow among them. So far, a brain-inspired agent framework has been presented that integrates biological insights into a formal perception–cognition–action loop. By decomposing cognition into modules for memory, world modeling, emotion, goals, reward-based learning, and reasoning, it captures essential parallels with the human brain’s hierarchical, reward-driven processes. Critically, attention is included in the loop, enabling selective filtering of input based on internal states. Furthermore, planning and decision-making can be viewed as distinct internal (mental) actions that either refine internal representations or select external behaviors. The framework naturally extends classical agent architectures, providing a multi-level structure that integrates emotional and rational processes as well as robust, reward-driven learning across short and long timescales.
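The loop described above can be sketched in code. This is a minimal illustration, not the book’s formal notation: all names (`MentalState`, `perceive`, `reason`, the use of goals as an attention filter) are assumptions made here for exposition only. It shows attention-gated perception, cognition that updates memory and the world model, and the distinction between internal (mental) actions such as planning and external actions.

```python
from dataclasses import dataclass, field

@dataclass
class MentalState:
    """Illustrative internal state: the cognition submodules named in the text."""
    memory: list = field(default_factory=list)
    world_model: dict = field(default_factory=dict)
    emotion: str = "neutral"
    goals: list = field(default_factory=list)
    reward: float = 0.0

def perceive(observation, state):
    """Attention-gated perception: keep only the parts of the raw
    observation relevant to the current internal state (here, goals)."""
    return {k: v for k, v in observation.items() if k in state.goals}

def reason(percept, state):
    """Cognition step: update memory and world model, then choose an action.
    Planning is modeled as an internal (mental) action that refines
    internal representations rather than acting on the environment."""
    state.memory.append(percept)
    state.world_model.update(percept)
    if not percept:                    # nothing goal-relevant was perceived
        return ("mental", "plan")      # internal action: replan
    return ("external", f"pursue:{next(iter(percept))}")

def step(observation, state):
    """One pass of the perception–cognition–action loop."""
    percept = perceive(observation, state)
    return reason(percept, state)

state = MentalState(goals=["food"])
print(step({"food": 1, "noise": 7}, state))  # goal-relevant input -> external action
print(step({"noise": 3}, state))             # nothing relevant -> mental action
```

The key design point mirrored from the text is that mental and external actions share one action interface, so planning sits inside the same loop as behavior rather than in a separate controller.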

Society and social systems. In many real-world scenarios, agents do not merely interact with a static environment but operate within a broader society, comprising various social systems such as financial markets, legal frameworks, political institutions, educational networks, and cultural norms. These structures shape and constrain agents’ behaviors by defining rules, incentives, and shared resources. For example, a financial system dictates how economic transactions and resource allocations occur, while a political system provides governance mechanisms and regulatory constraints. Together, these social systems create a layered context in which agents must adaptively learn, reason, and act, both to satisfy their internal goals and to comply (or strategically engage) with external societal rules. In turn, the actions of these agents feed back into the social systems, potentially altering norms, policies, or resource distributions.
A formal definition of foundation agents. Building on these insights and the vision of robust, adaptive intelligence, Liu and colleagues’ concept of a foundation agent is now formally introduced. Unlike traditional agent definitions that focus primarily on immediate sensory-action loops, a foundation agent embodies sustained autonomy, adaptability, and purposeful behavior, emphasizing the integration of internal cognitive processes across diverse environments.
Unlike classical definitions, which often frame agents primarily in terms of simple perception–action loops (“perceive and act”1), Liu and colleagues’ notion of foundation agents emphasizes the depth and integration of internal cognitive processes. In contrast to prior work2 that defines “foundation agents” as generalist decision models emphasizing unified representations, policy interfaces, and interactive learning across tasks, Liu and colleagues’ conception highlights a deeper, brain-inspired integration, explicitly modeling mental states and goal-directed reasoning to mirror biological cognition more completely. Foundation agents in this framework not only perceive their environment and perform immediate actions but also possess an evolving, goal-oriented cognition: they continuously adapt their memory structures, world models, and emotional and reward states, and autonomously refine their strategies through reasoning. This internal cognitive richness allows foundation agents to autonomously decompose complex, abstract goals into actionable tasks, strategically explore their environments, and dynamically adjust their behavior and cognitive resources. The unified perception–cognition–action framework thus accommodates and explicitly models these sophisticated cognitive capabilities, treating internal (mental) actions on par with external (physical or digital) interactions and thereby supporting a broad range of embodiments, from physical robots to software-based or purely textual intelligent agents.
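The capability highlighted above, decomposing an abstract goal into actionable tasks, can be sketched as a simple recursive expansion. The goal names and the decomposition table here are invented for illustration; in a real foundation agent this mapping would come from reasoning over the world model, not a hand-written dictionary.

```python
# Hypothetical goal-to-subgoal mapping; leaf goals are directly actionable.
DECOMPOSITION = {
    "write report": ["gather sources", "draft outline", "write sections"],
    "draft outline": ["list headings"],
}

def decompose(goal):
    """Recursively expand an abstract goal into a flat list of leaf tasks."""
    subgoals = DECOMPOSITION.get(goal)
    if subgoals is None:  # no known decomposition: the goal is already actionable
        return [goal]
    tasks = []
    for sub in subgoals:
        tasks.extend(decompose(sub))
    return tasks

print(decompose("write report"))
```

This prints the actionable leaves in order: `['gather sources', 'list headings', 'write sections']`, with the intermediate goal “draft outline” expanded rather than executed.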
Next part: Section 1.3.3 – Biological inspirations.
Article source: Liu, B., Li, X., Zhang, J., Wang, J., He, T., Hong, S., … & Wu, C. (2025). Advances and challenges in foundation agents: From brain-inspired intelligence to evolutionary, collaborative, and safe systems. arXiv preprint arXiv:2504.01990. CC BY-NC-SA 4.0.
Header image: AI is Everywhere by Ariyana Ahmad & The Bigger Picture / Better Images of AI, CC BY 4.0.
References:
- Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs, NJ, 1st edition, 1995. ISBN 0-13-103805-2. ↩
- Xiaoqian Liu, Xingzhou Lou, Jianbin Jiao, and Junge Zhang. Position: Foundation Agents as the Paradigm Shift for Decision Making. arXiv preprint arXiv:2405.17009, 2024. ↩
