
Advances & challenges in foundation agents: Section 1.3.4 – Connections to existing theories
This article is Chapter 1, Section 1.3.4 of a series of articles featuring Liu and colleagues’ book Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems.
Liu and colleagues’ foundation-agent framework is not built from scratch; rather, it synthesizes and extends several influential theories from AI, cognitive science, and neuroscience. This section makes those connections explicit, highlighting both the similarities and the critical enhancements the framework introduces.
Classic Perception–Cognition–Action Cycle. The traditional AI perspective1 sees agents as engaging in a repeated loop: sensing the environment, thinking about it, and then acting accordingly. Liu and colleagues’ framework directly extends this basic cycle by incorporating richer cognitive machinery: explicit attentional control within perception (P), fine-grained internal states such as memory, emotion, and goals within cognition (C), and reward signals that evolve dynamically. This deeper granularity helps clarify how internal states guide perception and cognition, making it easier to understand and engineer adaptive agent behaviors.
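To make this concrete, here is a minimal Python sketch of such an extended perception–cognition–action loop, in which the internal mental state biases what is perceived. All class and function names are illustrative assumptions, not definitions from the book.

```python
# Minimal sketch of the extended perception-cognition-action loop.
# The mental-state fields loosely mirror the framework's components;
# the filtering and action rules are illustrative only.
from dataclasses import dataclass, field

@dataclass
class MentalState:
    memory: list = field(default_factory=list)  # M_t^mem
    emotion: float = 0.0                        # M_t^emo
    goal: str = "explore"                       # M_t^goal
    reward: float = 0.0                         # M_t^rew

def perceive(observation, state):
    # Attentional control: prior goals and emotions bias which
    # features of the raw observation are retained.
    return {k: v for k, v in observation.items()
            if state.goal in k or state.emotion > 0.5}

def reason(percept, state):
    # Cognition: update internal state, then select an action.
    state.memory.append(percept)
    return "act_on_" + state.goal

def step(observation, state):
    percept = perceive(observation, state)
    action = reason(percept, state)
    return action, state

state = MentalState()
action, state = step({"explore_path": 1, "noise": 0}, state)
```

Note how, unlike the bare sense-think-act loop, the same observation would yield a different percept under a different goal or emotional state.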
Minsky’s Society of Mind. Marvin Minsky famously proposed2 that intelligence emerges from interactions among numerous specialized internal agents, each performing simpler tasks yet collectively producing complex cognition. Liu and colleagues’ modular subcomponents, namely memory (M_t^mem), world model (M_t^wm), emotion (M_t^emo), goals (M_t^goal), and rewards (M_t^rew), echo this idea, representing a cooperative society of distinct yet interdependent cognitive modules. Moreover, recent work on language-based agent societies, such as the Mindstorms paradigm3, supports extending Minsky’s internal “societies” into externally interacting communities, mirroring the book’s emphasis on multi-agent and socially structured intelligence.
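The cooperation of simple specialized modules can be sketched as functions that each do one small job against a shared blackboard. The module names loosely mirror the framework's components, but the code is a hypothetical illustration, not the book's design.

```python
# A tiny "society" of simple modules cooperating via a shared blackboard,
# in the spirit of Minsky's Society of Mind. Each module performs one
# simple task; complex behavior emerges from their interaction.

def emotion_module(board):
    # Appraise the percept with a crude valence rule.
    board["valence"] = 1.0 if "reward" in board.get("percept", "") else -0.5

def memory_module(board):
    # Record what was perceived.
    board.setdefault("log", []).append(board.get("percept"))

def goal_module(board):
    # Turn the appraisal into a coarse plan.
    board["plan"] = "approach" if board.get("valence", 0) > 0 else "avoid"

SOCIETY = [emotion_module, memory_module, goal_module]

def run_society(percept):
    board = {"percept": percept}
    for module in SOCIETY:  # each "agent" does one simple job
        module(board)
    return board

result = run_society("reward-cue")
```

No single module "decides"; the plan on the blackboard is a product of the modules' combined, interdependent contributions.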
Buzsáki’s Inside-Out Perspective. The neuroscientist György Buzsáki argues4 that brains actively construct perceptions rather than passively registering them from the outside world. In Liu and colleagues’ model, perception is explicitly influenced by the prior mental state (M_{t-1}), including emotions, goals, and reward expectations. This active construction of perception means that such agents, like human brains, continuously refine their understanding of the environment based on internal states and past experiences, embodying the inside-out perspective in a clear computational form.
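One hedged way to sketch this inside-out view: the percept starts from the agent's own prediction and is merely nudged by the observation, rather than copied from it. The fixed trust weight is an assumption for illustration.

```python
# Inside-out perception sketch: the agent's prior prediction is the
# starting point; the observation only corrects it. The fixed "trust"
# weight is an illustrative assumption.
def inside_out_perceive(prediction, observation, trust=0.3):
    # Percept = prior expectation nudged toward the new evidence.
    return prediction + trust * (observation - prediction)

percept = inside_out_perceive(prediction=10.0, observation=14.0)
```

With trust well below 1, the resulting percept stays closer to the internal prediction than to the raw observation, which is the crux of the inside-out stance.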
Generalizing the POMDP Framework. Partially Observable Markov Decision Processes (POMDPs) have long provided a robust mathematical formulation for modeling agents under uncertainty. Classical POMDPs, however, rely on probabilistic transitions between environmental states and typically use externally defined scalar rewards. Liu and colleagues’ framework generalizes the POMDP structure in several ways:
- Flexible state transitions: unlike classical POMDPs, which are constrained to probabilistic transitions, the environment transition function (T) admits both deterministic and stochastic mappings without predefined limitations, increasing modeling versatility.
- Internalized reward: instead of relying on external scalar rewards, Liu and colleagues embed reward signals within the agent’s internal mental state (M_t^rew). This embedding allows rewards to evolve dynamically and interact with emotions, goals, and memory, reflecting a more realistic and nuanced motivational system.
- Expanded decision-making: traditional POMDPs use a straightforward value-maximization policy. By contrast, Liu and colleagues’ reasoning mechanism explicitly incorporates emotions, memories, and goals into decisions, accommodating richer behavioral strategies, including heuristic and socially influenced choices.
- Modular mental states: classical POMDPs collapse internal states into a single belief representation. Explicitly modeling separate cognitive modules (memory, emotion, and so on) enhances transparency and interpretability and aligns more closely with biological plausibility.
Thus, while Liu and colleagues’ framework includes the classical POMDP as a simplified special case, it notably broadens the scope of possible agent behaviors, providing richer modeling capabilities.
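The first two generalizations can be sketched in a few lines: a transition function that may be deterministic or stochastic, and a reward computed from, and stored in, the agent's own mental state. Function names and the goal-distance reward rule are illustrative assumptions, not the book's formal definitions.

```python
# Sketch of a generalized transition T that subsumes deterministic and
# stochastic dynamics, plus an internalized reward living in the mental
# state rather than being handed down by the environment.
import random

def deterministic_T(state, action):
    return state + action  # a fixed, non-probabilistic mapping

def stochastic_T(state, action, noise=1):
    # Classical POMDP-style probabilistic transition.
    return state + action + random.choice([-noise, 0, noise])

def internal_reward(mental_state, outcome):
    # Reward depends on the agent's own goal, not an external scalar,
    # and is written back into M_t^rew.
    mental_state["reward"] = -abs(outcome - mental_state["goal"])
    return mental_state["reward"]

m = {"goal": 5, "reward": 0.0}
r = internal_reward(m, deterministic_T(3, 2))  # outcome 5 matches the goal
```

Because the reward lives inside the mental state, other modules (emotion, memory) could read and reshape it between steps, which a fixed external reward function does not allow.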
Active Inference and the Bayesian Brain. Karl Friston’s active inference framework5 posits that intelligent agents continually update internal models to minimize discrepancies between expected and observed outcomes, reducing “surprise” or free energy. This predictive perspective resonates deeply with Liu and colleagues’ model: the world model (M_t^wm), alongside the goal and reward components, continually refines predictions about future environmental states, enabling the agent to anticipate and adapt proactively. Decision-making, planning, and action selection then explicitly aim to reduce surprise by aligning internal expectations with observed reality, mirroring the Bayesian-brain perspective in a structured computational form.
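A toy caricature of this updating process: a scalar world-model estimate is repeatedly nudged to reduce squared prediction error about a hidden environmental value. This is an illustrative sketch of surprise minimization, not Friston's actual free-energy formulation.

```python
# Toy surprise-minimization loop: the world-model estimate is updated
# by gradient descent on squared prediction error, so "surprise" about
# the hidden value shrinks over time. Learning rate is an assumption.
def surprise(estimate, observed):
    return (observed - estimate) ** 2

def update_world_model(estimate, observed, lr=0.2):
    # Gradient step on the squared-error surprise.
    return estimate - lr * 2 * (estimate - observed)

estimate, hidden = 0.0, 8.0
errors = []
for _ in range(5):
    errors.append(surprise(estimate, hidden))
    estimate = update_world_model(estimate, hidden)
```

Each pass through the loop brings the internal estimate closer to the hidden value, so the recorded surprise decreases monotonically, the signature behavior active inference predicts.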
Biological Plausibility and Computational Flexibility. Throughout these theoretical connections, Liu and colleagues’ framework prioritizes two guiding principles: biological plausibility and computational generality. While each submodule aligns clearly with neuroscientific analogues—memory with hippocampal-cortical interplay, emotion with limbic function, reasoning with prefrontal cortical circuits—these analogies inspire rather than rigidly constrain implementation. Liu and colleagues’ modules remain agnostic to specific computational realizations, easily accommodating neural networks, symbolic logic, probabilistic models, or hybrid methods. This openness preserves flexibility, enabling diverse implementations that remain faithful to core cognitive principles without artificial constraints.
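This implementation-agnosticism can be expressed by treating any callable with the right signature as a module, so a symbolic rule and a lookup table compose as easily as a neural network would. The interface below is a hypothetical sketch, not the book's API.

```python
# Implementation-agnostic modules: anything that maps a state dict to a
# state dict can serve as a cognitive module, whether it is a neural
# net, a symbolic rule, or a table. Names here are illustrative.
from typing import Callable

Module = Callable[[dict], dict]

def symbolic_emotion(state: dict) -> dict:
    # A symbolic rule standing in for a limbic-style appraisal.
    state["emotion"] = "calm" if state.get("threat", 0) == 0 else "alert"
    return state

def table_memory(state: dict) -> dict:
    # A plain list standing in for hippocampal-style storage.
    state.setdefault("log", []).append(state.get("emotion"))
    return state

def compose(*modules: Module) -> Module:
    # Chain heterogeneous modules into one agent core.
    def pipeline(state: dict) -> dict:
        for m in modules:
            state = m(state)
        return state
    return pipeline

agent_core = compose(symbolic_emotion, table_memory)
out = agent_core({"threat": 0})
```

Swapping `symbolic_emotion` for a learned model would not change the pipeline at all, which is exactly the flexibility the framework's module-level agnosticism is meant to preserve.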
By explicitly situating their framework among these influential theories, Liu and colleagues achieve clarity on how each aspect of their design contributes distinctively and why integrating these aspects yields a powerful, flexible, and biologically inspired agent architecture. These connections not only clarify theoretical positioning but also serve as guideposts for future enhancements, ensuring the approach remains both grounded in rigorous science and open to ongoing innovation.
Next part: Section 1.4 – Navigating this series.
Article source: Liu, B., Li, X., Zhang, J., Wang, J., He, T., Hong, S., … & Wu, C. (2025). Advances and challenges in foundation agents: From brain-inspired intelligence to evolutionary, collaborative, and safe systems. arXiv preprint arXiv:2504.01990. CC BY-NC-SA 4.0.
Header image: AI is Everywhere by Ariyana Ahmad & The Bigger Picture / Better Images of AI, CC BY 4.0.
References:
- Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs, NJ, 1st edition, 1995. ISBN 0-13-103805-2. ↩
- Marvin Minsky. Society of Mind. Simon and Schuster, 1988. ↩
- Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, et al. Mindstorms in Natural Language-Based Societies of Mind. arXiv preprint arXiv:2305.17066, 2023. ↩
- György Buzsáki. The Brain from Inside Out. Oxford University Press, USA, 2019. ↩
- Karl J Friston, Jean Daunizeau, James Kilner, and Stefan J Kiebel. Action and behavior: a free-energy formulation. Biological Cybernetics, 102:227–260, 2010. ↩




