AI Research Shifts Focus from LLMs to World Models for Embodied Intelligence


In 2025, research toward Artificial General Intelligence (AGI) has pivoted notably away from Large Language Models (LLMs) and toward 'world models.' The redirection stems from recognized constraints of LLMs: despite their advanced text generation, they often lack a grounded understanding of physical laws, causality, and three-dimensional spatial reasoning. World models are built to simulate and reason about the physical environment, laying the groundwork for AI systems to move from abstract knowledge processing to tangible, real-world interaction, a capability known as embodied intelligence.

Cognitive scientist Gary Marcus has consistently highlighted how fragile LLMs are in reliability and complex reasoning, advocating neuro-symbolic architectures that explicitly encode fundamental rules about the world. Marcus argues that deep learning alone is insufficient for robust generalization, especially with critical or unusual data, and that systems need to internalize conceptual models of the external world through abstraction. This view supports hybrid systems that merge neural pattern recognition with classical symbolic operations to achieve more resilient intelligence.

Key industry figures are actively shaping this consensus. Fei-Fei Li, who founded World Labs in 2024, introduced Marble, a commercial world model focused on spatial intelligence. Li argues that achieving AGI requires moving from perception to action, and she identifies modeling the three-dimensional world and its dynamics as a fundamental challenge that goes beyond language modeling. World Labs positions spatial intelligence as the next essential layer in AI development, one that moves beyond static data processing.

Major technology companies are also intensifying investment in this area. Google DeepMind is advancing its simulation capabilities with Genie 3, a general-purpose world model that generates diverse, interactive 3D environments from text prompts in real time at 720p resolution and 24 frames per second. Genie 3 incorporates a world memory that keeps the environment state consistent for several minutes, giving embodied agents a training environment in which to learn through simulated experience, a significant step toward AGI.

The shift away from an LLM-only path is further evidenced by high-profile industry moves. Turing Award laureate Yann LeCun left his role at Meta at the end of 2025 to launch Advanced Machine Intelligence (AMI), a startup dedicated exclusively to building world models. LeCun views the current LLM scaling approach as a potential dead end for embodied AI, favoring systems that can reason about cause and effect in the physical world. This collective movement toward embodied intelligence, enabled by sophisticated world models, signals clear industry demand for AI systems that can perceive, predict, and interact within complex, dynamic physical environments rather than merely predict text.


