Beijing Academy of Artificial Intelligence (BAAI) Releases Physis General World Foundation Model
2026-06-15 14:34

en.Wedoany.com Reported - On June 12, the 8th Beijing BAAI Conference was held in Beijing, where the Beijing Academy of Artificial Intelligence (BAAI) released its general world foundation model, Physis-v0.1. The model is designed to model the real physical world, emphasizing physical correctness, traceable action causality, long-term consistency, and broad generalization. It can be adapted to real-world application scenarios such as robotics, video generation, gaming, and industry, providing underlying support for embodied intelligence and industrial intelligent systems.

This release moves the world model into a more foundational position. Large language models excel at text understanding and reasoning, while multimodal models further connect images, speech, and video. However, robotics, industrial simulation, autonomous driving, intelligent manufacturing, and complex spatial tasks require not just "understanding the scene," but also comprehending how objects move, how actions produce results, and whether environmental changes conform to physical laws. Physis-v0.1 is positioned to extend models beyond digital content generation, toward predicting and interacting with the physical world.

The difficulty of world models lies in continuity. A video clip may look sharp, but if object motion defies gravity, collisions are handled inconsistently, or action causality cannot be traced, the clip is of little use to real-world robots and industrial scenarios. For embodied intelligence, robots need to judge the consequences of actions before executing tasks; for industrial applications, the model needs to reason consistently across production processes, equipment operation, material changes, and spatial constraints. Physis-v0.1 emphasizes long-term consistency and traceable causality, indicating that the model's goal is not merely to generate more realistic images, but to support verifiable, executable, and transferable physical reasoning.

BAAI released other achievements at the same time, including the multimodal neuroscience large model Brainμ1.0, as well as progress in agents, foundational software and hardware ecosystems, and open-source ecosystem development. This gives the "Physis" system a clearer multi-directional layout: one direction targets the physical world and embodied intelligence, another connects to brain science and life sciences, and a third supports application expansion through agents and software/hardware ecosystems. For institutions conducting basic AI research, this combination signals a shift in research focus from the capabilities of a single model to the systematic construction of models, data, agents, platforms, and open-source ecosystems.

Physis-v0.1 is particularly important for the robotics industry. Humanoid robots and mobile manipulation robots can already perform tasks such as grasping, transporting, inspecting, and pharmacy order picking. However, the real constraint on large-scale deployment is long-term stability and generalization in complex environments. Robots cannot rely solely on preset programs that operate in fixed scenarios; they need to understand the relationships between tabletops, shelves, tools, doors, liquids, deformable objects, and human actions. If a general world model can provide more reliable physical prediction, it will help robots reduce trial-and-error costs in training, simulation, task planning, and anomaly recovery.

In the industrial sector, world models could also become a new foundation for digital twins and intelligent manufacturing. Traditional industrial simulation typically relies on explicit rules, parameters, and engineering models, which suit specific equipment or processes but transfer poorly across scenarios. If a general world foundation model can learn common patterns across different physical systems, it could be used in the future for production line planning, equipment state inference, process parameter optimization, industrial video understanding, and safety risk prediction. For manufacturing enterprises, the value of such a model is not just "generating images," but helping systems predict in advance the consequences of a specific action, process, or environmental change.

Gaming and video generation scenarios provide another validation path. High-quality content generation requires realistic images, but more advanced generation requires coherent physical processes, such as consistent character motion, object collisions, lighting changes, fluid flow, mechanical movement, and spatial relationships. If Physis-v0.1 can maintain physical plausibility in these scenarios, it could drive content production from short clip generation towards the generation of interactive, controllable, and continuously evolving virtual worlds. This would also allow the world model to serve both the digital content industry and embodied intelligence training systems.

This release also has implications for the open-source ecosystem. BAAI has long invested in building large models, datasets, evaluation systems, and open-source technology foundations. If the general world model is linked with open-source data, evaluation platforms, agent frameworks, and foundational software/hardware ecosystems, it will help lower the barrier for universities, research institutions, and industry teams to enter world model research. For China's AI industry, foundational model capabilities require breakthroughs from leading teams, but they also require an open ecosystem that lets more developers validate applications in robotics, industry, scientific research, and content generation.

Subsequent milestones mainly depend on three aspects: first, whether BAAI will open-source the Physis-v0.1 model, data, interfaces, or evaluation tools, allowing external teams to verify its physical consistency and generalization capabilities; second, whether pilot applications in scenarios such as robotics, industry, gaming, and video generation can produce reproducible cases; third, whether neuroscience models such as Brainμ1.0 can form deeper connections with the world model system, moving AI from language and visual intelligence into interdisciplinary research on the physical world and life sciences. If these directions continue to advance, this release will be more than a model update and could become an important milestone in building China's underlying technology system for general world models and embodied intelligence.

This article is compiled by Wedoany.