en.Wedoany.com Reported - On April 23rd, Xiaomi announced the official launch of the public beta for its AI model series, Xiaomi MiMo-V2.5. The series comprises four AI models: MiMo-V2.5, MiMo-V2.5-Pro, MiMo-V2.5-TTS Series, and MiMo-V2.5-ASR, covering three major modalities: text dialogue, speech synthesis, and speech recognition. Among them, the flagship model MiMo-V2.5-Pro and the general-purpose model MiMo-V2.5 will be open-sourced globally. Developers can access APIs through the MiMo Open Platform or experience the models in MiMo Studio.
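For developers curious what calling such an API typically involves, the sketch below assembles a chat-completion request body in the OpenAI-compatible style that most model open platforms expose. This is a hypothetical illustration only: the endpoint URL, the model name string, and the field names are assumptions, not taken from MiMo Open Platform documentation.

```python
import json

# Hypothetical sketch of a chat-completion request payload.
# The URL below is a placeholder, not the real MiMo endpoint.
MIMO_API_URL = "https://api.example.com/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "MiMo-V2.5") -> dict:
    """Assemble an OpenAI-style request body (field names assumed)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = build_chat_request("Summarize the MiMo-V2.5 release in one sentence.")
print(json.dumps(payload, indent=2))
```

In practice the payload would be POSTed to the platform's endpoint with an API key header; consult the MiMo Open Platform documentation for the actual schema and authentication scheme.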
Positioned as "born for long and difficult Agent tasks," MiMo-V2.5-Pro supports a context length of 1 million tokens. It rivals top-tier global AI models such as Claude Opus 4.6 and GPT-5.4 on dimensions such as general agent capabilities, complex software engineering, and long-horizon tasks. Internal testing shows that, when paired with its runtime framework, the model can reliably complete long-cycle tasks involving nearly a thousand tool calls in a single session, with significant improvements in complex instruction parsing and cross-step logical consistency. In one practical case, Peking University's "Compiler Principles" course project required students to implement a complete SysY compiler in Rust, a project that typically takes undergraduates several weeks. MiMo-V2.5-Pro completed the entire development in just 4.3 hours through 672 tool calls, achieving a perfect score of 233 on the hidden test set. In another case, given only the brief instruction "build a video editor web application," the model delivered a functional application with features including a multi-track timeline, clip trimming, crossfade, and audio mixing. The 8,192-line codebase was completed autonomously through 1,868 tool calls over 11.5 hours.
MiMo-V2.5 focuses on native full-modal Agent capabilities: it fully supports image, audio, and video inputs, offers faster inference, and likewise supports a 1-million-token context. In mainstream Agent evaluations such as ClawEval, its overall performance surpasses the previous-generation MiMo-V2-Pro, with API costs reduced by approximately 50%. Its capabilities in cross-modal reasoning, video understanding, and chart analysis approach or even exceed those of top-tier closed-source AI models in evaluations such as VideoMME, CharXiv, and MMMU-Pro. For speech synthesis, the V2.5-TTS Series is built on a self-developed Audio Tokenizer and a multi-codebook speech-text joint modeling architecture; it has undergone large-scale pre-training on hundreds of millions of hours of speech data plus multi-dimensional reinforcement learning, achieving highly controllable, multi-granularity command of speech style.
Token efficiency optimization is another core highlight of this upgrade. At the same ClawEval benchmark score, MiMo-V2.5-Pro consumes 42% fewer tokens than Kimi K2.6, while MiMo-V2.5 consumes 50% fewer than Muse Spark. The Token Plan pricing scheme has been adjusted accordingly: the previous billing rule of 1 Token = 4 Credits is retired, and the Credit multiplier distinction between the 256K and 1M context windows is removed. New auto-renewing monthly and annual subscriptions have been added. From 00:00 to 08:00 Beijing Time daily, Credit consumption for all AI models receives an additional 20% discount off the base rate. The Token Plan offers four monthly tiers: the Lite package at 39 RMB/month provides 60 million Credits, while the highest tier at 659 RMB/month provides 1.6 billion Credits.
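A quick back-of-the-envelope comparison of the two tiers named in the article (Lite at 39 RMB for 60 million Credits, and the top tier at 659 RMB for 1.6 billion Credits), plus the stated 20% off-peak discount. The per-Credit figures are derived arithmetic, not official per-unit pricing, and the middle two tiers are not specified in the article.

```python
# Tiers as reported: (monthly price in RMB, Credits included).
tiers = {
    "Lite": (39, 60_000_000),
    "Top":  (659, 1_600_000_000),
}

for name, (rmb, credits) in tiers.items():
    # Normalize to RMB per 1 billion Credits for comparison.
    per_billion = rmb / (credits / 1e9)
    print(f"{name}: {per_billion:.2f} RMB per 1B Credits")

def offpeak_credits(base_credits: float) -> float:
    """Credits burned during the 00:00-08:00 window: 20% off the base rate."""
    return base_credits * 0.8

print(offpeak_credits(1000))  # -> 800.0
```

By this arithmetic the top tier works out to roughly 412 RMB per billion Credits versus 650 RMB for Lite, so the larger plan is meaningfully cheaper per Credit before any off-peak discount.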
Luo Fuli, Head of Xiaomi's Large Model division, previously stated at the 2026 Zhongguancun Forum that, given the rapid pace of technological iteration, the team's first full-stack AI product designed for the Agent era was released more like a "silent ambush." She emphasized that open-sourcing an AI model must meet the condition of being "stable enough and worthy of open-sourcing" to ensure a good developer experience. She also noted that the maturity of frameworks like OpenClaw has raised the ceiling for open models, allowing some to approach closed-source AI capabilities, and has made Agent capability a key metric for assessing the practicality of large models. Xiaomi first open-sourced its inference-optimized AI model, Xiaomi MiMo, in April 2025, released and open-sourced the upgraded MiMo-V2-Flash in December of the same year, and launched the flagship foundational AI model for the Agent era, MiMo-V2-Pro, in March 2026. The V2.5 series release continues Xiaomi's technical accumulation and product cadence in open-source Agent large models.
This article is compiled by Wedoany. Citations must credit "Wedoany" as the source. If there is any infringement or other issue, please notify us promptly and we will amend or delete the content accordingly. Email: news@wedoany.com