Moore Threads S5000 Completes Day-0 Adaptation for MiniMax M2.7 Large Model, China's Full-Function GPU Demonstrates Rapid Response
2026-04-13 10:01

en.Wedoany.com Reported - On April 12, 2026, Moore Threads' MTT S5000, a full-function GPU for integrated AI training and inference, completed Day-0 adaptation for the new-generation large model MiniMax M2.7, once again demonstrating the rapid response and stable support that China's full-function GPUs can provide for cutting-edge AI large models. M2.7 is the industry's first large model with deep self-evolution capabilities. Leveraging 80GB of video memory, 1.6TB/s of memory bandwidth, and a prefill-decode (PD) separation architecture combined with efficient KV Cache management, the S5000 supports the stable execution of MiniMax M2.7's long-duration, multi-step tasks.

MiniMax officially released MiniMax M2.7 on March 18, 2026. Through an Agent Harness (intelligent agent execution framework) system, the model participates deeply in its own training and optimization, taking on 30% to 50% of the workload in certain R&D scenarios and delivering an approximately 30% performance improvement on internal evaluation datasets. In internal testing, the model can continuously execute over 100 rounds of "analysis-improvement-verification" cycles, autonomously adjusting sampling parameters and optimizing workflow strategies. M2.7 can also independently construct complex Agent Harness systems and complete highly complex productivity tasks, drawing on capabilities such as Agent Teams, complex Skills, and Tool Search Tool. On April 12, MiniMax announced the global open-sourcing of M2.7; on the first day of release, chip manufacturers from China and abroad, including Huawei Ascend, Moore Threads, MetaX, Kunlunxin, and NVIDIA, completed model integration and inference adaptation.

The MTT S5000 is a full-function GPU intelligent computing card designed by Moore Threads specifically for large model training, inference, and high-performance computing. Built on "Pinghu," the fourth-generation MUSA architecture, a single card delivers up to 1000 TFLOPS of AI compute and is equipped with 80GB of video memory, 1.6TB/s of memory bandwidth, and 784GB/s of inter-card interconnect bandwidth. It supports the full range of precisions from FP8 to FP64, making it one of China's earliest training GPUs to natively support FP8. Relying on the MUSA full-stack software platform, the MTT S5000 natively supports mainstream frameworks such as PyTorch, Megatron-LM, vLLM, and SGLang. Moore Threads has repeatedly achieved Day-0 adaptation for large models, previously covering Chinese models such as Zhipu GLM-5, Qwen QwQ-32B, and MiniMax M2.5.
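Since the S5000 natively supports vLLM, serving the newly open-sourced model would presumably follow vLLM's standard offline-inference pattern. The sketch below uses vLLM's public Python API; the model repository identifier and the availability of a MUSA-enabled vLLM build for the S5000 are assumptions for illustration, not details confirmed in the announcement.

```python
# Minimal offline-inference sketch using vLLM's public Python API.
# Assumptions: a MUSA-enabled vLLM build is installed for the MTT S5000,
# and the open-sourced weights are published under a repo id such as
# "MiniMaxAI/MiniMax-M2.7" (hypothetical identifier).
from vllm import LLM, SamplingParams

llm = LLM(
    model="MiniMaxAI/MiniMax-M2.7",  # hypothetical model repo id
    tensor_parallel_size=8,          # shard across eight 80GB S5000 cards
    dtype="auto",                    # use the precision of the checkpoint
)

params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=512)
outputs = llm.generate(["Summarize the idea of an Agent Harness."], params)
print(outputs[0].outputs[0].text)
```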

The self-evolution capability of MiniMax M2.7 places higher demands on deployment hardware. During self-iteration, the model must continuously execute long-duration, multi-step tasks, requiring the hardware system to balance video memory capacity, bandwidth throughput, and computational efficiency. The Moore Threads S5000's PD separation architecture schedules the prefill and decode stages of inference independently, while its efficient KV Cache management mechanism is optimized for long-sequence processing, ensuring stable performance during continuous operation. This adaptation marks a further stage in the collaborative evolution of Chinese GPUs and cutting-edge large models.
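To see why KV Cache management matters here: each "analysis-improvement-verification" round appends tokens to one growing context, so keeping the cache alive across rounds means only the new tokens need prefilling, while decode reuses the cached history. The sketch below is a generic illustration of this pattern under that assumption; it is not Moore Threads' implementation, and all class and function names are invented for exposition.

```python
# Conceptual sketch of prefill/decode (PD) separation with a KV cache
# persisted across multi-step agent rounds. Names are illustrative only.
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Stands in for cached attention keys/values of all tokens seen so far."""
    tokens: list[int] = field(default_factory=list)

    def extend(self, new_tokens: list[int]) -> None:
        self.tokens.extend(new_tokens)


def prefill(cache: KVCache, prompt_tokens: list[int]) -> None:
    # Compute-bound stage: processes only the *new* prompt tokens in parallel.
    # Under PD separation this would run on cards dedicated to prefill.
    cache.extend(prompt_tokens)


def decode(cache: KVCache, max_new: int) -> list[int]:
    # Bandwidth-bound stage: emits tokens one at a time, reading the whole
    # cache each step. Under PD separation this runs on decode-dedicated cards.
    generated = []
    for step in range(max_new):
        next_token = hash((len(cache.tokens), step)) % 50_000  # toy "model"
        cache.extend([next_token])
        generated.append(next_token)
    return generated


cache = KVCache()
for round_id in range(3):                # stands in for 100+ agent rounds
    new_prompt = [round_id] * 8          # this round's analysis/tool output
    prefill(cache, new_prompt)           # only the new tokens are prefilled
    decode(cache, max_new=16)            # earlier rounds are reused via cache
print(f"cache holds {len(cache.tokens)} tokens after 3 rounds")
```

Because decode repeatedly streams the entire cache, long agent runs become memory-bandwidth-bound, which is where the article's cited 80GB capacity and 1.6TB/s bandwidth figures become the relevant constraints.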

This article was compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there are any infringement or other issues, please notify us promptly, and we will modify or delete the content accordingly. Email: news@wedoany.com