China's Unisound Releases U2 Large Model, Scores 87.9 on GPQA Diamond
2026-06-08 13:37

en.Wedoany.com Reported - Unisound released its next-generation general-purpose large language model, U2, on June 8, 2026. U2 is positioned as a native Agent large model for individuals, developers, and organizations. Its technical philosophy emphasizes high intelligence density and high token value rather than the blind accumulation of parameters or output length.

Unlike traditional general language models that favor single-turn Q&A, U2 emphasizes the continuous execution of real-world tasks. In scenarios such as complex office work, software engineering, deep research, and multi-tool collaboration, U2 can autonomously decompose and advance workflows exceeding 100 steps. It connects requirement understanding, task planning, environment interaction, tool usage, process correction, and result verification into a single execution loop, shifting the model's role from providing answers to completing tasks.
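The execution loop described above can be illustrated with a minimal sketch. Unisound has not published U2's implementation, so everything here is an assumption made for illustration: the names `run_task`, `plan`, `act`, and `verify` are hypothetical, and the retry policy simply stands in for "process correction".

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class StepResult:
    step: str
    output: str
    ok: bool


@dataclass
class TaskRun:
    goal: str
    history: list[StepResult] = field(default_factory=list)


def run_task(
    goal: str,
    plan: Callable[[str], list[str]],    # hypothetical planner: goal -> ordered steps
    act: Callable[[str, int], str],      # hypothetical tool call: (step, attempt) -> output
    verify: Callable[[str, str], bool],  # hypothetical checker: (step, output) -> passed?
    max_steps: int = 100,
    max_retries: int = 2,
) -> TaskRun:
    """Plan once, then execute each step with verification and bounded retries."""
    run = TaskRun(goal)
    for step in plan(goal)[:max_steps]:
        for attempt in range(max_retries + 1):
            output = act(step, attempt)
            ok = verify(step, output)
            run.history.append(StepResult(step, output, ok))
            if ok:
                break  # step verified, move on to the next one
        else:
            break  # every retry failed, so stop the whole task
    return run
```

In this sketch a failed verification triggers a retry of the same step, and the task stops only once the retries for a step are exhausted. A real agent would more likely re-plan at that point, which is omitted here for brevity.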


In evaluations, U2 scored 87.9 on GPQA Diamond, which measures knowledge and complex reasoning, surpassing GLM-5.1, Hy3 preview, DeepSeek-V4-Flash (High), and MiniMax M2.7. On SWE-Bench Verified, which assesses software engineering capability, it scored 75, placing it among the top mainstream models. On Claw-Eval (pass@3), an end-to-end evaluation of autonomous Agent execution, it scored 76.9, again surpassing Hy3 preview, DeepSeek-V4-Flash (High), and MiniMax M2.7. On GDPval, which evaluates the delivery of office and knowledge work such as document analysis, report writing, spreadsheet processing, chart generation, and slide creation, it scored 72.9.
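For readers unfamiliar with the pass@3 notation, the sketch below shows the widely used unbiased pass@k estimator: the probability that at least one of k attempts, drawn from n recorded attempts of which c succeeded, solves the task. The article does not say how Claw-Eval computes its pass@3 figure, so treating it this way is an assumption, and the helper `benchmark_score` is a hypothetical name.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Chance that at least one of k attempts sampled from n (c correct) passes."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # too few failures to fill k slots, so a success is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks, as a 0-100 score. Each item is (n, c) for one task."""
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With exactly three attempts per task (n = k = 3), the estimator reduces to "the task counts as solved if any of the three attempts succeeded".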

Unisound stated that U2's design does not rely on excelling at a single isolated capability; instead, it aims for consistent performance across reasoning, programming, Agent execution, and office delivery. To achieve its task execution goals, U2 introduces a hybrid thinking mechanism that switches dynamically between explicit chain-of-thought and latent space reasoning within the same reasoning process, based on task stage, complexity, and uncertainty. In the initial phase of a task, the model performs path searching, task decomposition, and candidate solution generation in latent space. During critical judgment or constraint handling phases, it switches to explicit reasoning for logical calibration and result convergence. Bounded latent deduction and entropy-aware switching allow the model to adjust its thinking mode according to the uncertainty in the reasoning process.
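The idea of entropy-aware switching can be made concrete with a toy sketch. It is not Unisound's mechanism, whose details are not public. It assumes that uncertainty is measured as the Shannon entropy of a next-step probability distribution, that high entropy triggers explicit reasoning, and that "bounded" means a cap on consecutive latent steps. The names `choose_mode`, `threshold`, and `max_latent_steps` are hypothetical.

```python
import math


def shannon_entropy(probs: list[float]) -> float:
    """Shannon entropy in nats of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def choose_mode(
    probs: list[float],
    latent_steps_used: int,
    threshold: float = 1.0,     # assumed entropy cutoff, in nats
    max_latent_steps: int = 8,  # assumed bound on consecutive latent steps
) -> str:
    """Return "latent" or "explicit" for the next reasoning step."""
    if latent_steps_used >= max_latent_steps:
        return "explicit"  # latent budget exhausted, force explicit reasoning
    if shannon_entropy(probs) > threshold:
        return "explicit"  # model is uncertain, so reason step by step
    return "latent"
```

For example, a uniform distribution over four options has entropy ln 4 (about 1.39 nats) and would select explicit reasoning under the default threshold, while a sharply peaked distribution would stay in latent mode until the step budget runs out.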

For its knowledge foundation, U2 applies high-knowledge-density data filtering and purification techniques to remove duplicate and low-quality data, combined with sparse knowledge encoding and knowledge distillation architectures to compress redundant model parameters. At the task execution layer, it introduces the Agent-Harness collaborative training paradigm, which integrates model capability enhancement and toolchain optimization into the same training loop. High-quality execution trajectories generated in real tasks feed back into the model, strengthening its capabilities in planning, tool usage, process correction, and result acceptance.
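As a rough illustration of what filtering duplicate and low-quality data can mean in practice, the sketch below drops exact duplicates (after normalizing case and whitespace) and applies two crude quality heuristics. The article does not describe Unisound's pipeline, so the function `filter_corpus` and its thresholds `min_words` and `min_unique_ratio` are assumptions chosen only for illustration.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def filter_corpus(
    docs: list[str],
    min_words: int = 5,             # assumed minimum length
    min_unique_ratio: float = 0.5,  # assumed minimum share of distinct words
) -> list[str]:
    """Keep the first copy of each document that passes the quality heuristics."""
    seen: set[str] = set()
    kept: list[str] = []
    for doc in docs:
        norm = normalize(doc)
        words = norm.split()
        if len(words) < min_words:
            continue  # too short to carry much information
        if len(set(words)) / len(words) < min_unique_ratio:
            continue  # highly repetitive text
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate after normalization
        seen.add(digest)
        kept.append(doc)
    return kept
```

Production pipelines typically add near-duplicate detection and learned quality classifiers on top of checks like these; this sketch covers only the simplest case.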

U2 focuses on three core capabilities: reasoning, programming, and Agent execution. Reasoning emphasizes low-bias execution and long-term logical stability. Programming targets end-to-end engineering delivery, including generating code from natural language requirements and understanding multi-file project structures. Agent capabilities aim to improve multi-tool collaboration, long-flow orchestration, and environment interaction. Together, these capabilities form a closed loop for task delivery, from requirement understanding through planning and execution to collaborative verification.

In terms of application scenarios, U2 covers full-spectrum interface design, including responsive web development, mobile web application construction, and design system implementation; deep research and analysis, including industry and policy research, data visualization analysis, and multi-format document delivery; immersive interactive game development, such as classic casual games and physics simulators; and efficient office automation, including business report analysis, industry landscape analysis, and periodic business reviews. U2 is now available on Unisound's Token Hub to individuals, developers, and organizations.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com