Nvidia Launches Nemotron 3 Ultra, Open Model Targets Cost Reduction for Long-Task Agents

2026-06-02 09:15


en.Wedoany.com Reported - On June 1, Nvidia CEO Jensen Huang unveiled a new AI model, Nemotron 3 Ultra, at events tied to COMPUTEX 2026 in Taipei, Taiwan, China. The model is designed for enterprise agent workflows, with a focus on coding, research, enterprise process automation, and long-duration task execution.

The launch of Nemotron 3 Ultra continues Nvidia's expansion from an AI chip supplier into a provider of "computing platform + model + development tools." According to public information, Nemotron 3 Ultra is a Mixture of Experts model with 550 billion parameters, designed for long-task agents, and it achieves higher inference speed and lower operating costs in complex agent tasks. For enterprise customers, the cost pressure of agent applications comes not only from single queries but also from continuously calling tools, reading enterprise data, executing multi-step plans, repeatedly verifying results, and reasoning over long contexts. If a model cannot remain stable and efficient over long tasks, enterprises will struggle to move agents from internal pilots into production systems. By emphasizing inference speed, cost, and long-task capability in Nemotron 3 Ultra, Nvidia is responding to the new demands of enterprise AI as it shifts from "being able to generate content" to "being able to execute processes."
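To make the cost argument concrete, here is a minimal sketch of how input-token volume can build up over a multi-step agent task. Everything in it is an assumption for illustration: the function name, the token counts, and the simplification that each step re-reads the entire accumulated context with no caching. None of these figures come from Nvidia.

```python
def agent_task_input_tokens(steps: int, base_context: int, tokens_added_per_step: int) -> int:
    """Total input tokens processed over one task.

    Hypothetical model: every step re-reads the whole context built up so far
    (no prompt caching), then appends its tool output and reasoning to it.
    """
    total = 0
    context = base_context
    for _ in range(steps):
        total += context                   # the model reads the full context at this step
        context += tokens_added_per_step   # results of this step are appended for the next one
    return total


# Illustrative numbers only: a one-shot query versus a 40-step agent task.
single_query = agent_task_input_tokens(steps=1, base_context=2_000, tokens_added_per_step=1_500)
long_task = agent_task_input_tokens(steps=40, base_context=2_000, tokens_added_per_step=1_500)

print(single_query)              # 2000
print(long_task)                 # 1250000
print(long_task / single_query)  # 625.0
```

Under these assumed numbers, a 40-step task processes 625 times the input tokens of a single query, because the context grows with every step and is read again each time. Real deployments may reduce this with caching or context pruning, but the sketch shows why inference speed and per-token cost weigh more heavily on long-running agents than on one-off prompts.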

The model belongs to the Nemotron 3 open model family. Nvidia has introduced the family at different scales, Nano, Super, and Ultra, which target lightweight deployment, high-throughput inference, and complex agent tasks, respectively.

Technically, Nemotron 3 Ultra continues Nvidia's combined strategy built around open models, NVIDIA NIM, NeMo, CUDA-X, and the enterprise software ecosystem. When deploying agents, enterprises typically need models that combine reasoning, code generation, tool calling, process planning, result verification, and security control, and that also fit into private clouds, on-premises data centers, industry software, and enterprise permission systems. Nvidia's advantage lies not only in the model itself but also in its GPUs, inference services, software libraries, and developer ecosystem, which together can form a unified delivery path. If Nemotron 3 Ultra integrates with existing AI infrastructure, it will help enterprises embed agent applications into processes such as cybersecurity, operational decision-making, R&D collaboration, customer service, IT automation, and data analysis, while reducing the engineering cost of adapting different models and inference frameworks separately.

The launch also fits Nvidia's broader push into AI PCs, physical AI, and enterprise agents. Around the same time, Huang also presented progress on PC chips, agent toolkits, and robot-related models, indicating that Nvidia is extending AI capabilities from data centers to broader settings such as personal devices, enterprise desktops, robots, and autonomous driving. Nemotron 3 Ultra fills in the enterprise agent and open model layer, forming part of Nvidia's infrastructure for the next phase of AI applications alongside its chips, inference platforms, and development tools. The open questions going forward are the model's degree of openness, its actual inference costs, its stability on long tasks, the speed of enterprise software integration, and whether developers are willing to build specialized agent applications around the Nemotron ecosystem.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com

