Japan's SoftBank to Launch AI Data Center GPU Cloud in October

2026-05-26 16:52

en.Wedoany.com reported: On May 25, Japan's SoftBank announced that it will launch its "AI Data Center GPU Cloud" service in October 2026. The service runs on the "Infrinia AI Cloud OS" software stack and is part of SoftBank's new cloud business. It targets workloads such as AI model development, inference, and data processing, and provides integrated AI computing infrastructure and software that can be used securely within Japan.

The service moves AI computing beyond simple GPU rental toward a combined delivery of "computing infrastructure + AI data center software stack." When deploying large models and industry-specific AI applications, enterprises often need more than just GPUs; they also require multi-tenant resource management, container orchestration, inference APIs, storage, networking, security, and operational automation. At its core, SoftBank's GPU cloud uses Infrinia AI Cloud OS to integrate underlying GPU computing power, Kubernetes environments, and model inference services, reducing the complexity enterprises face in building their own AI development and runtime environments.

The service will draw on AI computing infrastructure in SoftBank's data centers in Japan, including GPU acceleration platforms such as the NVIDIA GB200 NVL72. SoftBank stated that customers can run a range of AI workloads on this platform, from model training and inference to data processing, while keeping data management and operations within Japan. For clients in finance, manufacturing, telecommunications, and public services, as well as other large enterprises, a localized AI cloud helps balance computing power access, data security, low latency, and business continuity.

Infrinia AI Cloud OS is the key software foundation of this release. The software stack supports Kubernetes-as-a-Service for multi-tenant environments and Inference-as-a-Service for large language model inference APIs. By automating the deployment and operation of inference infrastructure, the stack lets enterprises build model inference environments faster, without having to assemble the underlying clusters, containers, service orchestration, and resource scheduling from scratch. SoftBank indicated that this approach helps reduce total cost of ownership and operational burden while improving the delivery efficiency of GPU cloud services.

Competition in AI data centers is shifting from hardware procurement to system operational efficiency. The NVIDIA GB200 NVL72 represents a new generation of high-performance AI computing platforms, but whether its value can be realized depends on whether the cloud platform can stably manage large-scale GPU resources, handle multi-tenant isolation, support mixed training and inference workloads, and scale rapidly as business needs change. SoftBank's bundling of its GPU cloud with Kubernetes, inference services, and unified operational capabilities indicates that AI infrastructure service providers are competing around "usable computing power, manageable computing power, and deliverable computing power."

SoftBank is also placing this service within its "Telco AI Cloud" roadmap. The company plans to leverage its own telecommunications infrastructure to combine the AI Data Center GPU Cloud with AI-RAN, building a sovereign, distributed AI infrastructure that offers low latency and high reliability. For telecom operators, future AI infrastructure may integrate more deeply with communication networks, edge nodes, data centers, and radio access networks, with cloud training, edge inference, and intelligent network scheduling becoming different links within the same system.

Milestones to watch include feedback from beta usage, preparations for the official October launch, enterprise customer onboarding, the operational performance of the NVIDIA GB200 NVL72 clusters, and the integration of this service with the AI-RAN and Telco AI Cloud roadmap. What can be confirmed at this stage is that SoftBank has announced it will launch the AI Data Center GPU Cloud in October 2026 and will provide a beta version for internal group use starting May 25. Publicly available information does not disclose customer lists, pricing, GPU cluster scale, specific data center locations, or contract amounts, so the announcement should not be read as confirming commercial revenue or large-scale customer orders.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com
