HPE Launches 256-GPU Turnkey AI Factory
2026-06-24 11:17

en.Wedoany.com Reported - At the HPE Discover conference in Las Vegas, HPE significantly expanded its AI platform. Previously supporting up to 64 GPUs, the platform now scales to 256 GPUs. Customers can start with smaller configurations and expand performance by adding racks later. In addition to ProLiant servers equipped with Nvidia accelerators, the system integrates storage, Data Fabric, and software for models, AI applications, and agents, with Morpheus handling control and OpsRamp managing monitoring.

Installation and integration services are included in the quote. HPE offers the entire environment at a fixed total price, aiming to spare enterprises from piecing together an AI factory out of individual hardware, software, and other components. HPE's Private Cloud AI integrates Nvidia AI Enterprise, curated models and development tools, as well as the Nvidia Agent Toolkit, Nemotron models, NemoClaw, and OpenShell. Through these tools, agents can be registered, deployed, and assigned access rules. On the compute side, the product is supplemented by systems equipped with Nvidia RTX Pro 6000 Blackwell Server Edition GPUs.

Additionally, HPE released the ProLiant DL394 Gen12, powered by Nvidia's Arm-based Vera CPU. This CPU handles the memory-intensive and control-intensive parts of agent applications and works closely with Nvidia GPUs. With this addition, HPE's Private Cloud AI is built primarily around Nvidia's hardware and software stack. The tight coupling reduces integration effort but also limits flexibility in choosing accelerators and runtime environments.

The Alletra Storage MP X10000 plays a central role in the new AI platform. It provides file and object storage on a unified architecture and is directly integrated into Private Cloud AI. HPE also uses it as extended storage for performance-critical KV caches. Language models store information about processed text, context, and intermediate results in the KV cache. When processing further requests, the model reuses this cached context rather than recalculating it each time.

This is particularly important for long prompts, large volumes of documents, and multiple parallel agents. The longer the context and the more concurrent requests, the faster storage requirements grow. If old context information is cleared, the model must recalculate for subsequent requests, increasing latency, energy consumption, and costs. In agent environments, the issue is more pronounced because agents do not just respond once but repeatedly verify, plan, retrieve data, and prepare actions.
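How quickly these requirements grow can be estimated with the standard size formula for a transformer KV cache: two tensors (keys and values) per layer, per KV head, per token. The sketch below is only an illustration. The model dimensions are assumed values typical of a 70B-class model with grouped-query attention; they are not taken from HPE's announcement and do not describe the Nemotron model used in its test.

```python
# Rough KV cache sizing. All model dimensions below are illustrative assumptions.

def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int, bytes_per_value: int) -> int:
    """Bytes of KV cache needed for one token: keys plus values across all layers."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_bytes(tokens: int, concurrent_requests: int, per_token: int) -> int:
    """Total KV cache size, growing linearly with context length and concurrency."""
    return tokens * concurrent_requests * per_token


GIB = 2**30

# Assumed 70B-class configuration: 80 layers, 8 KV heads, head dimension 128, 16-bit values.
per_token = kv_bytes_per_token(layers=80, kv_heads=8, head_dim=128, bytes_per_value=2)

one_request = kv_cache_bytes(tokens=32_768, concurrent_requests=1, per_token=per_token)
sixteen_requests = kv_cache_bytes(tokens=32_768, concurrent_requests=16, per_token=per_token)

print(f"per token:           {per_token / 1024:.0f} KiB")
print(f"1 x 32k context:     {one_request / GIB:.0f} GiB")
print(f"16 x 32k contexts:   {sixteen_requests / GIB:.0f} GiB")
```

Under these assumptions a single 32k-token context already occupies about 10 GiB, and sixteen such contexts in parallel need about 160 GiB, which is more than a single accelerator typically provides alongside the model weights.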

Therefore, HPE offloads part of the KV cache to the X10000 via Remote Direct Memory Access (RDMA). In this process, data moves directly between storage and memory, bypassing several processing layers of the operating system. This lets the storage system act as an extension of GPU memory and become part of the inference path. According to HPE, in its test configuration using Nvidia H200 GPUs and the Nemotron 70B model, the time to first token was reduced to one-twentieth, while throughput increased by a factor of 17.
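The general idea of offloading instead of discarding can be sketched as a two-tier cache. This is a conceptual toy, not HPE's implementation: the class `TieredKVCache`, its tiers, and the `compute` callback are hypothetical names, plain Python dictionaries stand in for GPU memory and external storage, and no RDMA transfer is involved.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier cache: evicted entries move to a slower tier instead of being dropped."""

    def __init__(self, fast_capacity: int):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # stands in for GPU memory (least recently used first)
        self.slow = {}             # stands in for external storage (hypothetical tier)
        self.recomputes = 0        # how often context had to be recalculated

    def _put(self, prefix_id, kv_blocks):
        self.fast[prefix_id] = kv_blocks
        self.fast.move_to_end(prefix_id)
        # Offload the least recently used entries once the fast tier is full.
        while len(self.fast) > self.fast_capacity:
            old_id, old_blocks = self.fast.popitem(last=False)
            self.slow[old_id] = old_blocks

    def get(self, prefix_id, compute):
        """Return cached KV data, recomputing only if neither tier holds it."""
        if prefix_id in self.fast:
            self.fast.move_to_end(prefix_id)
            return self.fast[prefix_id]
        if prefix_id in self.slow:
            kv_blocks = self.slow.pop(prefix_id)  # reload from the slow tier
        else:
            self.recomputes += 1                  # expensive path: recalculate the context
            kv_blocks = compute(prefix_id)
        self._put(prefix_id, kv_blocks)
        return kv_blocks


cache = TieredKVCache(fast_capacity=2)
for prefix in ["doc-a", "doc-b", "doc-c"]:
    cache.get(prefix, compute=lambda p: f"kv({p})")

# "doc-a" was pushed out of the fast tier, but it is reloaded rather than recomputed.
cache.get("doc-a", compute=lambda p: f"kv({p})")
print(cache.recomputes)
```

The reload path only pays off when fetching from the slower tier is cheaper than recalculating the context, which is the case the low-overhead transfer path described above is meant to ensure.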

The new Data Fabric 8.2 can capture and catalog distributed data resources. A global catalog shows what information exists and where it is located. Metadata, identity, and access policies determine which applications or agents can access specific resources. Data Fabric is also available as a pre-configured appliance on ProLiant servers. Within the overall technology stack, the X10000 handles fast data access, while Data Fabric makes data resources discoverable and available under controlled access.

However, technical data organization alone does not make data suitable for AI. For training and agents, data must first be classified, cleaned, described, and assigned permissions. Despite automation tools, part of this process still requires manual effort. For example, business units must explain the meaning, timeliness, and purpose of certain data.

To operate the integrated AI environment, HPE relies on Morpheus, OpsRamp, and GreenLake Intelligence. Morpheus provisions compute, storage, and runtime resources and orchestrates private cloud infrastructure. OpsRamp collects telemetry data and monitors dependencies between applications, models, and underlying infrastructure. HPE is now tying these operational functions more closely to AI-driven automation. Morpheus Central is intended to provide a single view of multiple installations across data centers, regions, and edge sites.

This is important for AI environments because models, data, and inference services often do not run in a single location. OpsRamp not only collects faults but also correlates them and identifies root causes in the infrastructure. HPE has extended this layer with Copilot features and an MCP interface. Morpheus Copilot can create blueprints and automation based on natural language instructions. OpsRamp Copilot is intended to analyze events and support remediation measures. The MCP server provides a standardized interface through which agents can access management and automation functions. GreenLake Intelligence integrates these capabilities into a unified control plane.

HPE has supplemented the technology stack with controls for AI agents. These agents have their own identities and run in isolated environments. Policies dictate which data, interfaces, and tools they can use. For critical operations, human approval can be required. Zerto provides an additional fallback layer; the software records changes and, if necessary, restores affected systems to a previous state. However, it cannot determine whether a decision is technically erroneous or non-compliant with regulations.
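The combination of per-agent identity, tool policies, and optional human approval can be sketched as a simple authorization check. This is a minimal illustration of the pattern, not HPE's policy engine: `AgentPolicy`, `authorize`, and the tool names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy record bound to one agent identity."""
    agent_id: str
    allowed_tools: frozenset
    needs_approval: frozenset = frozenset()  # critical tools that require a human sign-off


def authorize(policy: AgentPolicy, tool: str, approved_by: Optional[str] = None) -> str:
    """Return 'deny', 'pending_approval', or 'allow' for a requested tool call."""
    if tool not in policy.allowed_tools:
        return "deny"
    if tool in policy.needs_approval and approved_by is None:
        return "pending_approval"
    return "allow"


policy = AgentPolicy(
    agent_id="invoice-agent",
    allowed_tools=frozenset({"read_catalog", "restart_service"}),
    needs_approval=frozenset({"restart_service"}),
)

print(authorize(policy, "read_catalog"))                          # allow
print(authorize(policy, "restart_service"))                       # pending_approval
print(authorize(policy, "restart_service", approved_by="alice"))  # allow
print(authorize(policy, "delete_volume"))                         # deny
```

A check of this kind enforces who may do what; it says nothing about whether an allowed action was the right decision.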

For HPE, governance primarily refers to technical access control and policy enforcement. Business-side validation of models, bias detection, regulatory classification, and the assignment of accountability remain tasks outside the platform. This is a weak point in many private AI strategies. While owning infrastructure increases control over data and operations, it does not replace governance. IBM and Red Hat recently noted that many enterprises do not fully understand their dependencies on AI vendors, models, and infrastructure. Private clouds can make these dependencies more transparent but cannot eliminate them.

AI factory products on the market differ significantly. For example, Dell uses disaggregated infrastructure where compute and storage can scale independently. In contrast, HPE bundles hardware, data platform, and operations software more tightly into a defined overall system. This shifts integration work from the customer to the manufacturer but reduces flexibility. In particular, it increases dependence on Nvidia, as HPE integrates not only the partner's GPUs but also its CPUs, models, runtime environments, and agent tools. The advantage is well-coordinated components; the disadvantage is that replacing a single component runs into architectural limits sooner.

HPE's positioning of its AI product as a private cloud aligns with current trends. As AI moves from pilot projects to production operations, the economics of infrastructure are also changing. Public cloud services remain attractive for testing, flexible workloads, and rapid access to new models; for persistent inference, agent workflows, and sensitive data, cost control and data access become priorities. Deloitte sees an economic tipping point under sustained high AI loads: once cloud costs reach a significant share of the cost of a comparable owned system, private cloud options may be cheaper. Forrester also expects enterprises to adopt more private AI clouds due to rising AI costs, data lock-in, and operational risks. HPE's Private Cloud AI is not positioned as an alternative to public cloud but as an operational platform for AI workloads that must be closer to data and processes. Notably, the full technology stack is not yet complete; some announced features and integrations will only become available in the coming quarters.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com