On June 10 local time, Amazon Web Services (AWS) announced the launch of its Amazon EC2 M9g and M9gd instances, powered by Graviton5, its fifth-generation in-house Arm processor, which was unveiled last December. The industry widely regards this as more than a routine hardware iteration: it is a key move by AWS to prepare for the coming "Agentic AI" era.
3nm Process, 192 Cores
With the explosion of Agentic AI, AI workloads are undergoing a fundamental shift: from simple "text-based Q&A" to "autonomous action"—real-time reasoning, code generation, multi-step task orchestration, and cross-system tool invocation. These tasks impose unprecedented demands for high concurrency and low latency on the Central Processing Unit (CPU) responsible for logic control and scheduling. The AWS Graviton5, designed with large cores, large caches, and high memory bandwidth, is specifically built to meet the needs of "Agentic AI."
The Graviton5 is built on TSMC's 3nm process technology, which packs more transistors into the same power budget and so achieves higher circuit density and energy efficiency.
The Neoverse V3 core in the Graviton5 was co-defined by Arm and AWS Annapurna Labs. Its L1 cache (64 KB) and L2 cache (2 MB) are not the headline. The L3 cache is: it has grown fivefold to 192 MB, keeping far more hot data close to the cores. Branch prediction has also been significantly improved, delivering up to 30% higher performance on complex real-world code such as databases, a gain that small-loop benchmarks would not capture.
In terms of core count, the Graviton5 has jumped from 96 cores in the Graviton4 to 192 cores, a 100% increase. More importantly, AWS has abandoned the previous monolithic single-die architecture in favor of a 4-chiplet design. The 192 cores are evenly distributed across four independent chiplets, each containing 48 cores and integrating its own DRAM memory controllers and PCIe 6.0 I/O controllers.
This design offers two major advantages. First, data no longer has to travel across the entire chip to reach memory or I/O devices, which significantly reduces latency. Second, a custom interconnect links the four chiplets with up to 420 GB/s of bandwidth, keeping the whole processor working in concert.
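The locality argument can be made concrete with a small Python sketch. The chiplet and core counts come from the article; the contiguous core numbering and the helper names `chiplet_of` and `is_local_access` are hypothetical, since the real core-ID layout is not described here.

```python
# Figures from the article: four chiplets with 48 cores each.
CHIPLETS = 4
CORES_PER_CHIPLET = 48
TOTAL_CORES = CHIPLETS * CORES_PER_CHIPLET  # 192


def chiplet_of(core_id: int) -> int:
    """Return the chiplet index for a core.

    ASSUMPTION: cores are numbered contiguously per chiplet
    (0-47 on chiplet 0, 48-95 on chiplet 1, and so on).
    """
    if not 0 <= core_id < TOTAL_CORES:
        raise ValueError(f"core_id must be in [0, {TOTAL_CORES})")
    return core_id // CORES_PER_CHIPLET


def is_local_access(core_id: int, controller_chiplet: int) -> bool:
    """True when a core and a memory or I/O controller share a chiplet,
    i.e. the request does not need to cross the inter-chiplet link."""
    return chiplet_of(core_id) == controller_chiplet


if __name__ == "__main__":
    print(TOTAL_CORES)              # 192
    print(is_local_access(100, 2))  # True: core 100 sits on chiplet 2
    print(is_local_access(100, 0))  # False: request crosses chiplets
```

Under this simplified model, a scheduler that keeps a thread and its memory on the same chiplet avoids the cross-chiplet hop entirely.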
More critically, the Graviton5 becomes the first cloud processor to support DDR5-8800 memory and PCIe Gen 6. AWS emphasizes that, through close collaboration with DRAM manufacturers, the Graviton5 offers the fastest memory speed among all current cloud processors. For memory bandwidth-sensitive applications (such as large databases and real-time analytics), this means a significant easing of bottlenecks.
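To put the DDR5-8800 figure in perspective, the theoretical peak bandwidth of a memory channel is the transfer rate multiplied by the bus width. The sketch below assumes a standard 64-bit DDR5 channel. The article does not state how many channels the Graviton5 has, so the channel count is left as a parameter, and the DDR5-5600 comparison point is chosen only for illustration.

```python
def ddr_peak_bandwidth_gbs(transfer_rate_mts: int,
                           channels: int = 1,
                           bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    transfer_rate_mts: speed grade in mega-transfers per second
                       (8800 for DDR5-8800).
    channels:          number of memory channels (an ASSUMPTION here,
                       not a figure from the article).
    bus_width_bits:    data width of one channel; 64 bits is standard.
    """
    if transfer_rate_mts <= 0 or channels <= 0 or bus_width_bits <= 0:
        raise ValueError("all arguments must be positive")
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer = MB/s; divide by 1000 for GB/s.
    return transfer_rate_mts * bytes_per_transfer * channels / 1000


if __name__ == "__main__":
    fast = ddr_peak_bandwidth_gbs(8800)  # 70.4 GB/s per channel
    slow = ddr_peak_bandwidth_gbs(5600)  # 44.8 GB/s per channel
    print(fast, slow)
    print(f"speedup per channel: {fast / slow:.2f}x")
```

These are upper bounds. Sustained bandwidth in real workloads is lower and depends on access patterns and controller efficiency.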
Additionally, the Graviton5 adopts a lidless design (removing the CPU metal heat spreader to allow the bare die to directly contact the cooling device), reducing cooling fan power consumption by 33%.
AWS previously stated in an announcement that the Graviton5 is "our most powerful and energy efficient custom-designed chip yet." While this is limited to AWS's self-developed chips, given AWS's market position in Arm server chips, this statement carries considerable reference value.
Cross-Scenario Dominance of the M9g Instance
The hardware improvements of the Graviton5 ultimately translate into instance performance gains. As the first instance family to ship with the Graviton5, the M9g delivers convincing results across multiple dimensions.
According to official AWS data, compared with the previous-generation M8g instance based on the Graviton4, the M9g instance offers:
General-purpose computing: 25% higher performance.
Web applications: 35% higher performance.
Machine learning inference: 35% higher performance.
Databases: 30% higher performance.
During a multi-month preview period, several industry-leading customers validated these numbers in real production environments:
ClickHouse: Achieved a 36% performance improvement with zero code changes.
Honeycomb: In a 6-month A/B test on production observability workloads, throughput per core increased by 36%.
HubSpot: After migrating its MySQL database to M9g, query latency dropped by up to 60%.
Meta: Has committed to deploying tens of millions of Graviton cores for its Agentic AI projects, becoming one of the largest Graviton customers globally.
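The official M8g-to-M9g uplift figures can be turned into a rough capacity estimate. The sketch below assumes that throughput scales linearly with the quoted percentage and that instance sizes are otherwise identical. Both are simplifications, and the names `UPLIFT_PERCENT` and `instances_needed` are illustrative only.

```python
# Official AWS uplift figures quoted in the article (M9g vs. M8g).
UPLIFT_PERCENT = {
    "general": 25,
    "web": 35,
    "ml_inference": 35,
    "database": 30,
}


def instances_needed(baseline_instances: int, uplift_percent: int) -> int:
    """Estimate how many new-generation instances carry the same load.

    ASSUMPTION: throughput scales linearly with the quoted uplift, so the
    fleet shrinks by a factor of 1 / (1 + uplift). The result is rounded
    up because a fraction of an instance cannot be provisioned.
    """
    if baseline_instances < 0 or uplift_percent < 0:
        raise ValueError("arguments must be non-negative")
    numerator = baseline_instances * 100
    denominator = 100 + uplift_percent
    return -(-numerator // denominator)  # integer ceiling division


if __name__ == "__main__":
    for workload, pct in UPLIFT_PERCENT.items():
        print(workload, instances_needed(100, pct))
```

Note that a 25% performance gain shrinks a 100-instance fleet to 80 instances, a 20% reduction rather than 25%, because the saving is 1 - 1/1.25.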
For workloads requiring local high-speed storage, AWS also launched the M9gd instance, which offers up to 11.4 TB of NVMe SSD storage with 30% higher IOPS than the previous generation. On the networking side, maximum instance network bandwidth rises to 100 Gbps and EBS bandwidth to 72 Gbps. The instances also support Instance Bandwidth Configuration (IBC), which lets users shift up to 25% of bandwidth between the VPC network and EBS storage to suit different I/O-sensitive tasks.
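How an IBC-style shift might affect the two bandwidth pools can be sketched as follows. The article does not specify what the 25% is measured against, so this model assumes the favored side gains a fraction of its own baseline and the other side gives up the same absolute amount. The function name `shift_bandwidth` is hypothetical and does not correspond to any AWS API.

```python
def shift_bandwidth(vpc_gbps: float, ebs_gbps: float,
                    favor: str, fraction: float = 0.25) -> tuple[float, float]:
    """Return (vpc, ebs) bandwidth after favoring one side.

    ASSUMPTION: the favored side gains `fraction` of its own baseline and
    the other side gives up the same absolute amount. This is one reading
    of "up to 25%"; the exact semantics are not given in the article.
    """
    if favor not in ("vpc", "ebs"):
        raise ValueError("favor must be 'vpc' or 'ebs'")
    if not 0.0 <= fraction <= 0.25:
        raise ValueError("fraction must be between 0 and 0.25")
    if favor == "ebs":
        delta = ebs_gbps * fraction
        return vpc_gbps - delta, ebs_gbps + delta
    delta = vpc_gbps * fraction
    return vpc_gbps + delta, ebs_gbps - delta


if __name__ == "__main__":
    # Maximum figures quoted in the article: 100 Gbps VPC, 72 Gbps EBS.
    print(shift_bandwidth(100.0, 72.0, favor="ebs"))  # (82.0, 90.0)
    print(shift_bandwidth(100.0, 72.0, favor="vpc"))  # (125.0, 47.0)
```

In this model a storage-heavy database would favor EBS, while a network-bound proxy tier would favor the VPC side. The total across both pools stays constant.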
First Integration of the Nitro Isolation Engine
Beyond performance, the Graviton5 is the first chip to ship with a new milestone in AWS's security architecture: the Nitro Isolation Engine.
Traditional virtualization isolation relies on a series of software and hardware checks and tests, which theoretically may have undiscovered vulnerabilities. The Nitro Isolation Engine uses formal verification technology, a method that uses mathematical logic to prove that hardware or software behavior fully meets expectations (rather than just passing specific test cases). As a dedicated component, this engine strictly controls access to all virtual machine memory, CPU register states, and I/O devices through a minimal set of APIs.
This makes AWS Nitro the first formally verified cloud hypervisor. It is no longer "we believe it is secure" but "mathematically proven to be secure." For financial, government, and security-sensitive workloads, this provides an unprecedented, mathematical level of isolation assurance.
Launch and Availability
Currently, M9g and M9gd instances are officially available in the AWS US East (Northern Virginia, Ohio), US West (Oregon), and Europe (Frankfurt) regions. Customers can purchase them via On-Demand, Reserved Instances, or Savings Plans. To help customers migrate smoothly, AWS also offers the Graviton Quick Start Guide, a Cost Savings Dashboard, and the AI-driven code conversion service AWS Transform, which can automatically migrate Java applications from x86 architecture to Graviton instances.
According to AWS data, more than 120,000 customers currently use Graviton processors, which power more than 350 instance types covering everything from web applications, microservices, and containers to Electronic Design Automation (EDA), gaming, and video encoding. The chip business, with annual revenue exceeding $20 billion and triple-digit growth, shows that AWS's in-house chip strategy is not just a technical exploration but has become a core profit and differentiation engine for its cloud business.
The launch of the Graviton5 marks another decisive step for AWS on the path of self-developed cloud chips. It is no longer content with catching up to x86 architecture in cost-performance ratio but, through forward-looking optimization for Agentic AI workloads, aims to define the computing foundation for the next generation of cloud computing.
