NVIDIA and AWS Collaborate to Achieve 10x Vector Index Acceleration

2026-06-26 09:52

en.Wedoany.com Reported - As of June 25, 2026, NVIDIA and Amazon Web Services (AWS) have collaborated to address key constraints in building large-scale AI systems: low-latency inference, fast vector search, GPU cost-effectiveness, and infrastructure scaling. Through Amazon OpenSearch and Amazon EC2, NVIDIA AI infrastructure gives enterprises more practical paths for deploying AI in production at scale.

The EC2 G7 instance, powered by NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs, expands the compute layer for AI, graphics, video, and data analytics workloads. The NVIDIA cuVS library accelerates the retrieval layer by making GPU-accelerated vector indexing the default option in OpenSearch Serverless. Additionally, AWS has achieved NVIDIA Exemplar Cloud status on NVIDIA GB300, giving customers confidence that training workloads will run at optimized performance.

The Amazon EC2 G7 instance brings NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs to AWS for AI inference, graphics, spatial computing, and GPU-accelerated data analytics. This new instance type is designed for production workloads and aims to deliver performance without requiring customers to take on the operational overhead of managing GPU platforms. Compared to G6 instances, G7 offers up to 4.6x the AI inference performance and up to 2.1x the graphics performance. When using the NVIDIA cuDF library for Apache Spark workloads, this instance enables faster GPU-accelerated data analytics on Amazon EMR. The G7 instance supports up to 8 GPUs with a total of 256GB of GPU memory, 700 Gbps of EFA network connectivity, and up to 7.6TB of local NVMe SSD storage. It is offered in 1, 2, 4, and 8 GPU configurations, with bare metal instances upcoming, allowing customers to scale infrastructure based on workload requirements.
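
As a rough illustration of how the GPU memory figure scales across configurations, the short Python sketch below divides the stated 256GB maximum across the stated 8 GPUs. It assumes that memory is split evenly across GPUs and that every configuration uses the same GPU; the article states neither, and the function and constant names are hypothetical.

```python
# Figures taken from the article: up to 8 GPUs, 256GB total GPU memory,
# offered in 1, 2, 4, and 8 GPU configurations.
TOTAL_GPU_MEMORY_GB = 256
MAX_GPUS = 8
GPU_COUNTS = (1, 2, 4, 8)


def memory_by_configuration(total_gb=TOTAL_GPU_MEMORY_GB,
                            max_gpus=MAX_GPUS,
                            counts=GPU_COUNTS):
    """Estimate total GPU memory (GB) for each GPU count.

    Assumption (not stated in the article): memory is divided evenly
    across GPUs, and each configuration uses the same GPU model.
    """
    per_gpu_gb = total_gb / max_gpus
    return {n: n * per_gpu_gb for n in counts}


if __name__ == "__main__":
    for gpus, mem_gb in memory_by_configuration().items():
        print(f"{gpus} GPU(s): about {mem_gb:.0f} GB of GPU memory")
```

Under that even-split assumption, each GPU would carry 32GB, which gives a quick way to check whether a model or dataset fits in a smaller configuration before scaling up.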

The next-generation Amazon OpenSearch Serverless supports agentic AI and dynamic workloads without requiring customers to manage infrastructure. The service uses GPU-accelerated vector indexing powered by NVIDIA cuVS as the default compute choice for all vector collections. For teams building retrieval-augmented generation, semantic search, recommendation systems, and agentic AI applications, this change turns GPU-driven vector search from a specialized optimization project into a standard AWS capability. For customers, this means up to 10x faster vector index building compared to CPU-only configurations, at one-quarter the cost, enabling billion-scale vector databases to be built in under an hour.
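
To make these headline figures concrete, the Python sketch below works through the arithmetic they imply. It assumes that total build cost equals hourly price multiplied by build time, which the article does not state, and it treats the "up to" figures as if they were achieved exactly. The function names and the CPU baseline in the usage note are hypothetical and chosen only for illustration.

```python
def implied_hourly_price_ratio(speedup: float, cost_ratio: float) -> float:
    """GPU-to-CPU hourly price ratio implied by a speedup and a total-cost ratio.

    Assumption: total_cost = hourly_price * build_time. Then
    cost_ratio = price_ratio / speedup, so price_ratio = cost_ratio * speedup.
    """
    if speedup <= 0 or cost_ratio <= 0:
        raise ValueError("speedup and cost_ratio must be positive")
    return cost_ratio * speedup


def gpu_build_estimate(cpu_hours: float, cpu_cost: float,
                       speedup: float = 10.0, cost_ratio: float = 0.25):
    """Scale a CPU-only baseline by the article's best-case figures.

    Returns (estimated GPU build hours, estimated GPU build cost).
    """
    return cpu_hours / speedup, cpu_cost * cost_ratio


def min_vectors_per_second(n_vectors: float, seconds: float) -> float:
    """Minimum sustained indexing rate needed to finish within the time limit."""
    return n_vectors / seconds


if __name__ == "__main__":
    # 10x faster at one-quarter the cost.
    print(implied_hourly_price_ratio(10.0, 0.25))
    # Hypothetical baseline: an 8-hour CPU build costing 100 units.
    print(gpu_build_estimate(8.0, 100.0))
    # One billion vectors in one hour.
    print(round(min_vectors_per_second(1e9, 3600)))
```

Read this way, a 10x speedup at one-quarter the cost is consistent with GPU capacity priced at roughly 2.5x the CPU hourly rate, and indexing a billion vectors in under an hour requires sustaining more than about 278,000 vectors per second. Real workloads may see smaller gains, since the article's figures are upper bounds.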

AWS has achieved NVIDIA Exemplar Cloud status on NVIDIA GB300 for training workloads. This means AWS meets the stringent performance thresholds that NVIDIA sets when benchmarking AI workloads against its reference architecture. The achievement stems from deep collaborative engineering between the AWS and NVIDIA teams. Through the NVIDIA Exemplar Cloud program, developers and AI leaders can be confident they are using consistent, high-performance cloud infrastructure for large-scale training. The program is intended to help teams evaluate cloud providers with greater confidence, improve total cost of ownership, and move AI projects from planning to production more efficiently.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com