SanDisk Launches HBF High-Bandwidth Flash to Address AI Memory Bottlenecks
2026-06-15 14:48

en.Wedoany.com Reported - SanDisk has unveiled High Bandwidth Flash (HBF) technology, designed to address memory bottlenecks in AI inference workloads.


AI computing is driving a transformation in data center memory architecture. Currently, approximately one in seven data centers (about 14%) is capable of handling AI workloads, a share expected to approach 70% by 2030. AI is also migrating from hyperscale data centers to enterprise data centers and edge networks, with edge AI applications projected to generate nearly $66.5 billion in revenue by the end of this decade. The vast content repositories behind these workloads are placing pressure on traditional storage architectures and exposing inherent architectural weaknesses.

DRAM and dedicated High Bandwidth Memory (HBM), both widely used in data centers, are increasingly struggling to keep pace with large AI models in density, storage capacity, and scalability. Hyperscale computing providers face rising costs from DRAM and HBM production, design complexity, and energy consumption. The challenge is even more pronounced in enterprise data centers and edge AI applications, where physical space is limited and higher memory costs and power consumption are difficult to accommodate. AI inference, currently the dominant workload, has data management requirements distinct from those of AI training: it must store large and continuously growing AI models. Memory solutions based on HBM and DRAM have proven inadequate in both capacity and cost scalability.

DRAM capacity scaling has largely stalled, while the capacity demands of AI inference continue to grow. DRAM's main advantages, low latency and fast random access, matter less for AI inference, because inference access patterns are deterministic and latency can be hidden through techniques such as data prefetching. These shortcomings affect a DRAM industry worth $120 billion at a time when hyperscale providers are spending massively on AI infrastructure (potentially reaching $6.7 trillion by the end of this decade).
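The prefetching argument can be illustrated with a minimal sketch. It assumes a model whose layers are always read in the same known order, so the fetch for an upcoming layer can be issued while the current layer is still being computed. The names `run_with_prefetch`, `fetch`, `compute`, and `depth` are hypothetical and chosen for illustration only; they do not describe SanDisk's implementation or any particular inference runtime.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
import time


def run_with_prefetch(layer_ids, fetch, compute, depth=2):
    """Process layers in a fixed order, keeping up to `depth` fetches in flight.

    Because the order is known in advance, fetches for later layers overlap
    with compute on earlier ones, hiding part of the fetch latency.
    """
    results = []
    in_flight = deque()          # futures for issued fetches, in layer order
    layers = iter(layer_ids)
    with ThreadPoolExecutor(max_workers=depth) as pool:
        # Prime the pipeline with the first `depth` fetches.
        for layer in layers:
            in_flight.append(pool.submit(fetch, layer))
            if len(in_flight) == depth:
                break
        while in_flight:
            future = in_flight.popleft()
            # Issue the next fetch before blocking on the current one.
            upcoming = next(layers, None)
            if upcoming is not None:
                in_flight.append(pool.submit(fetch, upcoming))
            weights = future.result()   # waits only if the fetch is not done yet
            results.append(compute(weights))
    return results


if __name__ == "__main__":
    # Simulated latencies (assumed values, not measurements of any device).
    def slow_fetch(layer):
        time.sleep(0.01)
        return layer * 10

    def slow_compute(weights):
        time.sleep(0.01)
        return weights + 1

    print(run_with_prefetch(range(5), slow_fetch, slow_compute))
```

With random access patterns the next address is unknown, so this kind of overlap is not available and raw device latency matters far more.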

SanDisk's proposed HBF solution is a new memory architecture designed specifically for next-generation AI computing. HBF aims to meet the requirements of advanced computing and data-intensive applications in capacity, energy efficiency, throughput, and scalability. Compared to HBM, HBF offers higher capacity and memory density at comparable bandwidth, and is better aligned with AI inference trends. As a persistent storage medium, HBF retains data when power is lost, and its thermal stability supports higher operating temperatures. The technology builds on SanDisk's BiCS design, manufacturing techniques, and chip architecture, re-engineering NAND flash for high bandwidth and inference memory characteristics. BiCS CMOS directly Bonded to Array (CBA) technology is used to enhance energy efficiency and bandwidth.

Compared to traditional NAND flash, HBF achieves lower latency and significantly higher read bandwidth by leveraging parallelism, advanced logic scaling, and custom stacking technologies. This enables large language models to transfer data at speeds approaching those of DRAM. Additionally, HBF supports large KV caches to efficiently handle long and complex user prompts, as well as customer and domain-specific data, thereby improving AI inference accuracy.
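To see why long prompts call for large KV caches, the cache size can be estimated with the standard transformer formula: two tensors (keys and values) per layer, each holding one vector per KV head per token. The sketch below is a rough calculation only; the model dimensions in the example (32 layers, 32 KV heads, head dimension 128, 16-bit values) are hypothetical and are not taken from SanDisk's announcement.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Estimate KV cache size in bytes for a decoder-only transformer.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


if __name__ == "__main__":
    # Hypothetical model shape, for illustration only.
    per_token = kv_cache_bytes(32, 32, 128, seq_len=1)
    long_prompt = kv_cache_bytes(32, 32, 128, seq_len=32_768)
    print(f"per token:        {per_token / 2**20:.2f} MiB")
    print(f"32,768-token ctx: {long_prompt / 2**30:.2f} GiB")
```

Under these assumed dimensions the cache grows by 0.5 MiB per token, so a single 32,768-token context occupies 16 GiB before any model weights are counted, and the total scales linearly with the number of concurrent users.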

Since HBM is typically unsuitable for edge and mobile environments due to density, cost, and power constraints, HBF can provide edge devices (such as smartphones) with greater memory capacity to handle more complex AI inference tasks. Because it is persistent, HBF can retain context from previous queries and retrieve it to help solve new problems. In enterprise computing, where user bases are far smaller than those of hyperscale data centers, the cost of large HBM-backed GPU clusters is prohibitive. By adopting HBF-enabled accelerators, small enterprises can fine-tune large pre-trained models for specific domains.

Compared to HBM, HBF offers a clear capacity advantage while delivering the high throughput required for AI inference applications. As a scalable new system memory technology, HBF helps reduce performance bottlenecks and accelerate time-to-insight for AI applications in modern data centers and edge networks.
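Why inference needs capacity and throughput together can be seen from a common back-of-envelope bound: when generating one token at a time, a model reads roughly all of its weights once per token, so tokens per second cannot exceed memory bandwidth divided by model size. The sketch below encodes that simplification, which ignores KV cache reads, batching, and compute limits. The function names and all figures in the example are hypothetical; they are not specifications of HBF, HBM, or any product.

```python
def fits_in_memory(model_bytes, capacity_bytes):
    """Return True if the model weights fit in the given memory capacity."""
    return model_bytes <= capacity_bytes


def decode_tokens_per_second_bound(model_bytes, bandwidth_bytes_per_s):
    """Upper bound on single-stream decode speed for a bandwidth-bound model.

    Assumes every weight is read once per generated token.
    """
    if model_bytes <= 0:
        raise ValueError("model_bytes must be positive")
    return bandwidth_bytes_per_s / model_bytes


if __name__ == "__main__":
    # Hypothetical figures: 70 billion parameters at 1 byte each,
    # a 64 GB memory pool, and 1 TB/s of read bandwidth.
    model = 70e9
    print("fits:", fits_in_memory(model, 64e9))
    print("bound (tokens/s):", decode_tokens_per_second_bound(model, 1e12))
```

In this illustrative case the model does not fit at all, so bandwidth is irrelevant until capacity is addressed; once it fits, bandwidth sets the ceiling on generation speed.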

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com