en.Wedoany.com Reported - ByteDance engineer Fengnan Chang has designed an algorithm for the Linux kernel that simplifies the direct I/O processing path to address performance bottlenecks in PCIe Gen5 NVMe SSDs during 4KB random reads. The patch has been merged into the Git repository of the VFS subsystem and is expected to be officially released with Linux version 7.3 by the end of this year.
Engineers traced the throughput limit seen during 4KB random reads on PCIe Gen5 NVMe SSDs to the operating system layer. The Linux kernel consumes excessive CPU time processing each small-block request, and the overhead of the IOmap subsystem is particularly prominent. IOmap is responsible for mapping logical file offsets to physical disk blocks during direct I/O, but allocating memory for auxiliary structures and maintaining a complex state machine for every request costs significant CPU time, making it the primary bottleneck on throughput.
Fengnan Chang's simplified direct I/O path (simple dio path) reduces latency by removing these resource-intensive operations. A request must meet four conditions at once to use it: the operation is a read; the amount of data read does not exceed the file system block size (typically 4KB); the target file is not encrypted; and the file system is EXT4 or XFS. Eligible requests bypass the state machine and dynamic memory allocation phases of the IOmap subsystem and are sent to the lower layers of the Linux kernel I/O stack via the shortest path.
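The four eligibility conditions can be read as a single predicate. The Python sketch below is only a model of that check as the article describes it, not the kernel code: the `DioRequest` type, its field names, and the `takes_simple_path` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical model of a direct I/O request. The field names are
# illustrative only and are not identifiers from the kernel patch.
@dataclass(frozen=True)
class DioRequest:
    is_read: bool        # True for a read, False for a write
    length: int          # bytes requested
    fs_block_size: int   # file system block size in bytes (typically 4096)
    encrypted: bool      # True if the target file is encrypted
    fs_type: str         # file system name, e.g. "ext4" or "xfs"

# File systems the article lists as supported by the simple path.
SUPPORTED_FS = frozenset({"ext4", "xfs"})

def takes_simple_path(req: DioRequest) -> bool:
    """Return True only when all four conditions from the article hold."""
    return (
        req.is_read
        and req.length <= req.fs_block_size
        and not req.encrypted
        and req.fs_type in SUPPORTED_FS
    )

if __name__ == "__main__":
    small_read = DioRequest(True, 4096, 4096, False, "ext4")
    large_read = DioRequest(True, 8192, 4096, False, "xfs")
    print(takes_simple_path(small_read))  # eligible: all four conditions met
    print(takes_simple_path(large_read))  # not eligible: exceeds one block
```

In this model, a request that fails any one condition simply falls back to the regular direct I/O path, so the check only needs to be cheap and conservative.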
Tests run through the io_uring subsystem show that 4KB random read performance on EXT4 and XFS rose from 1.92 million IOPS to 2.19 million IOPS, an improvement of approximately 14%. The patch, known as "IOmap Simple DIO," has passed review and been merged into the vfs-7.3.iomap branch of the VFS subsystem's Git repository. The code will be submitted to Linus Torvalds' mainline branch during the Linux 7.3 merge window.
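The reported gain can be checked with a few lines of arithmetic. The sketch below uses only the two IOPS figures quoted above; the function names are hypothetical, and the per-request interval is an aggregate figure (one second divided by total completions), not a measured per-core latency.

```python
def relative_gain(before: float, after: float) -> float:
    """Fractional improvement of `after` over `before`."""
    return (after - before) / before

def mean_interval_ns(iops: float) -> float:
    """Average time between completions across the whole system, in ns."""
    return 1e9 / iops

baseline_iops = 1_920_000  # reported figure before the patch
patched_iops = 2_190_000   # reported figure with the simple dio path

gain = relative_gain(baseline_iops, patched_iops)
saved_ns = mean_interval_ns(baseline_iops) - mean_interval_ns(patched_iops)

if __name__ == "__main__":
    print(f"relative gain: {gain:.1%}")
    print(f"aggregate time saved per request: {saved_ns:.1f} ns")
```

The gain works out to 270,000 / 1,920,000, or about 14.1%, which matches the article's "approximately 14%".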










