US-based Tensordyne's Napier 13x Faster than GB300, Orders Exceed $200 Million

2026-06-16 09:45

Favorite

en.Wedoany.com Reported - Tensordyne (formerly Recogni, founded in 2017) announced that its AI accelerator "Napier" has completed tape-out. The chip is named after logarithm inventor John Napier, and its core innovation lies in using logarithmic mathematics to convert the numerous multiplications in AI model operations into additions.

Since addition operations are more efficient, Tensordyne claims that the computing performance of a single Napier rack far exceeds that of AI servers using NVIDIA's GB300 technology. Depending on the AI model, a single Napier rack can process up to 13 times more tokens per second than NVIDIA's GB300 NVL72. In terms of efficiency measured by tokens per second per watt, the improvement could even reach 17 times.

The company revealed that the Napier system has received orders totaling over $200 million, but has not yet announced the specific delivery timeline for the first TDN72 Pod. NVIDIA plans to launch its inference-optimized Groq 3 LPX system by the end of 2026, while the previously announced Rubin CPX project, also targeting inference, appears to have been shelved.

Thanks to the logarithmic computing approach, Napier's actual compute units can be designed smaller, allowing more cores to be integrated on the chip and accommodating high-speed SRAM. Each Napier chip is equipped with 144 GB of HBM3E memory and integrates ultra-high-speed interconnects. A TDN72 Pod consists of four tightly coupled rack slots, each containing nine Napier chips. A complete Tensordyne Napier rack comprises four TDN72 Pods, integrating a total of 288 Napier chips.

A single TDN rack delivers 608 PFlops of computing power, equipped with 42 TB of HBM3E, 78 GB of SRAM, and 256 TB of RAM. Its full-load power consumption is 120 kW, supports air cooling, and the internal interconnect transfer rate within the rack reaches 275 TB/s.

The Napier chip can handle data formats such as FP16, FP8, FP4, and Int8. According to Tensordyne, the chip is suitable for mainstream AI models including Kimi K2.6, DeepSeek-R1/V4 Pro, Llama3.1 405B, Mixtral 8x22B, GPT-OSS-120B, and Qwen 80B. For comparison, NVIDIA plans to install 256 Groq-3-LPUs in a single Groq-3-LPX rack, with each LPU equipped with 500 MB of SRAM, totaling 128 GB of SRAM and 12 TB of DDR5 RAM per rack.

Tensordyne is headquartered in Silicon Valley, with a branch office in Munich. Several senior developers previously worked at Juniper Networks (now part of HPE). During the development of the Napier chip, Tensordyne collaborated with Broadcom, which also develops AI chips for Google's multiple generations of TPUs.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com

America

Germany

This bulletin is compiled and reposted from information of global Internet and strategic partners, aiming to provide communication for readers. If there is any infringement or other issues, please inform us in time. We will make modifications or deletions accordingly. Unauthorized reproduction of this article is strictly prohibited. Email: news@wedoany.com

Previous：TTP Develops 5G NTN Modem Module for Ku/Ka Bands

Next：US-based PicoJool Launches 200G VCSEL Products for AI Optical Interconnects