Beijing's First Token Factory Commences Operations, with Daily Capacity Reaching 1.4 Trillion Tokens
2026-06-15 17:25

en.Wedoany.com Reported - Beijing's first token factory, Beijing No. 1 Token Factory, has officially opened in the Information Technology Application Innovation Industrial Park of the Beijing Economic-Technological Development Area. Built by iSoftStone Information Technology Co., Ltd., the first phase of the project has a daily production capacity of 1.4 trillion tokens.

A token is the smallest unit of text that an AI model processes, and computing power determines how many tokens can be processed per second and at what cost. The factory consists of a series of servers and aims to turn computing power into a stable and affordable public resource, supporting the evolution of large language models from simple dialogues to long-running systems.
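To put the reported daily figure on the per-second scale mentioned above, the conversion can be sketched as follows. This is a minimal illustration that assumes the capacity is spread evenly over a 24-hour day; the function name is hypothetical and the article does not say how the factory itself measures throughput. Under that assumption, 1.4 trillion tokens per day works out to roughly 16.2 million tokens per second.

```python
# Assumption: capacity is averaged evenly across a full 24-hour day.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def tokens_per_second(daily_tokens: float) -> float:
    """Convert a daily token capacity into an average per-second rate."""
    return daily_tokens / SECONDS_PER_DAY


# Phase-one capacity as reported in the article: 1.4 trillion tokens per day.
phase_one_daily = 1.4e12
phase_one_rate = tokens_per_second(phase_one_daily)

print(f"Average rate: {phase_one_rate / 1e6:.1f} million tokens/s")
```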

Beijing No. 1 Token Factory focuses on agent service scenarios. It combines aggressive engineering optimization to maximize hardware performance with advanced computing power scheduling and KV Cache reuse algorithms. It guarantees service availability of at least 99.9%, a cache hit rate of at least 90%, and a P90 first-token latency under 10 seconds with fluctuation under 20%. The factory operates 24/7, with half of core response tasks completed within 6 seconds and 90% of tasks responding in under 10 seconds.
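The service targets quoted above can be read as simple checks over measured data. The sketch below is illustrative only: the function names, the sample latencies, and the counters are hypothetical, the nearest-rank percentile is one common definition among several, and it assumes the 6-second and 10-second figures apply to the same latency samples. The article does not define how "fluctuation" is measured, so that target is left out.

```python
import math


def nearest_rank_percentile(samples, pct):
    """Return the pct-th percentile using the nearest-rank definition."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def check_reported_targets(latencies_s, cache_hits, cache_lookups,
                           uptime_s, total_s):
    """Compare measurements with the targets reported in the article."""
    return {
        "p50_within_6s": nearest_rank_percentile(latencies_s, 50) <= 6.0,
        "p90_under_10s": nearest_rank_percentile(latencies_s, 90) < 10.0,
        "cache_hit_rate_ok": cache_hits / cache_lookups >= 0.90,
        "availability_ok": uptime_s / total_s >= 0.999,
    }


def daily_downtime_budget_s(availability):
    """Seconds of downtime per day allowed by an availability target."""
    return (1.0 - availability) * 24 * 60 * 60


# Hypothetical measurements, invented purely for illustration.
latencies = [2.1, 3.4, 4.0, 4.8, 5.5, 5.9, 6.7, 7.2, 8.9, 12.3]
report = check_reported_targets(latencies, cache_hits=920, cache_lookups=1000,
                                uptime_s=86_320, total_s=86_400)
print(report)
print(f"Downtime budget at 99.9%: {daily_downtime_budget_s(0.999):.1f} s/day")
```

Note that a single slow request (12.3 seconds in the sample data) does not violate a P90 target, which is why percentile targets are usually quoted alongside an availability figure. An availability of 99.9% allows about 86.4 seconds of downtime per day.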

iSoftStone has simultaneously open-sourced a global performance benchmark for token factories, including the evaluation framework LoadGen 2.0. The benchmark is a deep restructuring of the industry-standard MLPerf LoadGen that moves from static concurrent injection to dynamic behavior simulation, so that chaotic real-world load scenarios can be defined and reproduced in a test environment. It uses a three-tier progressive evaluation system to assess and compare the actual service capabilities of computing clusters: a chaotic load characterization method at the base, three standard test methods in the middle tier (rated power, business, and accuracy/correctness), and standard datasets for different domains at the top.

In the next phase, Beijing No. 1 Token Factory will collaborate with green energy bases in Zhangjiakou, Ulanqab, and other locations to build a Beijing-Tianjin-Hebei integrated computing cluster, with a long-term goal of producing 10 trillion tokens per day. Industry insiders believe that the project fills a gap in China's supply of high-end, large-scale computing power, establishes an industry benchmark for computing power services and evaluation, and will attract upstream and downstream AI enterprises to the area, further strengthening the regional artificial intelligence industry chain.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com