CITIC Securities: Token Surge Leads to Computing Power Shortage; Domestic Computing Power Chip Shipments Expected to Double in 2026
2026-04-16 10:14

en.Wedoany.com Reported - A report released by CITIC Securities Research Department on April 16, 2026, indicates that the explosion of AI applications such as Agents and multimodal models has driven a surge in Token calls, leading to a computing power shortage in China. The active adaptation of Chinese large models on the inference side has brought accelerated growth opportunities for domestic computing power manufacturers. CITIC Securities predicts that China's computing power chip shipments will at least double in 2026, providing strong growth momentum for computing power design companies, advanced manufacturing processes, advanced packaging, advanced storage, and the supporting industrial chain.

The report identifies the OpenClaw craze and the widespread adoption of multimodal AI applications as the two core catalysts accelerating marginal demand for computing power. AI Agents such as OpenClaw consume 10 to 100 times more Tokens per task than ChatBots. Chinese manufacturers are actively deploying and launching OpenClaw-like products, which speeds up Agent adoption and raises demand for the computing power that supports it. Multimodal AI applications such as text-to-image and text-to-video also continue to surge in popularity. Compared with pure text dialogue, Token consumption per interaction for image input/generation and video recognition/generation is typically orders of magnitude higher. Chinese multimodal large models such as ByteDance's Seedance are rising rapidly, accelerating the boom in China's multimodal AI applications. Data from OpenRouter, the world's largest API aggregation platform, shows that weekly cumulative Token consumption in April 2026 increased approximately 7- to 8-fold compared with a year earlier. Chinese large models are the main driving force behind this surge, with their latest market share reaching about 40%.

The surge in Token calls has caused computing power demand to explode. On the supply side, short-term capacity additions are limited by various hard constraints, resulting in a severe computing power shortage in China. Tencent Cloud raised the price of its core Hunyuan series models by over 430% in March and increased the list prices of products such as AI computing power, container services, and Elastic MapReduce by about 5% in April. Chinese large models such as Kimi frequently display prompts like "insufficient computing power during peak hours" during use. In the enterprise (B-end) computing power rental market, AI chip rental prices are rising. According to SemiAnalysis data, the one-year lease contract price for the H100 rose from a low of about $1.70/hour/GPU in October 2025 to $2.35/hour/GPU in March 2026, an increase of nearly 40%. Since February, leading Chinese cloud and model manufacturers have publicly and explicitly cited the scarcity of computing power resources.
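The "nearly 40%" figure for H100 lease prices can be checked directly from the two quoted prices. The snippet below is a minimal sketch of that arithmetic; the function name percent_change is illustrative and not taken from the report, and the Hunyuan line simply restates a 430% increase as a price multiple.

```python
def percent_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Figures quoted in the article (SemiAnalysis data, one-year H100 lease).
h100_low = 1.70  # USD per GPU-hour, October 2025
h100_now = 2.35  # USD per GPU-hour, March 2026

rise = percent_change(h100_low, h100_now)
print(f"H100 one-year lease price rise: {rise:.1f}%")  # 38.2%

# A price increase of 430% means the new price is 5.3 times the old one.
hunyuan_multiple = 1 + 430 / 100
print(f"Hunyuan price multiple after a 430% increase: {hunyuan_multiple:.1f}x")
```

Since the article says "over 430%", the 5.3x multiple is a lower bound rather than an exact figure.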

Chinese computing power is seeing accelerated growth opportunities on the inference side. CITIC Securities points out that AI chip demand can be divided into training and inference, and that the current explosion of Agent and multimodal applications is significantly driving demand for inference-side computing power. Inference is also the easier entry point for Chinese chips, because inference tasks place lower overall performance requirements on computing power products than training tasks do. By collaborating closely with internet companies and customizing optimizations for specific needs, Chinese computing power chip manufacturers can provide inference chips better suited to those companies' requirements. As a result, domestic substitution is progressing faster on the inference side than on the training side. Chinese large models have already begun actively adapting to domestic computing power chips for inference. Zhipu's GLM-5 has completed deep inference adaptation and operator-level optimization with mainstream Chinese chip platforms such as Huawei Ascend, Moore Threads, Cambricon, Kunlunxin, MetaX, Enflame, and Hygon. The DeepSeek V4 model has achieved deep adaptation with Chinese chips such as Huawei Ascend for the first time.

Policy support for the domestic computing power industrial chain continues to intensify. In March 2026, the Shenzhen Municipal Industry and Information Technology Bureau released the "Shenzhen Action Plan for Accelerating the High-Quality Development of the Artificial Intelligence Server Industrial Chain (2026-2028)". Focusing on eight key areas, the plan supports the accelerated application and iteration of core domestic chip products such as GPUs, NPUs, CPUs, and DPUs, accelerates breakthroughs in advanced packaging technologies for storage chips, prioritizes the development of high-end storage products like enterprise SSDs and enterprise memory modules, and strengthens R&D in new technologies such as near-memory packaging and memory-compute integration. CCID Consulting estimates that by 2026, China's total computing power scale will exceed 1200 EFLOPS, firmly ranking second globally, with intelligent computing power contributing nearly 90%.

Data from the China Business Industry Research Institute shows that from 2026 to 2030, China's GPU market will enter a period of rapid volume expansion, with the market size expected to rise from 205 billion yuan to 542 billion yuan. It is projected that around 2028, the market share of domestic GPUs in the inference market will exceed 40%, and in the training market, it will surpass 25%. After 2027, with the mass production of 3nm domestic GPUs and the widespread adoption of Chiplet technology, the cost per unit of computing power is expected to drop by over 30%, driving the transformation of GPUs from high-end research equipment to general-purpose productivity tools.
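The article does not state the annual growth rate implied by the GPU market forecast. As a rough sketch, assuming the 205 billion and 542 billion yuan figures refer to 2026 and 2030 respectively (four annual compounding periods), the implied compound annual growth rate can be estimated as follows. The function name cagr is illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market size forecast quoted in the article, in billions of yuan.
market_2026 = 205.0
market_2030 = 542.0

# Assumption: the endpoints are 2026 and 2030, i.e. four compounding periods.
growth = cagr(market_2026, market_2030, 2030 - 2026)
print(f"Implied CAGR, 2026-2030: {growth * 100:.1f}%")
```

Under that assumption the forecast works out to roughly 27% to 28% growth per year; a different reading of the period would change the result.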

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com