US-based Liquid AI Releases 8B Edge Model, Only 1.5B Activated for Inference

2026-06-04 11:51

en.Wedoany.com Reported - Liquid AI, an artificial intelligence company spun out of the Massachusetts Institute of Technology, recently released a new model, LFM2.5-8B-A1B. The model has 8 billion parameters in total but activates only about 1.5 billion of them per inference, so each forward pass computes with less than one-fifth of its parameters. It is designed specifically for edge scenarios such as smartphones, PCs, robots, and lightweight servers, and is not positioned to compete with large cloud-based models.
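The "less than one-fifth" figure follows directly from the two parameter counts. A quick check, additionally assuming the common rule of thumb of roughly 2 floating-point operations per active parameter per token (an approximation that is not from the article and that ignores attention and memory traffic):

```python
# Parameter counts as reported in the article.
TOTAL_PARAMS = 8.0e9
ACTIVE_PARAMS = 1.5e9

# Assumption (not from the article): a forward pass costs roughly
# 2 FLOPs per active parameter per token.
FLOPS_PER_PARAM = 2

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
dense_flops_per_token = FLOPS_PER_PARAM * TOTAL_PARAMS
sparse_flops_per_token = FLOPS_PER_PARAM * ACTIVE_PARAMS

print(f"active fraction: {active_fraction:.2%}")          # 18.75%
print(f"dense  ~{dense_flops_per_token:.1e} FLOPs/token")   # 1.6e+10
print(f"sparse ~{sparse_flops_per_token:.1e} FLOPs/token")  # 3.0e+09
```

Note that all 8 billion parameters still have to be stored on the device; sparse activation reduces the arithmetic per inference, not the storage footprint.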

Over the past two years, the industry has typically brought large models to edge and IoT devices by compressing models originally designed for the cloud, using methods such as quantization, pruning, and distillation. Liquid AI has taken a different technical approach: rather than shrinking a dense model, it changes how much of the model each input engages, so that simple tasks consume fewer resources and more computing power is called upon only for complex tasks. Specifically, the energy consumed per inference is tied to the difficulty of the input task. The mechanism is the sparse activation of a Mixture of Experts (MoE) model: for each input, the system activates only the most relevant expert modules and leaves the rest dormant.
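The sparse activation described above can be illustrated with a minimal top-k routing sketch. This is a generic MoE pattern, not Liquid AI's implementation: the article does not describe the LFM2.5 router, and the names `moe_forward`, `make_expert`, `router_weights`, and `top_k` are hypothetical. In this common form the number of experts run per input is fixed by `top_k`; what changes with the input is which experts are selected.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Toy expert (hypothetical): scales every input component."""
    return lambda x: [scale * xi for xi in x]


def moe_forward(x, router_weights, experts, top_k=2):
    """Run only the top_k highest-scoring experts on input x."""
    # Router: one linear score per expert.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Select the top_k experts; all others stay dormant and are never called.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the selected scores into mixing weights.
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)
        output = [o + gate * yi for o, yi in zip(output, y)]
    return output, chosen


rng = random.Random(0)
dim, num_experts = 4, 8
router_weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]
experts = [make_expert(scale=i + 1) for i in range(num_experts)]

x = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(x, router_weights, experts, top_k=2)
print("experts run:", chosen, "out of", num_experts)
```

Only the two selected experts execute, so the per-input compute scales with `top_k` rather than with the total number of experts.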

At the edge, the core constraint on intelligence shifts from computing cost to energy cost. An embedded chip has a fixed, limited energy budget (measured in joules) for each inference. Quantization, pruning, and distillation can reduce model size, but a dense model still traverses all of its parameters on every inference, which is hard to sustain under strict battery constraints. Liquid AI's technical path is to adjust the computing power consumed according to the difficulty of the input task, an approach described as "input-adaptive computation." The idea originates from research on the nematode Caenorhabditis elegans, which has only 302 neurons in its entire body; its intelligence relies on dynamic changes in the strength of synaptic connections between neurons rather than on sheer scale.

The LFM2.5 model retains efficient underlying operators while adding the MoE sparse activation mechanism. This is what lies behind its 8 billion total parameters and the roughly 1.5 billion activated per inference. Liquid AI's technical route has evolved from early continuous-time dynamic networks to the current sparse activation architecture; the common thread is that computation varies with the input.

This school of thought also emphasizes the robustness of models after deployment. Unlike static models, liquid neural networks are built on continuous-time equations with adaptive time constants, so their internal states can "flow" and adjust in real time to the rhythm of the input signals. Multiple demonstrations by the MIT Computer Science and Artificial Intelligence Laboratory (MIT CSAIL) have shown that agents driven by such networks can navigate robustly in unfamiliar environments and cope with environmental drift. Compared with relying on over-the-air (OTA) updates to push new models after a problem appears, a natively robust architecture is meant to withstand disturbances that have not yet been encountered.
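To make "continuous-time equations with adaptive time constants" concrete, here is a minimal single-neuron sketch in the liquid time-constant form published by the MIT group, dx/dt = -(1/tau + f) * x + f * A, where f depends on the state and the input. It is an illustration only and says nothing about LFM2.5's internals: the weights `w_x`, `w_u`, `b`, the values of `tau` and `A`, and the use of a plain explicit-Euler step in place of a more careful ODE solver are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gate(x, u, w_x=0.5, w_u=2.0, b=0.0):
    """Input-dependent nonlinearity f(x, u); weights are illustrative."""
    return sigmoid(w_x * x + w_u * u + b)


def ltc_step(x, u, dt, tau=1.0, A=1.0):
    """One explicit-Euler step of dx/dt = -(1/tau + f) * x + f * A."""
    f = gate(x, u)
    dxdt = -(1.0 / tau + f) * x + f * A
    return x + dt * dxdt


def effective_time_constant(x, u, tau=1.0):
    """The decay rate is 1/tau + f, so the time constant is tau / (1 + tau * f)."""
    return tau / (1.0 + tau * gate(x, u))


# Drive the neuron with a slowly oscillating input for 10 time units.
x, dt = 0.0, 0.01
for step in range(1000):
    u = math.sin(2.0 * math.pi * step * dt / 5.0)
    x = ltc_step(x, u, dt)

print("final state:", round(x, 4))
print("time constant, weak input:  ", round(effective_time_constant(0.0, -3.0), 4))
print("time constant, strong input:", round(effective_time_constant(0.0, 3.0), 4))
```

A stronger input raises f and shortens the effective time constant, so the neuron responds faster when the signal demands it; this is the sense in which the state adapts to the rhythm of the input.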

In the era of edge intelligence, industrial value is shifting from the model and the chip individually to the layer that coordinates the two. Liquid AI's LFM models were designed for hardware compatibility from the outset. The company says they run on GPUs, CPUs, and NPUs, covering heterogeneous devices such as wearables, robots, smartphones, PCs, and automobiles. In January of this year, the company partnered with AMD to customize a 2.6-billion-parameter model and deploy it locally on AMD's Ryzen AI processor within two weeks. What makes this speed possible is the engineering ability to converge quickly on the best operator combination and the smallest memory footprint under tight hardware constraints.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com
