South Korea's Three Major AI Semiconductor Companies Accelerate Commercial Deployment in the Inference Sector
2026-07-01 10:27

en.Wedoany.com Reported - As the focus of the AI infrastructure market shifts from large-scale training to inference, South Korea's domestic AI semiconductor companies are rapidly expanding their footprint with unique architectures and real-world use cases, seeking to exploit weaknesses in NVIDIA's position within the global next-generation infrastructure market.

South Korea's leading AI semiconductor companies include Rebellions, Mobilint, and HyperAccel, each competing with a different target market and technology path. Rebellions has established an independent position through mass production of high-performance chips and large-scale commercialization. Its next-generation flagship product, 'REBEL100', uses a chiplet architecture that connects four chips and is equipped with fifth-generation HBM3E memory, delivering computing performance comparable to existing flagship GPUs with high power efficiency. Rebellions' NPU has been deployed in SK Telecom's 'A.' call recording summary service, which generates up to 50 million API calls daily. The NPU currently handles an average of 20 million calls per month and 700,000 per day, has replaced the existing GPUs, and operates stably. Its products are also used in 'Excalibur', a pet AI diagnostic assistance service used by over 1,000 animal hospitals nationwide.

Rebellions NPU (Image source: Rebellions)

Mobilint targets the entire inference market, from data centers to edge devices, with high-performance, low-power NPUs. Its representative product, 'ARIES', delivers up to 80 TOPS of computing performance at a power consumption of only about 25W. Mobilint operates an NPU-based AI consulting service platform in collaboration with AI contact center company MetaM, works with industrial AI firms such as POSCO DX to build and validate customized AI infrastructure for manufacturing sites, and has supplied the standalone AI 'MLX-A1' to Yonsei University. Mobilint also recently completed the 'Edge AI Service Validation and Diffusion Project' supported by the Ministry of Science and ICT (MSIT). In that project it deployed its NPUs 'ARIES' and 'REGULUS' in edge devices such as forest fire surveillance cameras and drones, building disaster management infrastructure capable of real-time fire detection and path prediction via 3D maps. Mobilint plans to launch 'REGULUS', the first standalone AI SoC among South Korean NPU companies, in the second half of this year.
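For a rough sense of what the ARIES figures imply, the quoted numbers can be turned into a performance-per-watt estimate. The short Python sketch below is illustrative only: the helper name `tops_per_watt` is hypothetical, and because the article gives a peak figure ("up to 80 TOPS") and an approximate power draw ("about 25W"), the result should be read as an approximate upper bound, not a measured value.

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Return compute efficiency in TOPS per watt (hypothetical helper)."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tops / watts


# Figures quoted in the article: peak 80 TOPS at roughly 25 W.
aries_efficiency = tops_per_watt(80.0, 25.0)
print(f"ARIES: about {aries_efficiency:.1f} TOPS/W")
```

On the quoted figures this works out to about 3.2 TOPS per watt.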

HyperAccel designed its 'LPU (LLM Processing Unit)' from the outset specifically for generative AI and LLM inference workloads. The LPU uses relatively inexpensive, low-power LPDDR5x memory and maximizes bandwidth utilization to improve latency, power efficiency, and TCO. Starting with the 'Orion' server, HyperAccel is advancing a product roadmap that targets data centers and edge computing. It is collaborating with Naver Cloud to build AI inference infrastructure optimized for data center environments and with LG Electronics to apply inference technology to various device environments. Through partnerships with key global players such as Samsung Electronics, SemiFive, Advantech, INVENTEC, and HPE, it is demonstrating both technological competitiveness and commercial scalability.

HyperAccel LPU (Image source: HyperAccel)

All three companies have seized on the shift in infrastructure focus from training to inference and AI agents, designing efficient architectures for inference workloads to maximize TCO value. To break the hardware and software lock-in centered on NVIDIA, they have embraced open-source ecosystems: each provides its own software stack or SDK so that developers can use frameworks and tools such as PyTorch, Hugging Face, vLLM, and Triton directly, without complex porting or code modifications. All three are also accumulating real-world reference cases and building global alliances, using domestic successes such as large-scale call summary services, AI consulting, and joint development with Naver Cloud as a base for global expansion.

As the focus of the AI infrastructure market shifts to inference, domestic NPU cloud services (NPUaaS) that combine cost-effectiveness with technological sovereignty are gaining attention. Over 55% of global AI infrastructure spending is concentrated on inference, and inference accounts for 80% to 90% of total lifecycle costs, driving growing demand to replace high-cost GPUs. In April this year, Gabia officially launched NPUaaS equipped with Rebellions' 'ATOM-Max', a chip that achieved processing speeds 1.5 to 3 times faster and energy efficiency 3 to 4.5 times higher than comparable GPUs in the global AI performance benchmark 'MLPerf'. KT Cloud has officially launched 'NPU Server' products that comply with security regulations, aimed at public institutions and public-sector AI agent solution providers. It has secured and operates approximately 300 NPU accelerators and plans to add Rebellions' next-generation chip 'REBEL100' after the chip enters mass production in the third quarter of this year. Samsung SDS plans to launch NPUaaS based on FuriosaAI's second-generation NPU 'RNGD', integrating RNGD servers directly with the virtualization layer of the Samsung Cloud Platform (SCP) through hardware virtualization technology.

CSPs agree that future AI infrastructure will be a heterogeneous computing environment in which GPUs, NPUs, and TPUs coexist, and 'sovereign clouds' aimed at maintaining data sovereignty are also a positive signal for domestic NPUs. To give NPUs a developer experience comparable to NVIDIA's 'CUDA', government support through the K-Cloud project is being aligned with software investments from semiconductor manufacturers. Industry insiders emphasize that CSPs are the final link that delivers the technological value of domestic AI semiconductors to real-world use, and that their role is shifting toward integrating service design, inference optimization consulting, platform automation, and security monitoring.
