en.Wedoany.com Reported - South Korean software companies are curbing the sharp rise in token costs caused by the proliferation of AI agents by combining techniques such as prompt optimization, LLM gateways, on-premises deployment, and multi-model strategies.
![[Image source: Generated by nanobanana2]](https://img.wedoany.com/2026/0702/20260702085636297.png)
To complete a task, an AI agent repeatedly calls language models and runs tools on its own, so its token consumption can be several to dozens of times that of a human user. One South Korean company that has deployed AI agents across its entire organization since the beginning of this year now consumes approximately 250 billion tokens per month, with monthly infrastructure costs of 200 million to 300 million Korean won.
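To put those figures on a per-token basis, the short calculation below divides the reported monthly cost by the reported monthly token volume. It is only a rough sketch: it assumes the entire infrastructure bill is attributable to token consumption, which the report does not state, and the function name is illustrative.

```python
# Figures quoted in the report.
TOKENS_PER_MONTH = 250_000_000_000             # about 250 billion tokens
COST_RANGE_KRW = (200_000_000, 300_000_000)    # 200 to 300 million won per month


def krw_per_million_tokens(monthly_cost_krw: int, tokens_per_month: int) -> float:
    """Average cost in won for one million tokens.

    Assumption: the whole monthly cost is spread evenly over all tokens.
    """
    return monthly_cost_krw * 1_000_000 / tokens_per_month


low, high = (krw_per_million_tokens(cost, TOKENS_PER_MONTH) for cost in COST_RANGE_KRW)
print(f"{low:.0f} to {high:.0f} KRW per million tokens")
# prints "800 to 1200 KRW per million tokens"
```

Under that assumption, the company is paying on the order of 800 to 1,200 won per million tokens, so any technique that removes a fixed share of tokens removes roughly the same share of the bill.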
Some companies are starting with leaner prompts and caching. WISEITECH cuts unnecessarily long inputs and repeated calls, while Naver Cloud optimizes models for specific tasks. Companies are also positioning LLM gateways as central control points that monitor model usage across departments in real time. Hancom integrates routing and fallback systems, and NDS builds its gateways on LiteLLM.
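The sketch below shows how caching, fallback, and per-department usage tracking can sit together in one gateway layer. It is a toy illustration, not the design of any company named above and not the LiteLLM API: the `MiniGateway` class, the backend functions, and the word-count token estimate are all hypothetical stand-ins.

```python
import hashlib
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# A backend is any callable that takes a prompt and returns (text, tokens_used).
Backend = Callable[[str], Tuple[str, int]]


class MiniGateway:
    """Toy gateway: response cache, ordered fallback, per-department token counts."""

    def __init__(self, backends: List[Tuple[str, Backend]]):
        self.backends = backends                         # tried in order
        self.cache: Dict[str, str] = {}                  # prompt hash -> response
        self.usage: Dict[str, int] = defaultdict(int)    # department -> tokens

    @staticmethod
    def _key(prompt: str) -> str:
        # Collapse whitespace so trivially different prompts share one entry.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def complete(self, department: str, prompt: str) -> str:
        key = self._key(prompt)
        if key in self.cache:
            return self.cache[key]                       # cache hit: no tokens spent
        last_error = None
        for _name, backend in self.backends:
            try:
                text, tokens = backend(prompt)
            except Exception as exc:                     # backend failed: try the next
                last_error = exc
                continue
            self.usage[department] += tokens
            self.cache[key] = text
            return text
        raise RuntimeError("all backends failed") from last_error


# Hypothetical backends standing in for real model endpoints.
def flaky_primary(prompt: str) -> Tuple[str, int]:
    raise TimeoutError("primary unavailable")


def stable_secondary(prompt: str) -> Tuple[str, int]:
    return f"echo: {prompt}", len(prompt.split())        # crude token estimate


gateway = MiniGateway([("primary", flaky_primary), ("secondary", stable_secondary)])
first = gateway.complete("sales", "summarize the weekly report")
second = gateway.complete("sales", "summarize  the weekly report")  # served from cache
```

Because the second request differs only in whitespace, it is answered from the cache and adds nothing to the department's token count. A production gateway would also need cache expiry, and it should cache only requests whose answers are safe to reuse.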
Multiple companies are also adopting on-premises deployment. MakinaRocks connects open-source models to its own vLLM infrastructure, and S2W uses self-built GPU servers to handle high-volume tasks. Multi-model strategies assign standardized, repetitive tasks to lightweight or open-source models. Crowdworks uses commercial models, such as those available through Amazon Bedrock, alongside mini models. CyNapse Soft has introduced Serena MCP and LSP technologies to segment source code into semantic units, achieving approximately 20% token savings compared with open-source frameworks.
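A multi-model strategy of this kind can be sketched as a simple routing rule: routine work goes to a cheap tier, everything else to a premium tier. The tier names, the 10x price ratio, the task labels, and the token threshold below are all assumptions chosen for illustration; none of them come from the companies mentioned.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ModelTier:
    name: str
    relative_cost_per_token: float   # hypothetical relative price, not a real quote


LIGHT = ModelTier("light-or-open-source-model", 1.0)
PREMIUM = ModelTier("premium-commercial-model", 10.0)

# Task labels treated as standardized and repetitive (illustrative only).
ROUTINE_TASKS = frozenset({"classification", "extraction", "formatting"})


def choose_tier(task_type: str, input_tokens: int, max_light_tokens: int = 4000) -> ModelTier:
    """Send short, routine tasks to the light tier and everything else to premium."""
    if task_type in ROUTINE_TASKS and input_tokens <= max_light_tokens:
        return LIGHT
    return PREMIUM


def relative_cost(jobs: Iterable[Tuple[str, int]]) -> float:
    """Total relative cost of (task_type, input_tokens) jobs under the routing rule."""
    return sum(choose_tier(task, n).relative_cost_per_token * n for task, n in jobs)


jobs = [("extraction", 1000), ("formatting", 1000), ("planning", 1000)]
routed = relative_cost(jobs)                                          # 12000.0
all_premium = sum(PREMIUM.relative_cost_per_token * n for _, n in jobs)  # 30000.0
```

With these made-up prices, routing two of the three jobs to the light tier cuts the relative cost from 30,000 to 12,000. Real savings depend on actual pricing and on how much quality the lighter model gives up on each task type.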
Cost optimization in the generative AI era tests companies' architectural design capabilities. A comprehensive control system that eliminates duplicate requests through caching, isolates sensitive data through on-premises deployment, and routes work away from high-cost models will become a benchmark for judging the sustainability of software companies.