en.Wedoany.com Reported - Saturn Cloud, an AI development platform for GPU cloud operators, has launched the Token Factory platform, enabling enterprise AI teams to complete the full workflow of model fine-tuning and inference services on operator GPU infrastructure. The platform supports neocloud operators, AI factory builders, and enterprise users in providing managed fine-tuning tasks, dataset management, and OpenAI-compatible inference endpoints for their customers, all billed by token and delivered under the operator's own brand, without requiring any self-development or maintenance of components.
GPU cloud operators have invested heavily in accelerated infrastructure, with NVIDIA Grace Blackwell, NVIDIA Blackwell, and NVIDIA Hopper systems deployed at scale, driving rapid revenue growth in neocloud businesses. However, many operators' business models remain limited to renting GPU compute by the hour. Enterprise customer needs have evolved beyond raw compute output, requiring managed development environments, distributed training orchestration, model fine-tuning pipelines, single sign-on (SSO) and role-based access control (RBAC), usage tracking, and compliance tools. Most GPU cloud operators lack the manpower to build these platform infrastructures internally, which typically requires months of engineering development and ongoing maintenance.
Sebastian Metti, founder of Saturn Cloud, stated that operators should not have to build an AI development platform from scratch to make GPU infrastructure accessible to enterprise teams. Saturn Cloud provides managed environments, training orchestration, fine-tuning, OpenAI-compatible inference endpoints, and per-token billing from the outset.
The Token Factory platform enables AI teams to fine-tune and serve open models without managing infrastructure. Users simply upload datasets, configure fine-tuning tasks, and deploy the resulting models to inference endpoints, all within the operator's branded environment. Fine-tuning tasks support supervised fine-tuning (full weight and LoRA) on open models, and when the selected instance is equipped with multiple GPUs, the system automatically performs DeepSpeed multi-GPU configuration. Users specify the base model, dataset, and a few hyperparameters, and Saturn Cloud generates the complete training configuration, handling orchestration, retries, and checkpoint output. Supported training frameworks include Axolotl, vLLM, Unsloth, TRL, PEFT, and DeepSpeed.
Datasets are typed, validated collections of training data, formatted as conversational, instructional, text, or pre-tokenized. Users can upload datasets directly, import them from external sources (such as S3, NFS), or curate data in managed workspaces, then register them as Token Factory datasets. All dataset storage uses high-performance parallel file systems rather than object storage to eliminate cold start overhead and avoid reducing GPU utilization during training.
Checkpoint and artifact lineage are automatically managed. After a fine-tuning task completes, the generated checkpoints are registered in Saturn Cloud's artifact registry, preserving the full lineage from training run to model weights. Checkpoints can be immediately used as input for inference endpoint deployment. Inference endpoints deploy fine-tuned or base models as persistent service endpoints, backed by vLLM, with each deployment having a dedicated subdomain, health monitoring, and per-token metering. Service configurations (such as dtype, maximum context length, quantization) are generated at deployment time without requiring custom service scripts. The entire workflow is isolated by organization, with Token Factory resources scoped to tenants, ensuring that one customer's datasets, checkpoints, and endpoints are invisible to others.
Saturn Cloud provides GPU cloud operators with a one-stop path from bare metal infrastructure to a revenue-generating AI platform. The operator-facing feature layer includes white-label branding, per-token and per-GPU-hour billing infrastructure, tenant onboarding and self-service provisioning, usage dashboards and billing reports, and enterprise security tools (covering SSO, RBAC, and SOC 2 compliance). Without the platform layer, operators can only sell compute hours, leading to price competition; with Saturn Cloud, they can sell a platform, competing on developer experience, security posture, and time to market. The platform enables operators to pass enterprise security reviews, as compliance tools are already in place, while allowing operators to present usage panels, cost controls, and team management to tenants, and equipping sales teams with product demonstrations rather than spec sheets.
AI teams and developers working on operator infrastructure gain access to managed development environments (supporting JupyterLab, VS Code, RStudio, and SSH access), distributed multi-GPU training (with orchestration, retries, and logging), Token Factory for fine-tuning and serving open models, and pre-configured support for NVIDIA CUDA, GPU drivers, and AI frameworks. Engineers can utilize the operator's full GPU cluster, including NVIDIA Hopper, Blackwell, and Blackwell Ultra systems, as well as NVIDIA GB200 NVL72 rack-scale systems. Saturn Cloud is a member of the NVIDIA Inception startup acceleration program.
Saturn Cloud integrates with infrastructure automation partners in the ecosystem, including Mirantis k0rdent AI, Spectro Cloud, OpenNebula, and Rafay. Operators managing Kubernetes directly in the cloud backend can also deploy Saturn Cloud on top of their existing stack without changing the infrastructure layer.
Token Factory features are now available to GPU cloud operators, neoclouds, and enterprises operating their own GPU infrastructure. Organizations interested in deploying the platform can contact Saturn Cloud for an evaluation.
Saturn Cloud is an AI token factory platform for neoclouds, AI factory operators, and enterprises, providing managed fine-tuning, OpenAI-compatible model serving (billed by token), managed environments, distributed training, and enterprise security and governance. The platform supports multiple GPU architectures and can be deployed in public cloud, private cloud, and on-premises environments.
This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com









