en.Wedoany.com Reported - Shenzhen-based robotics company X Square Robot has released XRZero-G0, an open-source hardware and software framework for collecting robot training data from human operators, generating policies, and testing them on physical robots. The code is licensed under MIT and hosted on GitHub alongside the G0-Dataset.

Traditional methods rely on physical robots to collect training samples, and each operating session yields only a small amount of demonstration data, which directly limits the scale of the datasets needed to train embodied AI. Human demonstrators offer a lower-cost data source, and X Square Robot has built this approach into a publicly available system. The company develops robots for physical labor scenarios, a field in which firms have previously had to invest significant time and capital in manually operating machines to gather training samples.
Physical robots perceive their environment through multiple cameras. Head-mounted cameras capture the wider scene, while wrist-mounted cameras record the details of hand-object interaction. Many human-demonstration capture setups rely solely on wrist views, creating a mismatch between the training data and what the robot actually perceives during deployment. XRZero-G0 uses one head-mounted camera and two wrist cameras to record broad scene context and close-up fine manipulation simultaneously, mapping these perspectives into a shared representation that matches robot perception. Paired with a wearable VR interface and interchangeable grippers, a single operator can generate demonstration data applicable to different robot embodiments.
Data from human demonstrators can have quality problems that reduce its training value. XRZero-G0 builds a closed-loop process of collection, inspection, training, and evaluation to filter the samples that enter the training phase. At the observation level, multi-view geometric consistency constraints reduce misalignment between images and motion; at the kinematic level, a full-body inverse kinematics algorithm with collision and joint-limit constraints eliminates invalid trajectories; at the policy level, playback execution on physical robots serves as the final validation. According to X Square Robot, the system's effective data yield approaches 85% under controlled settings.
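The three-level screening described above can be pictured as a chain of pass/fail checks followed by a yield calculation. The sketch below is illustrative only and is not taken from the XRZero-G0 code: the `Episode` fields, the thresholds, and the function names are all assumptions, and the real system's consistency, inverse kinematics, and playback checks are far more involved than these stand-ins.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical thresholds, chosen only for illustration.
MAX_REPROJECTION_ERROR_PX = 2.0
JOINT_LIMITS_RAD = [(-3.14, 3.14)] * 6  # assumed 6-joint arm


@dataclass
class Episode:
    """One recorded demonstration (all fields are assumed, not the real schema)."""
    reprojection_error_px: float                  # stand-in for a multi-view consistency score
    joint_trajectory: Sequence[Sequence[float]]   # stand-in for an IK solution, one row per timestep
    replay_succeeded: bool                        # stand-in for the result of playback on a robot


def passes_observation_check(ep: Episode) -> bool:
    # Observation level: reject episodes whose views disagree too much.
    return ep.reprojection_error_px <= MAX_REPROJECTION_ERROR_PX


def passes_kinematic_check(ep: Episode) -> bool:
    # Kinematic level: every joint angle at every timestep must stay within limits.
    return all(
        low <= angle <= high
        for frame in ep.joint_trajectory
        for angle, (low, high) in zip(frame, JOINT_LIMITS_RAD)
    )


def passes_policy_check(ep: Episode) -> bool:
    # Policy level: playback on a physical robot is the final gate.
    return ep.replay_succeeded


def filter_episodes(episodes: List[Episode]) -> Tuple[List[Episode], float]:
    """Return the episodes that pass all three checks and the effective yield."""
    checks = (passes_observation_check, passes_kinematic_check, passes_policy_check)
    kept = [ep for ep in episodes if all(check(ep) for check in checks)]
    yield_rate = len(kept) / len(episodes) if episodes else 0.0
    return kept, yield_rate
```

In this framing, the "effective data yield" is simply the share of collected episodes that survive every check.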
The company notes that robot-free data and real-robot data can complement each other. Combining approximately 10 human-collected demonstration segments with 1 segment recorded on a real robot achieves performance on test tasks comparable to that of a training set composed entirely of real-robot data. Human-collected data provides broad behavioral coverage, while a small amount of real-robot data anchors physical parameters such as motor latency and friction. Under test conditions, this ratio reduces the demand for real-robot data by up to 20 times.
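As a rough illustration of the roughly 10-to-1 mix described above, the sketch below assembles a training set from two pools of segments. It is a minimal example under stated assumptions: the function name `mix_demonstrations`, its parameters, and the uniform random sampling are hypothetical and do not describe how X Square Robot actually builds its training sets.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def mix_demonstrations(
    human_segments: Sequence[T],
    robot_segments: Sequence[T],
    human_per_robot: int = 10,  # assumed ratio, taken from the article's "approximately 10 to 1"
    seed: int = 0,
) -> List[T]:
    """Build a shuffled training set with `human_per_robot` human segments per robot segment."""
    rng = random.Random(seed)
    # Use as many robot segments as both pools can support at the requested ratio.
    n_robot = min(len(robot_segments), len(human_segments) // human_per_robot)
    n_human = n_robot * human_per_robot
    mixed = rng.sample(list(human_segments), n_human) + rng.sample(list(robot_segments), n_robot)
    rng.shuffle(mixed)
    return mixed
```

At a 10-to-1 ratio, real-robot segments make up 1 in 11 of the mixed set, or about 9%.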
The G0-Dataset contains over 2,000 hours of validated demonstrations covering visual, tactile, and auditory modalities and spanning 3,000 distinct manipulation tasks with a long-tail distribution. At peak, an operator collects 93.2 segments per hour. The dataset supports large-scale pre-training and research on transfer across robot embodiments. X Square Robot states that policies trained with this framework generalize to collection environments with varying robot poses, table heights, and viewpoints, and show zero-shot transfer to robot platforms outside the training set, executing tasks without fine-tuning for the new platform.
This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com









