Researchers achieve scalable quantum neural network training on IonQ quantum computer
2026-06-10 15:34
Favorite

en.Wedoany.com Reported - A team of researchers has developed a quantum neural network training framework that reduces the cost of computing gradients during training—one of the long-standing major obstacles in the field of quantum machine learning.

According to the study published on the preprint server arXiv, the method reduces the number of circuit evaluations required per optimization step from scaling quadratically with the number of qubits to only logarithmic growth. The researchers stated that this improvement enables direct gradient-based training on IonQ's Forte Enterprise trapped-ion quantum computer and allows them to apply the method to a clinically relevant data imputation task.

According to the team, this work addresses a long-standing challenge in quantum machine learning. The team includes scientists from IonQ, Université Paris Cité, the French National Centre for Scientific Research (CNRS), QC Ware, and Quantum Signals. Quantum neural networks (QNNs) are quantum circuits with tunable parameters, trained in a manner similar to classical neural networks. Theoretically, they may offer advantages in certain learning tasks; however, training them on actual quantum hardware has proven difficult because computing gradients typically requires repeatedly running a large number of quantum circuits. The researchers report that this overhead is one of the main reasons many quantum machine learning demonstrations remain limited to simulations or extremely small-scale hardware experiments.

The framework combines three co-designed components, including a specialized circuit design, a layer-by-layer training strategy, and a parallel gradient computation technique.

Traditional parameter-shift methods, widely used for training quantum circuits, require separate circuit evaluations for individual parameters. As the model size increases, the number of required evaluations grows rapidly. The new framework avoids this bottleneck through three design choices. The first is a circuit architecture called the Butterfly network, inspired by the fast Fourier transform structure, which arranges quantum operations in a specific pattern to propagate information throughout the system while keeping the circuit relatively shallow. According to the study, this design significantly reduces the number of required trainable parameters as the system scales. The second is a layer-by-layer training strategy, which, instead of training every parameter in the quantum neural network simultaneously, first trains smaller circuit blocks and then gradually adds new layers, freezing previously trained layers while optimizing the new ones. The third is a parallelized version of the parameter-shift rule. Since gates within each butterfly layer act on different qubit pairs and commute with each other, the researchers can use a constant number of circuit executions to compute the gradient for the entire layer, rather than evaluating each parameter individually. Together, these techniques substantially reduce the number of quantum circuit evaluations required during training. The researchers reported scaling advantages through an example: applying the traditional parameter-shift method to a 128-qubit butterfly circuit required 1,792 circuit evaluations to compute the gradient, whereas their method required only 28.

To evaluate the framework, the researchers chose clinical data imputation, a problem beyond traditional quantum computing benchmarks. Data imputation involves filling in missing entries in datasets. In medical records, missing information is common due to inconsistent measurement schedules, sensor failures, or incomplete data collection, and accurate imputation can significantly impact downstream predictive models used in medical analysis. The team used the MIMIC-III dataset, a widely studied collection of de-identified intensive care unit records. They introduced missing values into the dataset and then compared various methods for reconstructing the missing information. Baselines included common statistical techniques such as mean imputation and zero-filling, as well as more complex methods like K-nearest neighbors imputation, multiple imputation by chained equations (MICE), MissForest, and the neural network-based Deep MICE model. The researchers indirectly assessed imputation quality by predicting patient survival rates, measured using the area under the receiver operating characteristic curve (AUC). Among classical methods, Deep MICE produced the strongest average performance, achieving an AUC of 0.7176. A hybrid quantum-classical model trained on 16 qubits achieved an AUC of 0.7147, while a 32-qubit hybrid model achieved an AUC of 0.7132, both within a few thousandths of the leading classical result. Although the quantum models did not surpass the best classical baseline, they exhibited a narrow performance range and low variability across multiple runs. The researchers suggested that this stability may indicate beneficial inductive biases brought by the structured butterfly architecture and training protocol.

The study provides a significant demonstration of direct training on a commercial quantum computer. The researchers trained the final layer of a 16-qubit butterfly quantum neural network on IonQ's Forte Enterprise trapped-ion system. The earlier stages of the model were trained in simulation and then integrated into the hardware-trained network. They compared three scenarios: ideal simulation, noisy simulation, and direct hardware execution. According to the results, the performance differences among the three training methods were not statistically significant. The hardware-trained model achieved results comparable to the simulated models while maintaining similar predictive performance. The researchers reported that this demonstrates the logarithmic scaling training framework is robust enough to operate under current hardware noise levels. This finding is important because many previous quantum machine learning demonstrations relied heavily on simulations rather than actual quantum processors, with hardware noise and long training times often making direct optimization impractical. The trapped-ion architecture used by IonQ may have been helpful, as the system provides all-to-all qubit connectivity, allowing the butterfly circuit to be implemented without significant compilation overhead.

The research also explored larger system sizes. Since direct 32-qubit training remains computationally intensive, the researchers used matrix product state tensor network simulations to train larger quantum layers, while inference was performed on IonQ hardware. The resulting 32-qubit hybrid model performed comparably to a classical neural network with an equivalent hidden layer width. The researchers interpreted this as evidence that larger quantum circuits produced through the layer-by-layer framework remain compatible with real hardware and can operate without measurable degradation.

The work includes several important limitations. The study focused on a controlled proof-of-concept imputation task rather than a production-scale medical workflow. Only one feature column was imputed using the quantum model, with the remaining missing values handled by classical methods. The missing data pattern was also generated using a completely random missingness model, whereas real-world clinical data typically exhibits more complex missingness patterns. Finally, the hybrid model matched but did not surpass the strongest classical baseline; the results demonstrate feasibility and competitiveness rather than a clear quantum advantage. The researchers also noted that larger systems may be required before potential performance advantages become apparent. Based on comparisons with classical neural network architectures, they estimated that approximately 128 qubits would be needed to match the representational capacity of the strongest classical model used in the study. Even so, the researchers believe the significance of the framework lies not in the current performance numbers, but in achieving scalable on-hardware training.

The research team includes Natansh Mathur from the Institut de Recherche en Informatique Fondamentale (IRIF), a joint research laboratory of CNRS and Université Paris Cité, as well as QC Ware in France. Co-authors Panagiotis Kl. Barkoutsos, Masako Yamada, and Martin Roetteler are affiliated with IonQ. The research also includes Iordanis Kerenidis, affiliated with IRIF, CNRS, Université Paris Cité, and Quantum Signals.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com