Innovative Tests of Artificial Intelligence in Mathematical Research
2026-02-10 09:28

Wedoany.com Report, Feb 10th: Ten renowned mathematicians recently collaborated on an experiment to assess the innovative potential of artificial intelligence in mathematical research. Each contributed an unpublished problem from their own area of study, one with no publicly available solution online. The aim was to test whether current advanced AI models can go beyond the scope of their training data and show the creative ability to solve new problems independently.

The research team used two language models: OpenAI's ChatGPT 5.2 Pro and Google's Gemini 3.0 Deep Think. The models were given unrestricted internet search access, so that the test would examine their practical reasoning and problem-solving on complex mathematical challenges rather than their reliance on pre-learned information alone.

In preliminary tests, Fields Medal-winning mathematician Martin Hairer observed that AI performed well at connecting known arguments and carrying out computations but showed limited ability to conduct truly original research. Hairer noted, "So far, I haven't seen a convincing example where a language model generated a genuinely novel idea or a fundamentally new concept." He compared AI to a "weak student who knows the starting point and the destination but can't find the right path."

Other mathematicians participating in the experiment reported similar experiences. Mathematician Tamara Kolda pointed out that AI lacks an independent perspective, which makes it difficult for it to serve as a genuine research collaborator. Hairer also mentioned that AI often exhibits overconfidence and that its solutions frequently take significant effort to verify. The behavior resembles that of a student who is adept at producing arguments that look plausible on the surface but lack substance.

The experiment serves not only as an assessment of AI's technical capabilities but also as an attempt to dispel the common misconception that "mathematics has been solved by AI." The researchers hope to ease concerns among students and young scholars that artificial intelligence could replace academic careers in mathematics. The ten problems have been publicly available online since last week, so that researchers worldwide can test and discuss them before the official solutions are released on February 13th.

The experiment does not end here. The research team plans to launch a second round of tasks in a few months. Using feedback gathered from the first round, they intend to build a more objective and systematic benchmark for evaluating AI's mathematical capabilities. The ongoing effort is meant to support a more comprehensive assessment of AI's actual role and limitations in advancing the frontiers of mathematical research.

This bulletin is compiled and reposted from information provided by global Internet sources and strategic partners, and is intended to share information with readers. If there is any infringement or other issue, please inform us promptly and we will modify or delete the content accordingly. Unauthorized reproduction of this article is strictly prohibited. Email: news@wedoany.com