Swiss Team Develops 7-Billion-Parameter TutorRL Model, Balancing Subject Knowledge and Teaching Skills

2026-06-15 16:14

en.Wedoany.com Reported - Swiss postdoctoral researcher Jakub Mačina, in collaboration with informatics professor Mrinmaya Sachan and learning scientist Manu Kapur, has developed an AI learning model named "TutorRL" that is designed to balance subject expertise with teaching skills. The model has only 7 billion parameters, far fewer than current mainstream large language models with hundreds of billions or even trillions of parameters, and it is less prone to drifting off topic across learning interactions of up to 20 steps.

Mačina's research focuses on how to turn large language models into learning coaches with pedagogical value. He points out that most existing large language models are optimized to generate answers and solutions rather than to guide users toward thinking independently as they learn. Even when a prompt explicitly asks for learning support, the results are often unsatisfactory. To test how well different models are suited to teaching, Mačina and researchers from the Technical University of Darmstadt (TU Darmstadt) developed a math teaching benchmark called "MathTutorBench." Built on teacher dialogues and data from real teaching processes, the benchmark defines a scoring system for specific teaching skills so that responses from large language models can be compared and analyzed. The tests revealed that models often trade off subject knowledge against teaching skills, and that most models tend to lose the thread and drift off topic when giving step-by-step responses.

In a second project, Mačina developed the TutorRL model. The model is trained through multi-step interactions between a virtual student and a virtual teacher, which eliminates the need for expensive training data. During training, another model monitors the teaching process and evaluates the virtual teacher's responses; this feedback is what makes "reinforcement learning" possible. Mačina notes that the major advantage of this approach is that it does not require massive amounts of data and can work with smaller language models. Compared to the latest models from OpenAI or Google, which have hundreds of billions or even trillions of parameters, TutorRL's 7-billion-parameter scale is much smaller. Preliminary results show that TutorRL achieves a better balance between subject knowledge and teaching skills than traditional large language models and is less prone to drifting off topic. The model can also explain the reasons behind its answers and decisions during the learning process, making it easier for teachers to understand and monitor the teaching process.
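
To make the training setup described above more concrete, the following is a minimal Python sketch of a single simulated tutoring episode. It is an illustration only and is not taken from the TutorRL code: the names rollout, Episode, toy_teacher, toy_student and toy_judge are hypothetical, the three "models" are stand-in functions rather than language models, and the reward rule (reward a guiding question, penalize revealing the answer) is an assumption chosen for the example. The sketch only collects the judge's per-turn rewards; the reinforcement learning update that would use them is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)


@dataclass
class Episode:
    """One simulated tutoring dialogue plus the judge's per-turn rewards."""
    turns: List[Turn] = field(default_factory=list)
    rewards: List[float] = field(default_factory=list)

    @property
    def total_reward(self) -> float:
        return sum(self.rewards)


def rollout(
    teacher: Callable[[List[Turn]], str],
    student: Callable[[List[Turn]], str],
    judge: Callable[[List[Turn]], float],
    problem: str,
    max_steps: int = 20,  # the article mentions interactions of up to 20 steps
) -> Episode:
    """Run one teacher/student dialogue and score every teacher turn."""
    episode = Episode(turns=[("student", problem)])
    for _ in range(max_steps):
        teacher_msg = teacher(episode.turns)
        episode.turns.append(("teacher", teacher_msg))
        # The judge sees the whole dialogue so far and rates the latest teacher turn.
        episode.rewards.append(judge(episode.turns))
        student_msg = student(episode.turns)
        episode.turns.append(("student", student_msg))
        if student_msg.strip().lower() == "done":
            break
    return episode


# --- Hypothetical stand-ins for the three models (assumptions, not real APIs) ---

def toy_teacher(history: List[Turn]) -> str:
    # A "good" tutor turn: a guiding question instead of the solution.
    return "What happens if you subtract 3 from both sides?"


def toy_student(history: List[Turn]) -> str:
    teacher_turns = sum(1 for speaker, _ in history if speaker == "teacher")
    return "done" if teacher_turns >= 2 else "I get 2x = 8."


def toy_judge(history: List[Turn]) -> float:
    # Assumed reward rule: +1 for a question that does not leak the answer, else -1.
    _, last = history[-1]
    leaks_answer = "x = 4" in last
    return 1.0 if last.endswith("?") and not leaks_answer else -1.0


episode = rollout(toy_teacher, toy_student, toy_judge, "Solve 2x + 3 = 11")
print(episode.total_reward)  # 2.0
```

In an actual training run, the rewards gathered this way would feed a policy-update step for the teacher model; the sketch stops before that step because the article does not describe it.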

TutorRL is now freely available as an open-source model and has been downloaded more than a thousand times. However, the model has not yet been tested or evaluated with classroom learners and is currently only suitable for high school and early undergraduate mathematics instruction. Mačina believes that in the long term the model could also be applied to other STEM subjects (known in German-speaking countries as MINT: mathematics, informatics, natural sciences, and technology), and that its performance is sufficient to support master's-level courses. He states that the research is relevant not only to education but also to the further development of artificial intelligence more fundamentally, as collaborative problem-solving will become central to many future fields of work and human judgment will remain crucial.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com