CAICT Launches Standards Series Development and Quality Evaluation for High-Quality Medical Specialty and Disease-Specific Datasets
2026-06-15 14:53

en.Wedoany.com Reported - Advancing the construction of high-quality datasets is a crucial measure for implementing national data infrastructure development, the market-oriented allocation reform of data elements, and the digital and intelligent transformation of healthcare. The National Data Administration's "Implementation Plan for Promoting the Construction of High-Quality Datasets in Industries" (Guo Shu Ke Ji [2026] No. 25) focuses on advancing high-quality dataset construction in fields such as scientific research and healthcare, and on strengthening quality evaluation and mutual recognition of results. In accordance with this plan, the China Academy of Information and Communications Technology (CAICT) is accelerating the construction of high-quality datasets in the healthcare sector, along with various tasks related to national data infrastructure. Through multiple measures, CAICT aims to leverage the role of high-quality datasets in model training, intelligent agent development, and clinical decision support within the healthcare industry, making them a key lever for improving data supply quality, unlocking the value of data elements, and enhancing public service capabilities.

Currently, the medical industry generally faces core pain points such as inconsistent specialty data standards, uneven data quality, and difficulty in realizing value, all of which severely hinder the transformation of medical data into high-value assets. Against this backdrop, CAICT has officially launched the development of a standards series, together with quality evaluation work, for "High-Quality Medical Specialty and Disease-Specific Datasets." Drawing on its deep expertise in data governance and evaluation, CAICT is committed to providing high-quality data support for specialty medical artificial intelligence development and real-world research, and to promoting the compliant and secure transformation of medical data into high-value data assets.

I. Building a Tiered Standard System Covering "24 Specialties + 18 Diseases"

Based on general dataset construction requirements and actual clinical practice, CAICT has established a high-quality dataset standard system covering 24 key clinical specialties and 18 common key diseases. Following the principle of prioritizing urgent industry needs, the standards will be initiated and developed in phases.

First Batch Initiation: In collaboration with Academician Zhong Nanshan's team, the standard for high-quality respiratory medicine datasets has been initiated, marking the start of work on construction specifications for high-quality specialty datasets in the healthcare industry. At the same time, in cooperation with the chairpersons of several specialty physician branches and committees under the Chinese Medical Doctor Association, standards have been initiated for high-quality datasets in areas with large application bases and strong demand for AI deployment, such as stroke specialty, heart failure, and coronary heart disease.

Planned Progress for 2026: CAICT will accelerate the construction of dataset standards in high-demand areas such as internal medicine of traditional Chinese medicine, infectious diseases, burn surgery, dermatology, and sudden cardiac death.

Long-Term Advancement: CAICT will continue to extend coverage to key specialties and diseases such as pediatrics, oncology, lung cancer, and breast cancer, ultimately building a refined dataset matrix covering core clinical business scenarios.

Framework of the High-Quality Dataset Standard System for Specialties and Diseases

II. Launching the First Batch of Quality Evaluation for Medical Specialty and Disease-Specific High-Quality Datasets

The quality evaluation is based on the "Technical Document for Quality Evaluation Specifications of High-Quality Datasets" and the series of specialty and disease-specific standards. It comprises a comprehensive assessment across 17 indicators in three core dimensions (documentation, data quality, and model application), along with in-depth verification and compliance review of specialty and disease-specific indicators.

Evaluation Indicators from the "Technical Document for Quality Evaluation Specifications of High-Quality Datasets"

The first batch of quality evaluation for medical specialty and disease-specific high-quality datasets is officially launched today, prioritizing areas such as respiratory medicine, stroke specialty, heart failure, and coronary heart disease. Document review, verification of specialty technical indicators, compliance checks, and report generation are expected to take place between July and August 2026.

For high-quality datasets that pass the evaluation, CAICT will issue a comprehensive assessment report and an evaluation certificate, and will recommend them for practical application in multiple national AI bases (medical). CAICT will also actively help participating entities connect with local data exchanges to facilitate the listing, circulation, and trading of high-quality medical datasets. In addition, CAICT will continue to build bridges for industry-academia-research-application collaboration, organizing sharing and exchange activities on dataset construction practices among participants to jointly promote the development of the medical data element ecosystem.

Going forward, CAICT will continue to deepen and improve the standard matrix and quality evaluation system for high-quality medical specialty and disease-specific datasets. CAICT welcomes all parties to participate in this endeavor, laying a solid and reliable data foundation for the application of medical large models and the digital and intelligent transformation of the industry.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com