Artificial intelligence in diabetes care requires better benchmarks
A correspondence published in The Lancet Diabetes & Endocrinology highlighted the need for validated, clinician-informed benchmarks so that artificial intelligence can be used effectively to interpret continuous glucose monitoring data.
Continuous glucose monitoring (CGM) provides standardized metrics, including time in range and measures of hypoglycemia and hyperglycemia. However, interpretation of these data remains variable in clinical practice, lead correspondence author David C. Klonoff, MD, and colleagues argued.
These differences in interpretation reflect patient-specific factors (such as insulin regimen, age, and health beliefs), behavioral influences (including diet and physical activity), and contextual features within glucose profiles (such as time-of-day patterns and trends). As a result, even experienced clinicians may reach different conclusions when assessing the same CGM data.
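To make the standardized metrics concrete, the sketch below shows one way the share of readings below, within, and above a target range might be computed from a list of glucose values. The 70 to 180 mg/dL range is an assumption based on commonly used consensus targets, not a detail from the correspondence, and the function name is hypothetical. It is an illustration only, not a clinical tool.

```python
from typing import Dict, Sequence

# Assumed target range in mg/dL (commonly used consensus thresholds;
# not specified in the correspondence summarized here).
LOW_MG_DL = 70
HIGH_MG_DL = 180


def cgm_summary(readings_mg_dl: Sequence[float]) -> Dict[str, float]:
    """Return the percent of readings below, within, and above the target range."""
    if not readings_mg_dl:
        raise ValueError("at least one glucose reading is required")

    n = len(readings_mg_dl)
    below = sum(1 for g in readings_mg_dl if g < LOW_MG_DL)
    above = sum(1 for g in readings_mg_dl if g > HIGH_MG_DL)
    in_range = n - below - above

    return {
        "time_below_range_pct": 100.0 * below / n,
        "time_in_range_pct": 100.0 * in_range / n,
        "time_above_range_pct": 100.0 * above / n,
    }
```

Two patients can share the same percentages while differing in the time-of-day patterns and trends described above, which is part of why the same summary numbers can be read differently by different clinicians.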
This variability poses a challenge for AI systems designed to support clinical decision-making, according to the authors. For AI tools to be clinically meaningful and safe, they must be trained on datasets that reflect expert clinical reasoning and judgment. This requires the development of “gold standard” benchmarks derived from clinician interpretation.
One such benchmark cited by the authors is the Glycemia Risk Index (GRI), a composite metric that integrates multiple CGM parameters into a single score reflecting overall glycemic quality. Developed using clinician assessments of CGM tracings across different diabetes types and treatments, the GRI correlates more closely with clinician judgment than individual metrics do and may serve as a useful reference standard for evaluating AI models.
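As an illustration of how a composite score of this kind can be built from individual CGM parameters, here is a minimal sketch of a GRI-style calculation. The correspondence summary does not give the formula; the weights and the cap at 100 below follow the original GRI publication as commonly described and should be treated as an assumption to verify against that source. The function and parameter names are hypothetical.

```python
def glycemia_risk_index(
    vlow_pct: float,   # percent of time below 54 mg/dL (assumed definition)
    low_pct: float,    # percent of time from 54 to below 70 mg/dL (assumed)
    high_pct: float,   # percent of time above 180 up to 250 mg/dL (assumed)
    vhigh_pct: float,  # percent of time above 250 mg/dL (assumed)
) -> float:
    """Return a GRI-style score from 0 (lowest risk) to 100 (highest risk)."""
    for value in (vlow_pct, low_pct, high_pct, vhigh_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("each input must be a percentage between 0 and 100")

    # Weighted components (assumed weights): very low and very high
    # readings count more heavily than low and high readings.
    hypo_component = vlow_pct + 0.8 * low_pct
    hyper_component = vhigh_pct + 0.5 * high_pct

    # Hypoglycemia is weighted more heavily than hyperglycemia; cap at 100.
    return min(100.0, 3.0 * hypo_component + 1.6 * hyper_component)
```

For example, under these assumptions `glycemia_risk_index(1, 3, 20, 5)` gives a score of about 34.2. An AI model's assessment of a CGM tracing could then be compared against a score like this, in the spirit of the reference-standard role the authors describe.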
The authors also recommended developing additional benchmarks tailored to specific patient subpopulations.
Full disclosures of the researchers are listed here.
AACE Endocrine AI is published by Conexiant under a license arrangement with the American Association of Clinical Endocrinology, Inc. (AACE®). The ideas and opinions expressed in AACE Endocrine AI do not necessarily reflect those of Conexiant or AACE. For more information, see Policies.