Deep learning model uses hand images to improve acromegaly detection
A deep learning (DL) model accurately identified acromegaly from hand photographs while preserving patient privacy by avoiding facial and fingerprint features, according to a study published in The Journal of Clinical Endocrinology & Metabolism.
“This study highlights the potential of AI to enable accurate disease identification based solely on external physical traits, without compromising patient anonymity,” wrote lead researcher Hidenori Fukuoka, MD, of the Division of Diabetes and Endocrinology, Department of Internal Medicine at Kobe University Hospital in Kobe, Japan, and colleagues. “The ability of such a model to assist healthcare providers who may not have specialized training represents a significant step forward in diagnostic accuracy and in promoting equitable health care delivery.”
Researchers recruited 716 participants (317 with acromegaly and 399 controls; median ages 57 and 56 years, respectively) from 15 Japanese pituitary centers. Among patients with acromegaly, 82% had undergone surgery and 59% were in biochemical remission at the time of imaging.
The deep learning model, based on a modified ResNet-50 architecture, was trained in PyTorch with data augmentation and 5-fold cross-validation. The dataset was divided by facility, with 12 centers contributing to training/validation and three centers providing an independent test set.
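To illustrate why a facility-level split matters, the following is a minimal Python sketch of holding out whole centers for testing and then building 5 cross-validation folds from the remaining data. The record fields, the facility numbering, and the choice to assign folds by patient are assumptions made for illustration; the article does not describe how the researchers constructed their folds.

```python
import random
from collections import defaultdict

def split_by_facility(records, test_facilities):
    """Separate records into development and held-out test sets by facility."""
    dev = [r for r in records if r["facility"] not in test_facilities]
    test = [r for r in records if r["facility"] in test_facilities]
    return dev, test

def patient_folds(dev_records, n_folds=5, seed=0):
    """Assign each patient to one fold so a patient's images never span folds.

    Assumption: patient-level grouping; the study's fold design is not stated.
    """
    patients = sorted({r["patient_id"] for r in dev_records})
    random.Random(seed).shuffle(patients)
    fold_of = {pid: i % n_folds for i, pid in enumerate(patients)}
    folds = defaultdict(list)
    for r in dev_records:
        folds[fold_of[r["patient_id"]]].append(r)
    return [folds[i] for i in range(n_folds)]

# Toy data (hypothetical): 15 facilities, 4 patients each, 2 images per patient.
records = [
    {"facility": f, "patient_id": f"F{f}-P{p}", "image": f"F{f}-P{p}-{k}.png"}
    for f in range(15) for p in range(4) for k in range(2)
]

# Hold out three facilities entirely; the other twelve feed cross-validation.
dev, test = split_by_facility(records, test_facilities={12, 13, 14})
folds = patient_folds(dev)
```

Because no test-set center contributes any training image, site-specific cues such as lighting or camera setup cannot leak into the reported test performance.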
The model analyzed morphological features of the dorsal hand and the “fist sign,” defined as the inability to cover the fingernails with the center of the palm when making a fist. Gradient-weighted class activation mapping (Grad-CAM) confirmed that the model attended to relevant anatomical regions. Images excluded the palm and fingerprints to minimize identifiability. A total of 11,480 images were collected, and predictions were averaged across four images per patient.
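Averaging across a patient's images can be sketched as follows. This is an illustrative example only: the patient IDs, probabilities, and the 0.5 decision threshold are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean

def patient_level_scores(image_scores):
    """Average per-image probabilities into one score per patient."""
    by_patient = defaultdict(list)
    for patient_id, prob in image_scores:
        by_patient[patient_id].append(prob)
    return {pid: mean(probs) for pid, probs in by_patient.items()}

# Hypothetical per-image model outputs: (patient ID, predicted probability).
image_scores = [
    ("A", 0.9), ("A", 0.7), ("A", 0.8), ("A", 0.6),
    ("B", 0.2), ("B", 0.1), ("B", 0.4), ("B", 0.3),
]

scores = patient_level_scores(image_scores)

# Hypothetical threshold; the study's "optimal threshold" value is not reported here.
predicted = {pid: score >= 0.5 for pid, score in scores.items()}
```

Pooling several views in this way tends to dampen the effect of a single poorly framed or ambiguous photograph on the patient-level call.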
At the optimal threshold, the model achieved 89% sensitivity and 91% specificity, with an F1 score of 0.89 and an area under the receiver operating characteristic curve (AUROC) of 0.96. It outperformed 10 board-certified endocrinologists, whose F1 scores ranged from 0.43 to 0.63 on the same test set.
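For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and F1 score follow from the four cells of a confusion matrix. The counts are illustrative placeholders, not the study's test-set counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)   # share of true cases the model detects
    specificity = tn / (tn + fp)   # share of controls correctly cleared
    precision = tp / (tp + fp)     # share of positive calls that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Illustrative counts only (hypothetical, not from the study).
metrics = classification_metrics(tp=45, fp=10, tn=90, fn=5)
```

F1 is the harmonic mean of precision and sensitivity, so it penalizes a reader or model that scores well on one but poorly on the other.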
Subgroup analyses showed consistent performance across age, remission status, and sex. Sensitivity and specificity were 89% and 90%, respectively, in patients younger than 57 years and 86% and 95%, respectively, in older patients. Performance was similar in patients with and without remission (AUROC of 0.94 vs 0.97) and in male vs female patients (AUROC of 0.97 vs 0.94; F1 score of 0.93 vs 0.84, respectively).
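A subgroup AUROC comparison of this kind can be sketched in a few lines. The example below uses the pairwise-ranking definition of AUROC (the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as half). The rows, field names, and scores are hypothetical, and the article does not state which software the researchers used for this analysis.

```python
def auroc(labels, scores):
    """AUROC as the fraction of case/control pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one control")
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auroc_by_subgroup(rows, key):
    """Group rows by a field (e.g. sex) and compute AUROC within each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    return {
        name: auroc([r["label"] for r in members], [r["score"] for r in members])
        for name, members in groups.items()
    }

# Hypothetical patient-level scores; label 1 = acromegaly, 0 = control.
rows = [
    {"sex": "male", "label": 1, "score": 0.9},
    {"sex": "male", "label": 1, "score": 0.6},
    {"sex": "male", "label": 0, "score": 0.4},
    {"sex": "male", "label": 0, "score": 0.2},
    {"sex": "female", "label": 1, "score": 0.8},
    {"sex": "female", "label": 1, "score": 0.3},
    {"sex": "female", "label": 0, "score": 0.5},
    {"sex": "female", "label": 0, "score": 0.1},
]

result = auroc_by_subgroup(rows, key="sex")
```

The pairwise form is quadratic in sample size, which is acceptable for a test set of this scale; larger datasets would normally use a rank-based implementation.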
Error analysis identified seven false positives, most of which endocrinologists correctly classified, and six false negatives, which were also frequently misclassified by physicians, suggesting these cases were inherently difficult.
The researchers noted several limitations that may restrict generalizability: incomplete biochemical confirmation in some control participants, a higher prevalence of acromegaly than in general practice, and restriction to Japanese patients from specialist centers. Many patients with acromegaly were also in remission at the time of imaging.
The study was funded by the Hyogo Science and Technology Association. The researchers reported no conflicts of interest.
AACE Endocrine AI is published by Conexiant under a license arrangement with the American Association of Clinical Endocrinology, Inc. (AACE®). The ideas and opinions expressed in AACE Endocrine AI do not necessarily reflect those of Conexiant or AACE.