Machine learning may predict thyroid CLNM

Model "offers a novel strategy for personalized management of cN0 T1–T2 PTC patients."

By Matthew Solan, May 28, 2026

An interpretable machine learning model integrating tumor, inflammatory, and thyroid function markers predicted central lymph node metastasis in patients with clinically node-negative T1 to T2 papillary thyroid carcinoma, according to a retrospective study published in Frontiers in Endocrinology.

The study was led by Yalin Zhu, of the Department of Ultrasound at The First Affiliated Hospital of Dalian Medical University in Dalian, China. Zhu and colleagues analyzed data from 710 patients with papillary thyroid carcinoma (971 lesions) who underwent thyroidectomy and central lymph node dissection at that hospital between January 2020 and June 2024. The training cohort included 568 patients with 776 lesions, and the test cohort included 142 patients with 195 lesions. After the researchers corrected for sample overlap between the training and test datasets, the final test set included 110 lesions.
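The article does not say how the overlap was corrected. As a rough illustration only, one common way to keep lesion-level data from leaking across a split is to assign every lesion from the same patient to the same side. The function name, the `patient_id` field, and the toy records below are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict

def split_by_patient(lesions, test_fraction=0.2, seed=0):
    """Split lesion records so all lesions of one patient land on the same side."""
    by_patient = defaultdict(list)
    for lesion in lesions:
        by_patient[lesion["patient_id"]].append(lesion)

    # Shuffle patients (not lesions) so the split unit is the patient.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    n_test = round(len(patient_ids) * test_fraction)
    test_ids = set(patient_ids[:n_test])

    train = [l for pid in patient_ids if pid not in test_ids for l in by_patient[pid]]
    test = [l for pid in patient_ids if pid in test_ids for l in by_patient[pid]]
    return train, test

# Hypothetical toy data: 10 patients, odd-numbered patients have two lesions.
lesions = [{"patient_id": p, "lesion_id": f"{p}-{k}"}
           for p in range(10) for k in range(1 + p % 2)]
train, test = split_by_patient(lesions)
```

With a patient-level split, no patient contributes lesions to both the training and test sets, which is the property a lesion-level random split cannot guarantee.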

Patients met criteria for clinically node-negative T1 to T2 disease and had postoperative confirmation of papillary thyroid carcinoma and central lymph node status.

The researchers evaluated multimodal predictors across four domains: pathology-related features, ultrasound characteristics, thyroid function, and inflammatory markers.

Six variables were selected using the least absolute shrinkage and selection operator (LASSO): age, tumor size, tumor laterality, free triiodothyronine (FT3), platelet-to-lymphocyte ratio (PLR), and systemic immune-inflammation index (SII). These variables were then used to train and compare six machine learning models: decision tree, K-nearest neighbors, random forest, support vector machine, logistic regression, and gradient boosting decision tree (GBDT).
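To show why LASSO can act as a feature selector, here is a minimal sketch of LASSO fitted by cyclic coordinate descent. It assumes a plain linear-regression objective on a tiny made-up dataset; the study's actual LASSO setup (outcome model, penalty strength, software) is not described in the article, so none of this should be read as the authors' implementation.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    Assumes no feature column is all zeros (otherwise the update divides by zero).
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (norm / n)
    return b

# Toy data (illustrative only): y depends on feature 0 and not on feature 1.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2, -2, 2, -2]
coefs = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, c in enumerate(coefs) if c != 0.0]
```

The L1 penalty sets the coefficient of the uninformative feature to exactly zero, so only features with nonzero coefficients are carried forward, which is the sense in which six variables can be "selected".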

On the corrected test set, the GBDT model outperformed the other models, achieving an area under the curve (AUC) of 0.812, with 76% accuracy, 84% sensitivity, and 70% specificity.
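For readers less familiar with these metrics, the sketch below computes accuracy, sensitivity, specificity, and AUC from labels and predicted scores. The labels, scores, and the 0.5 cutoff are invented for illustration and do not reproduce the study's data or its reported values.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = CLNM present)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # share of true CLNM cases detected
        "specificity": tn / (tn + fp),   # share of non-CLNM cases correctly cleared
    }

def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and model scores for 10 lesions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
```

Accuracy, sensitivity, and specificity depend on the chosen cutoff, while AUC summarizes ranking quality across all cutoffs.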

The final multimodal GBDT model, which incorporated age, tumor size, laterality, FT3, PLR, and SII, outperformed models based on pathology, ultrasound, inflammatory, or thyroid function features alone. A baseline clinical model using only age, tumor size, and laterality achieved an AUC of 0.670, compared with 0.812 for the multimodal model.

Multivariable logistic regression identified five independent predictors of central lymph node metastasis (CLNM): bilateral tumors, tumor size greater than 1 centimeter (cm), age 55 years or younger, SII greater than 449.85, and PLR of 134.88 or lower. Bilateral disease was associated with nearly three times the odds of CLNM compared with unilateral disease. Tumor size greater than 1 cm and younger age were also independently associated with higher CLNM risk.
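The two inflammatory indices are simple ratios of routine blood counts. The sketch below uses their standard definitions (PLR = platelets / lymphocytes; SII = platelets × neutrophils / lymphocytes) together with the cutoffs quoted above. The article does not state the units used in the study, so the example assumes counts in 10^9/L, and the blood count values are hypothetical.

```python
SII_CUTOFF = 449.85  # cutoff reported in the article
PLR_CUTOFF = 134.88  # cutoff reported in the article

def plr(platelets, lymphocytes):
    """Platelet-to-lymphocyte ratio."""
    return platelets / lymphocytes

def sii(platelets, neutrophils, lymphocytes):
    """Systemic immune-inflammation index."""
    return platelets * neutrophils / lymphocytes

def inflammatory_flags(platelets, neutrophils, lymphocytes):
    """Flag the two inflammatory risk patterns described in the article."""
    return {
        "high_sii": sii(platelets, neutrophils, lymphocytes) > SII_CUTOFF,
        "low_plr": plr(platelets, lymphocytes) <= PLR_CUTOFF,
    }

# Hypothetical blood count, assumed to be in 10^9/L.
flags = inflammatory_flags(platelets=250, neutrophils=4.0, lymphocytes=2.0)
```

In this made-up example PLR is 125 and SII is 500, so both flags are raised.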

In the overall cohort, 45% of patients with CLNM had bilateral tumors vs 25% of those without CLNM, while 37% of patients with CLNM had tumors larger than 1 cm vs 20% of those without nodal metastasis. Patients with CLNM were also younger, more likely to have multifocal disease, and more likely to show calcifications on ultrasound.
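As a back-of-the-envelope check, the percentages above can be turned into crude (unadjusted) odds ratios. This is arithmetic on rounded figures from the article, not a result reported by the researchers, and it will not match the adjusted odds ratios from the multivariable model.

```python
def crude_odds_ratio(p_exposed_in_cases, p_exposed_in_controls):
    """Odds ratio from the share of 'exposed' patients in cases vs non-cases."""
    odds_cases = p_exposed_in_cases / (1 - p_exposed_in_cases)
    odds_controls = p_exposed_in_controls / (1 - p_exposed_in_controls)
    return odds_cases / odds_controls

# Bilateral tumors: 45% of CLNM patients vs 25% of non-CLNM patients.
or_bilateral = crude_odds_ratio(0.45, 0.25)

# Tumor size > 1 cm: 37% of CLNM patients vs 20% of non-CLNM patients.
or_size = crude_odds_ratio(0.37, 0.20)
```

Both crude ratios come out at roughly 2.3 to 2.5. The gap between the crude bilateral estimate and the "nearly three times" figure illustrates how adjustment for other variables can shift an odds ratio.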

Although FT3 was not independently associated with CLNM in multivariable analysis, ablation testing showed that inclusion of FT3 improved model performance. Removing FT3 reduced the AUC from 0.812 to 0.731 and reduced calibration performance and decision curve net benefit.

Robustness analyses included alternative feature selection, class imbalance handling, missing data approaches, measurement interval sensitivity, and patient-level clustering assessment. In a patient-level sensitivity analysis of 100 patients with solitary lesions, the model attained an AUC of 0.804.

External validation in an independent cohort of 50 patients from the same institution demonstrated 78% accuracy, 88% sensitivity, and 69% specificity, with an AUC of 0.800.

Decision curve analysis showed positive net benefit across predicted risk thresholds ranging from 0% to 85%, with the greatest clinical utility between 10% and 50%, the range researchers identified as most relevant for decisions regarding prophylactic central lymph node dissection.
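Decision curve analysis rests on the net benefit formula, NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the risk threshold at which a clinician would act. The sketch below evaluates it for a model and for a "treat everyone" strategy. The labels and scores are invented for illustration and are unrelated to the study's data.

```python
def net_benefit(y_true, scores, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold (threshold < 1)."""
    n = len(y_true)
    tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical labels (1 = CLNM) and predicted risks for 10 lesions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

nb_model_25 = net_benefit(y_true, scores, 0.25)
nb_all_25 = net_benefit_treat_all(y_true, 0.25)
nb_model_50 = net_benefit(y_true, scores, 0.50)
nb_all_50 = net_benefit_treat_all(y_true, 0.50)
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero, the net benefit of treating no one.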

To address interpretability, researchers used SHapley Additive exPlanations (SHAP) to quantify feature contributions. The SHAP summary plots ranked laterality, tumor size, age, SII, PLR, and FT3 as the most influential predictors. SHAP analysis showed that larger tumor size and higher SII and FT3 values contributed to higher predicted CLNM probability. Younger age and lower PLR also increased predicted metastatic risk.
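SHAP is built on Shapley values from cooperative game theory. The sketch below computes exact Shapley values by enumerating feature subsets for a small, made-up risk score. It is a simplification: it replaces "missing" features with a single baseline patient rather than averaging over background data as SHAP tooling typically does, and `toy_risk`, its coefficients, and the example patient are hypothetical rather than the study's GBDT model.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_risk(z):
    """Hypothetical risk score with a size-by-laterality interaction."""
    size_cm, bilateral, age = z
    return 0.2 * size_cm + 0.3 * bilateral - 0.01 * age + 0.1 * size_cm * bilateral

patient = [1.5, 1, 40]    # tumor size (cm), bilateral flag, age (years)
baseline = [1.0, 0, 50]   # reference patient used for absent features
phi = shapley_values(toy_risk, patient, baseline)
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that lets SHAP plots attribute a prediction to individual features.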

The researchers noted several limitations, including the retrospective single-center design, limited external validation sample size, operator dependence of ultrasound findings, and the absence of comparisons with newer deep learning methods.

"The model not only demonstrates strong predictive performance, aiding clinicians in accurate preoperative CLNM assessment to avoid under- or overtreatment, but also offers a novel strategy for personalized management of cN0 T1–T2 PTC patients," researchers wrote.

No conflicts of interest were reported.

AACE Endocrine AI is published by Conexiant under a license arrangement with the American Association of Clinical Endocrinology, Inc. (AACE^®). The ideas and opinions expressed in AACE Endocrine AI do not necessarily reflect those of Conexiant or AACE. For more information, see Policies.
