Amongst the 592 models developed in 143 articles, which included 140,767 patients with HNC, only 49 models (8%) from six articles were judged to have low ROB and low concerns for applicability. No external validation was performed for 480 models (81%). For the remaining 112 models, together with six additional models that were not eligible for the present review, 152 external validations were performed in 34,304 patients with HNC across 41 articles. The results for models externally validated at least twice are discussed below.
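The counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the figures stated in this paragraph; the variable names and the `pct` helper are illustrative and are not part of the review's methods.

```python
# Counts taken directly from the paragraph above.
total_models = 592    # models developed across all included articles
low_rob_models = 49   # low ROB and low concerns for applicability
not_validated = 480   # models with no external validation


def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Models with at least one external validation.
externally_validated = total_models - not_validated

print(f"Low ROB and applicability: {pct(low_rob_models, total_models)}%")
print(f"No external validation: {pct(not_validated, total_models)}%")
print(f"Externally validated models: {externally_validated}")
```

Under these assumptions, the rounded percentages and the remainder agree with the reported 8%, 81%, and 112 models.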
Models for xerostomia
Amongst 275 models for xerostomia, two models were externally validated at least twice.
The Beetz 2012b model for xerostomia six months after radiotherapy was validated in two studies. C-statistics ranged from 0.70 to 0.74. Calibration performance was reported in one study. One validation study was rated as having low ROB in all domains, while the other was rated as having high ROB in the analysis domain.
The Cavallo 2021 model for acute xerostomia during radiotherapy in patients with nasopharyngeal cancer was externally validated in the same study, using two different types of cohorts. C-statistics ranged from 0.68 to 0.73 and calibration plots were reported in both cohorts. Both validations were rated as having unclear ROB in the participants domain because no detailed information about recruitment was provided.
Models for dysphagia
Amongst 86 models for dysphagia, two models were externally validated at least twice.
The Christianen 2012 model for dysphagia six months after radiotherapy was validated in five studies. C-statistics ranged from 0.66 to 0.75. Calibration performance was assessed in all five studies, but four of them were rated as having high ROB in the analysis domain due to small sample size.
The Wopken 2014b model for tube feeding dependence six months after radiotherapy was externally validated in three studies. C-statistics ranged from 0.79 to 0.95, and calibration was evaluated in all three studies. Due to the small size of the validation datasets, all three validations were judged as having high ROB in the analysis domain.
Models for hypothyroidism
Of 66 models for hypothyroidism, two models were externally validated at least twice. In addition, one model that was not originally developed for patients with HNC was externally validated in this population.
The Boomsma 2012 model for hypothyroidism within two years after radiotherapy was externally validated in two studies. C-statistics ranged from 0.64 to 0.74, while only one study reported calibration performance. Both validation studies were rated as having high ROB in the analysis domain.
The Ronjom 2013 model for radiation-induced hypothyroidism was validated in three studies. C-statistics ranged from 0.65 to 0.69 and calibration plots were reported in only one study. Two validation studies were judged as having high ROB and the other as having unclear ROB in the analysis domain.
The Cella 2012 model was originally developed to predict radiation-induced hypothyroidism in patients with Hodgkin’s lymphoma. In two validation studies in patients with HNC, c-statistics ranged from 0.65 to 0.68, but calibration performance was not reported. One validation study was rated as having high ROB and the other as having unclear ROB in the analysis domain.
Models for temporal lobe injury
Amongst six models for temporal lobe injury, two were externally validated at least twice.
The OuYang 2023 model, using deep learning in patients with nasopharyngeal cancer, was validated in the same paper using two different cohorts. C-statistics ranged from 0.80 to 0.82, while calibration performance was assessed in both cohorts. Both validations were judged as having low ROB in all domains.
The Wen 2021 model was developed to predict temporal lobe injury in newly diagnosed nasopharyngeal cancer patients. The model was validated by OuYang 2023 using two cohorts. C-statistics ranged from 0.77 to 0.79, while calibration performance was not reported. Both validations were judged as having unclear ROB in the analysis domain.
Models for outcomes related to hoarseness, fatigue, nausea-vomiting, throat pain, and aspiration
No models for these outcomes were externally validated at least twice.