Researchers develop multiethnic model for identifying individuals with skin cancer

Researchers at the University of California San Diego School of Medicine have developed a new approach for identifying individuals with skin cancer that combines genetic ancestry, lifestyle and social determinants of health using a machine learning model. Their model, more accurate than existing approaches, also helped the researchers better characterize disparities in skin cancer risk and outcomes.

Skin cancer is among the most common cancers in the United States, with more than 9,500 new cases diagnosed every day and approximately two deaths from skin cancer occurring every hour. One important component of reducing the burden of skin cancer is risk prediction, which utilizes technology and patient information to help doctors decide which individuals should be prioritized for cancer screening.

Traditional risk prediction tools, such as risk calculators based on family history, skin type and sun exposure, have historically performed best in people of European ancestry because this group is better represented in the data used to develop the models. This leaves significant gaps in early detection for other populations, particularly those with darker skin, who are less likely to be of European ancestry. As a result, skin cancer in people of non-European ancestry is frequently diagnosed at later stages, when it is more difficult to treat, and these individuals tend to have worse overall outcomes.

To help correct this disparity, the researchers analyzed data from more than 400,000 participants in the National Institutes of Health’s All of Us Research Program, a nationwide initiative aimed at building a diverse database of patient data to inform new, more inclusive studies on a variety of health conditions. Drawing on All of Us participants ensured that the data had substantial representation from African, Hispanic/Latino, Asian, and mixed-ancestry populations.

Key findings from the study include:

  • The new model combines genetic and non-genetic determinants, such as lifestyle choices, socioeconomic variables and medication usage, to stratify individuals based on their likelihood of having skin cancer.

  • The model achieved 89% accuracy in identifying individuals with skin cancer across all populations, with 90% accuracy for individuals of European ancestry and 81% accuracy for people of non-European ancestry.

  • In a subset of participants who had genetic data but were missing data on lifestyle and social determinants of health, the model still retained 87% accuracy.

  • Genetic ancestry, especially the proportion of European ancestry, was a strong predictor of risk; individuals of European ancestry were at least 8 times more likely to be diagnosed with skin cancer.
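
The per-population accuracy figures above come from evaluating one model separately within each ancestry group. As a rough illustration of that kind of stratified check, here is a minimal Python sketch. It is not the study's code: the function name `accuracy_by_group`, the group labels and the toy predictions are all hypothetical and invented for this example.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return (overall accuracy, {group: accuracy}) for paired labels.

    Hypothetical helper for illustration only; not taken from the study.
    """
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("inputs must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    correct = defaultdict(int)  # correct predictions per group
    total = defaultdict(int)    # all predictions per group
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)

    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group


# Made-up labels (1 = skin cancer, 0 = no skin cancer); NOT study data.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 0, 0]
groups = ["EUR", "EUR", "EUR", "EUR",
          "non-EUR", "non-EUR", "non-EUR", "non-EUR"]

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
print(overall)    # 0.875
print(per_group)  # {'EUR': 1.0, 'non-EUR': 0.75}
```

Reporting accuracy per group alongside the overall figure makes a performance gap between populations visible, whereas a single pooled number can hide it.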

The new model is best framed as a clinical case-finding aid, meaning it can help identify people who should receive full-body skin exams from a dermatologist. This could enable earlier diagnosis in individuals with darker skin tones, potentially alleviating current disparities in skin cancer outcomes. The model may also be adaptable to other diseases, paving the way for more equitable, personalized medicine.

The study, published in Nature Communications, was led by Matteo D’Antonio, Ph.D., an assistant professor in the Department of Medicine, and Kelly A. Frazer, Ph.D., professor in the Department of Pediatrics at UC San Diego School of Medicine. Frazer is also a member of UC San Diego Moores Cancer Center. The research was supported by the American Cancer Society, the National Institutes of Health and the Alfred P. Sloan Foundation. The researchers declare no competing interests.

Source:

University of California – San Diego

Journal reference:

D’Antonio, M., et al. (2025). A highly accurate risk factor-based XGBoost multiethnic model for identifying patients with skin cancer. Nature Communications. doi: 10.1038/s41467-025-64556-y. https://www.nature.com/articles/s41467-025-64556-y
