“Our model shows reliable performance on frozen sections during brain surgery and in scenarios with significant diagnostic disagreement among human experts,” he said.
The tool was tested in five hospitals and outperformed both human pathologists and other AI models. A unique aspect of the new model is an “uncertainty detector,” which allows it not only to distinguish between cancer types with high accuracy but also to signal when it is unsure of its judgment, an important feature for high-stakes medical scenarios.
The new study builds on earlier work led by Yu to develop an AI system that could reliably decode the molecular features of different types of gliomas.
How PICTURE spots brain-cancer doppelgangers
Each year, more than 300,000 people worldwide are diagnosed with tumors in the brain or central nervous system, and more than 200,000 deaths occur as a result. The World Health Organization recognizes about 109 different types of brain and spinal cord tumors, each with distinct features under the microscope or at the genetic level.
Accurately distinguishing primary central nervous system lymphoma (PCNSL) from glioblastoma during surgery could allow surgeons to spare brain tissue rather than remove it. Patients with PCNSL can then be referred for radiation and chemotherapy, the preferred treatments for this type of tumor. By contrast, glioblastoma requires surgical removal of as much of the cancerous brain tissue as possible.
A near PICTURE-perfect performance
The model — which Yu developed with study co-first authors Junhan Zhao and Shih-Yen Lin — was evaluated on 2,141 brain pathology slides collected worldwide, including rare cases across both frozen sections and formalin-fixed samples. It was designed to spot critical cancer features including tumor cell density, cell shape, and presence of necrosis.
The scientists tested PICTURE’s performance across five international hospitals in four countries. In every case, the AI model outperformed existing AI tools and traditional frozen-section assessment, the standard of care for real-time tumor typing.
In tests, the PICTURE model correctly distinguished glioblastoma from PCNSL more than 98 percent of the time — a level of accuracy that held up when tested in five independent international patient groups. In addition, PICTURE identified samples belonging to 67 CNS cancers that were neither gliomas nor lymphomas.
The model could also recognize tumors it had not seen during training and, when it did, raised a red flag for human review. In other words, the tool knew when it didn’t know, Yu said, which prevented the system from pigeonholing unclear cases into known categories. This feature sets the model apart from other AI systems, the researchers said. Other AI tools, by comparison, can differentiate only in a binary, either-or fashion: disease A versus disease B. That limitation is especially problematic for brain pathology, Yu noted, because there are more than 100 different subtypes of brain cancers, and many of them are relatively rare.
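For readers curious about what “knowing when it doesn’t know” can look like in software, the sketch below shows one simple and widely used idea: return a diagnosis only when the classifier’s confidence clears a threshold, and otherwise flag the case for human review. This is an illustration only, not the PICTURE team’s method, which the article does not describe. The function name, the review label, the 0.9 threshold, and the example probabilities are all hypothetical.

```python
from typing import Dict

# Hypothetical label for cases the classifier declines to call.
REVIEW = "flag for human review"


def classify_with_abstention(probs: Dict[str, float], threshold: float = 0.9) -> str:
    """Return the most likely class, or REVIEW if confidence is below threshold.

    `probs` maps each class name to a predicted probability. The 0.9
    default is an arbitrary illustrative value, not a clinical cutoff.
    """
    if not probs:
        raise ValueError("probs must contain at least one class")
    # Pick the class with the highest predicted probability.
    label, confidence = max(probs.items(), key=lambda item: item[1])
    # Abstain rather than force an unclear case into a known category.
    return label if confidence >= threshold else REVIEW


# Made-up probabilities for two hypothetical slides.
confident = classify_with_abstention({"glioblastoma": 0.97, "PCNSL": 0.03})
uncertain = classify_with_abstention({"glioblastoma": 0.55, "PCNSL": 0.45})
```

In this toy example the first slide is labeled glioblastoma, while the second falls below the threshold and is routed to a pathologist. A purely binary classifier would have forced the second slide into one of the two categories.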
PICTURE also outperformed human pathologists on hard-to-distinguish brain tumors. In tests, human specialists showed significant disagreement on difficult diagnoses, with some tumor types misdiagnosed 38 percent of the time. PICTURE correctly identified all of these cases, offering support when expert opinion varies.
Launching PICTURE into the real world
Deploying the tool could be a great opportunity for human-AI collaboration, the researchers said. They envision implementing the system across operating rooms and pathology departments as an initial filter to differentiate glioblastoma from PCNSL and inform in-the-OR treatment calls.
Using the model could also democratize access to neuropathology, a highly specialized field with a dearth of specialists who are unevenly distributed across the country and the world. The tool could also help train the next generation of pathologists to recognize look-alike lesions in the brain, where critical differences are obscured by similar appearance.
The researchers noted that most tumor samples were obtained from white patients, so more research is needed to confirm the model’s accuracy across diverse populations. And while the tool focused on glioblastoma and PCNSL, future work could expand it to other cancer types and combine it with genetic and molecular data for deeper insights.