To solve a problem, we have to see it clearly.
Whether it’s an infection by a novel virus or memory-stealing plaques forming in the brains of Alzheimer’s patients, visualizing disease processes in the body is the first step toward alleviating human suffering. It’s also often the most difficult and costly.
But an artificial intelligence (AI) breakthrough by Virginia Tech computer scientists published Sept. 16 in Cell Systems – a high-impact journal dedicated to biological research – is bringing those fog-bound processes into focus.
The new ProRNA3D-single tool, developed by Debswapna Bhattacharya, associate professor of computer science, and his research team, offers a more accurate way to predict and visualize what’s going on inside us when novel viruses and devastating neurological diseases attack – and a new pathway to treating them or preventing them altogether.
“The ultimate goal is to accelerate the drug discovery process to prevent the RNA viruses from interacting with host proteins, potentially stopping infections before they grow into pandemics or inhibiting altered function of RNA binding proteins in Alzheimer’s disease.”
Debswapna Bhattacharya, associate professor of computer science
A bilingual ChatGPT for biology
For decades, scientists have struggled to understand how viral ribonucleic acid (RNA) binds to human proteins to form complex 3D molecular structures. That’s important because those structures control whether pathogens such as SARS-CoV-2 can spread and whether diseases such as Alzheimer’s take hold.
AI systems are helping by creating an “alphabet” to represent DNA, RNA, and proteins, which researchers can then use to train large language models (LLMs) for biological sequences to analyze and simulate how these molecules interact in the body.
But ProRNA3D-single goes further than alphabets. It uses AI to generate finely detailed images of these molecules in 3D.
“The bio LLMs are basically like ChatGPT, but for biological sequences. And just like ChatGPT, we can ask our models questions and get answers,” Bhattacharya said.
The Virginia Tech team took two existing biological LLMs – one for proteins and another for RNA sequences – and created a third model that allows these LLMs to “talk” to each other. Out of those “conversations,” ProRNA3D-single can generate 3D structural models of viral RNA interacting with proteins in the body. It’s a big breakthrough.
“This is basically a neural pairing of two different large language models, leading to bilingual reasoning,” Bhattacharya said. “From a computer science standpoint, that’s a contribution in itself.”
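For readers who want a concrete picture of what "pairing" two language models can mean, here is a minimal sketch. It is not the team's code: it uses plain scaled dot-product attention rather than the geometric attention named in the paper's title, and every name in it (`pair_embeddings`, `w_query`, `w_key`, the array sizes) is a hypothetical stand-in. Random numbers take the place of real language model outputs.

```python
import numpy as np

def pair_embeddings(protein_emb, rna_emb, w_query, w_key):
    """Score every (protein residue, RNA nucleotide) pair.

    Hypothetical illustration only: projects both embedding sets into a
    shared space, then applies scaled dot-product attention scoring.
    """
    queries = protein_emb @ w_query          # (n_protein, d_shared)
    keys = rna_emb @ w_key                   # (n_rna, d_shared)
    logits = queries @ keys.T / np.sqrt(queries.shape[1])
    return 1.0 / (1.0 + np.exp(-logits))     # sigmoid squashes scores into [0, 1]

rng = np.random.default_rng(0)

# Stand-ins for per-residue outputs of two separate language models.
# The two models need not share an embedding width (8 vs. 6 here).
protein_emb = rng.normal(size=(5, 8))        # 5 residues
rna_emb = rng.normal(size=(3, 6))            # 3 nucleotides

# Assumed learned projections; random here because nothing is trained.
w_query = rng.normal(scale=0.1, size=(8, 4))
w_key = rng.normal(scale=0.1, size=(6, 4))

contact_map = pair_embeddings(protein_emb, rna_emb, w_query, w_key)
print(contact_map.shape)  # one score per residue-nucleotide pair
```

The result is a grid with one row per protein residue and one column per RNA nucleotide. In a trained system, a grid like this could hint at which parts of the two molecules sit close together, which is the kind of information a 3D model is built from.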
Even recent breakthrough AI models developed by Google DeepMind and others have fallen far short of accurately predicting and modeling protein-RNA complex 3D structures, leaving researchers to rely primarily on costly trial-and-error experiments.
But the new ProRNA3D-single method has significantly increased accuracy and has opened a promising new road to AI-assisted scientific discovery.
Bringing disease into focus
Little is known about how novel viruses such as SARS-CoV-2 evolve or how conditions such as dementia develop at the molecular level, but ProRNA3D-single helps fill those gaps and generate more accurate maps of the inner landscape. Now, instead of guessing, drug developers can analyze where viruses attach to human proteins and design treatments to block them. That could dramatically cut the time and cost of interventions and speed up responses to outbreaks.
“If you remember the COVID-19 pandemic and the mRNA-based vaccine that actually helped a lot – that vaccine worked because it was an RNA-based therapeutic,” said Sumit Tarafder, a fourth-year Ph.D. student on the project. “Modeling of protein-RNA interactions in 3D is crucial, so that we know where the drug can actually target molecules that cause disease.”
Not only that, but by generating new data about RNA-protein interactions, the ProRNA3D-single model creates insights that could lead to groundbreaking treatments for a range of maladies.
While the Virginia Tech team used viruses as a case study, “the method is fully generic. It’s not specific to a single type of virus or a family of viruses,” Bhattacharya said. “This method can be applied to any use case.”
Open science, global impact
Innovative methods like ProRNA3D-single don’t come easily. Two years of work have gone into this project.
Alumnus Rahmatullah Roche, ’24, did much of the coding, publishing more than a dozen papers on the subject during his doctoral work. He has since joined Columbus State University as a tenure-track assistant professor.
“The lead Ph.D. students did enormous work,” Bhattacharya said. “They did most of the heavy lifting.”
Discoveries like these can improve life on a national and even global scale, and as science in the public interest, this project has received funding from the National Institutes of Health and National Science Foundation. Not only is the research paper open access, but Bhattacharya is making the new tool itself freely available for scientists to try for themselves.
“We can’t overstate the importance of investing in science to benefit society. We believe that openness is the key to making science accessible to everybody,” Bhattacharya said. “Taxpayers fund us, so we have an obligation to give back, which is why we make our work open source and publicly available.”
The team hopes to continue development of the tool to improve its accuracy and get even more detailed models of various biological processes.
“We should constantly remind ourselves the problem is far from being solved,” Bhattacharya said. “We made progress, yes, but we’re mindful of the fact that these models have a long way to go.”
Journal reference:
Roche, R., et al. (2025). Single-sequence protein-RNA complex structure prediction by geometric attention-enabled pairing of biological language models. Cell Systems. doi.org/10.1016/j.cels.2025.101400