If she’s especially efficient, hospitalist Gigi Liu, MD, can complete a discharge summary in 20 minutes for a patient she treated for a few days at The Johns Hopkins Hospital in Baltimore. But summarizing visits with patients who stay longer “can suck up a whole hour,” she said. The summaries help guide patients’ care once they leave the hospital.
“Thirty minutes to an hour [of paperwork] is enough time for me to see another patient,” Liu said.
Liu is hopeful hospitalists will eventually be able to focus more on patient care and less on preparing hospital discharge summaries, one of the benefits cited in a new study that used large language models (LLMs), a form of artificial intelligence (AI), to draft the narratives. The study by researchers at the University of California San Francisco (UCSF), mostly hospitalists, was published recently in JAMA Internal Medicine.
It compared hospital discharge summaries created by hospitalists with those drafted by LLMs, which are capable of synthesizing large quantities of information into original content similar to what a doctor might create. The study didn’t find a significant difference in the summaries compiled by doctors and the LLM-generated narratives.
The latter were more concise and coherent than the physician-generated counterparts, but less comprehensive, the study reported. The LLM summaries also were more likely to contain errors, including omissions and inaccuracies, but still had a low potential for patient harm, the study found.
The study involved 100 randomly selected inpatient hospital medicine encounters of 3-6 days between 2019 and 2022 at UCSF. As part of the blinded study, 22 attending physician reviewers separately evaluated narratives created by hospitalists and those generated by the LLM, without knowing which method had produced each summary.
The reviewers didn’t find much difference between the two methods for overall quality and preference, according to the researchers who analyzed the results.
As hospitalists who hand off care to each other after stints of time on service, “we join the patient’s journey abruptly” for a week or longer, said Charumathi Raghu Subramanian, MD, one of the lead authors of the new study.
One duty of the last hospitalist caring for a patient is to discharge them from the hospital, she said.
“The discharge summary is such an important aspect of the patient’s journey,” she said. It summarizes clinically important aspects of a patient’s care as they transition from the hospital to post-acute community medicine.

The summary serves as a window to all that happened in the hospital for the clinicians who care for the patient after they leave, mostly primary care and skilled nursing facility physicians, Raghu Subramanian told Medscape Medical News. She and another hospitalist specializing in clinical informatics and digital transformation joined 20 other UCSF faculty as study authors.
Hospitalists sift through all the notes taken by every clinician who interacted with the patient during the hospital stay, she said. “It takes a lot of time.”
High-quality discharge summaries reduce medication errors, lower hospital readmission rates, and enhance primary care physician satisfaction, according to research the study cited.
The discharge summaries contain such elements as principal diagnosis, a medication list, and test results. The narrative sections include the patient’s history of illness and hospital course.
“Unlike a hospital progress note, which often reflects incremental daily documentation effort, a discharge summary can be considerably more involved, particularly for lengthy hospital encounters or when care has been provided by sequential physicians,” the study said.
In a 2021 survey of 815 American physicians cited in the UCSF study, 44% of hospitalists said they were too busy to prepare high-quality discharge summaries.
LLMs, such as generative pretrained transformers, hold promise in healthcare to save clinicians’ time, reduce burnout, and increase job satisfaction, the study indicated.
Raghu Subramanian explained that an LLM can draw on all the encounter notes extracted from a clinical data warehouse, or a clinician can feed the encounter notes to the LLM directly, to create a discharge narrative in far less time than it would take a physician.
Potential for Harm
While other studies have tested LLMs to create discharge summaries, the UCSF study involved actual hospital encounters with many patients over several days rather than curated vignettes, according to the hospitalist authors Medscape Medical News consulted.
Study authors thought reviewers should include both producers of discharge summaries (hospitalists) and consumers (skilled nursing and primary care doctors), Raghu Subramanian said.
The reviewers evaluated the summaries generated by hospitalists and the LLM and scored them separately based on their potential for harm; coherence, conciseness, and comprehensiveness; and which set of summaries reviewers preferred.
The LLM-generated narratives had more omissions and inaccuracies than the doctors’ summaries. But they contained a similar number of hallucinations — seemingly plausible but fabricated statements — which study authors found noteworthy considering the “well-documented propensity for LLMs to hallucinate.”
Harm scores based on those errors ranged from, on the low end, no potential for harm or potential for emotional distress or inconvenience, such as mild anxiety, to, on the high end, potential for injury or death, the study reported.
Still, most of the harm scores were low, according to hospitalist Benjamin Rosner, MD, PhD, senior author of the study. “In an ideal world, we want none of those to happen,” Rosner said.

“When we think of the potential of LLMs, a lot of people think it has to be perfect. Maybe that’s the wrong benchmark,” he said. He pointed out that the physician-generated discharge summaries also had errors and the potential for harm.
Many clinicians care for a patient in the hospital, but it is the last physician before discharge, typically a hospitalist, who writes the summary, Rosner explained.
The discharging physician has to read through every “inherently messy” hospital encounter note written by the many members of the care team across every day of hospitalization, he said.
From her outside perspective, Liu noted that most of the errors the study found were clerical, such as missing information a hospitalist might want to relay to the patient’s primary care provider about suggested follow-up care after the patient is discharged from the hospital.
“Hospitalists don’t necessarily type up their day-to-day progress notes as part of the discharge summary.” As a result, both physician- and LLM-generated summaries contained these types of omissions, explained Liu, who recently led AI workshops at national conferences for hospital medicine and internal medicine.
The discharge summaries also may not have included consultation notes, vital signs, lab values, radiology, pathology, and other clinical reports, she said.
In terms of causing harm, what was omitted might have improved the quality of the patient’s care, but it wouldn’t have severely impacted it, Liu said. One discharge summary omitted that the patient should have taken certain antibiotics. “Obviously it would be better to have been included. It was not so critical to cause harm, based on the study.”
Farzana Hoque, MD, an academic hospitalist in Missouri, was more alarmed by the errors.
“In this study, omissions were twice as common in LLM-generated summaries compared to human ones — a significant patient safety concern,” she said. For instance, failing to document that a patient’s doctor should follow up on a lung cancer concern could delay diagnosis and treatment, said Hoque, associate professor of medicine at Saint Louis University, St. Louis.
“Beyond patient safety, such omissions may also elevate malpractice risk for clinicians.”

Rosner said that if doctors make errors similar to those of LLMs, it represents an opportunity for the latter to create the discharge summaries, reduce the documentation burden on physicians, and ultimately curb burnout.
“We know LLMs are improving at summarizing.” So there’s potential for LLMs to move into clinical use and summarize clinical encounters, he said. Clinicians would still need to review the LLM-generated summaries, ensure they are accurate and of high quality, and edit them before signing off on them.
Future for Automated Summaries
Armed with the results of the UCSF study, health systems can take the next step of testing a functional LLM tool for creating discharge summaries in clinical settings, Rosner said. The process could be similar to how clinicians review AI medical scribe transcription technology, which is already being widely adopted, he said.
Large commercial electronic health record vendors are expected to soon release hospital summarization tools like the one studied at UCSF. “We will be assessing it ourselves,” Rosner said.
“It’s time now to study the implementation of those kinds of tools in actual clinical care. We need to prove the quality of the summaries from LLMs before we roll it out to scale,” he said. Rosner added that UCSF also plans to pilot its own LLM discharge summary tool.
Raghu Subramanian said researchers will need to test the safety, accuracy, and feasibility of any AI tool developed to create discharge summaries.
“In practice, when I use a tool, I decide how effective, safe, and easy to use it is.” Any LLM discharge summary tool “still needs a very astute, attentive human in the loop. It still needs to be reviewed by clinicians.”
Roni Robbins is a freelance journalist and former editor for Medscape Business of Medicine. She’s also a freelance health reporter for The Atlanta Journal-Constitution. Her writing has appeared in WebMD, HuffPost, Forbes, New York Daily News, BioPharma Dive, MNN, Adweek, Healthline, and others. She’s also the author of the multi-award–winning novel Hands of Gold: One Man’s Quest to Find the Silver Lining in Misfortune.