Court Rules AI News Summaries May Infringe Copyright

News publishers just cleared a key hurdle against Cohere in a copyright fight over AI-generated “substitutive summaries” of their reporting.

If you’re only keeping a loose eye on the fifty-plus AI copyright lawsuits now crawling their way through federal courts, you may have missed an important new ruling last week out of the Southern District of New York. It didn’t feature photorealistic superheroes, deepfaked celebrities, or anything bright and shiny enough to rack up shares on social media. Instead, it involved something far more mundane—but no less consequential for media organizations: AI-generated reproductions of news articles. And in a case that may help shape how freely AI companies can repackage internet news content, Judge Colleen McMahon held that “substitutive summaries”—outputs that mirror not just the underlying facts but the expressive structure and journalistic storytelling choices of the originals—may plausibly infringe copyright. That was enough to let the claims move forward.

The lawsuit, Advance Local Media LLC v. Cohere Inc., was filed in February 2025 by fourteen major news and magazine publishers including Forbes, Condé Nast, the Los Angeles Times, and The Atlantic. They allege that Cohere—a Canadian AI company behind the “Command” family of large language models—reproduces substantial portions of their works, sometimes near-verbatim, while bypassing publisher paywalls. They also claim that Command generates “hallucinated” content falsely attributed to their brands. Altogether, the complaint identifies more than 4,000 allegedly infringed works and includes 75 output examples that, the publishers say, closely track the structure, sequencing, tone, and expressive choices of the original reporting.

Last Thursday, Judge McMahon denied Cohere’s motion to dismiss (read order here), holding that the publishers had adequately alleged direct infringement, secondary infringement, and Lanham Act violations. Cohere didn’t challenge—at least not at this stage—allegations about training-data copying, retrieval-augmented generation (RAG), or outputs that reproduce verbatim or near-verbatim excerpts of the publishers’ works. Instead, the company zeroed in on the plaintiffs’ “substitutive summary” theory, arguing that any overlapping expression was minimal and that Command’s outputs were nothing more than factual digests.

Judge McMahon wasn’t convinced. “It is not possible to determine infringement through a simple word count,” she wrote; “the quantitative analysis of two works must always occur in the shadow of their qualitative nature.”

When Does a Summary Cross the Line?

The Cohere ruling comes as courts nationwide are grappling with “summary-style” AI outputs. Traditional summaries distill facts from longer works, and copyright doesn’t protect facts themselves. But it does protect the expressive arrangement of those facts: choices about structure, emphasis, pacing, and narrative arc that turn basic reporting into storytelling.

Long before AI, courts recognized that abridgments can infringe when they replicate those expressive elements. In 1999’s Nihon Keizai Shimbun v. Comline Business Data, the Second Circuit held that while defendants had “every right to republish the facts” contained in news articles, abstracts could still infringe when they tracked the original “sentence by sentence, in sequence” using the same structure and organization. Line drawing proved tricky even then: one abstract escaped liability by reporting the same facts “in a different arrangement, with a different sentence structure and different phrasing.” Another avoided infringement by copying only 20% of the original. Still, the court cautioned that “it is not possible to determine infringement through a simple word count,” a principle Judge McMahon invoked more than twenty-five years later in Cohere.

Recent AI cases show courts applying those same principles to machine-generated content. In an April 2025 ruling in New York Times v. Microsoft, Judge Sidney Stein dismissed claims by the Center for Investigative Reporting (CIR) alleging that Copilot’s bullet-point “abridgments” infringed its articles. The court found that those reorganized and skeletal summaries weren’t substantially similar, qualitatively or quantitatively, to the original CIR articles as a matter of law.

But just two weeks ago, the same judge allowed fiction authors’ output-based claims against OpenAI to proceed, finding that ChatGPT’s narrative summary of A Game of Thrones plausibly crossed into protected expression because it “conveys the overall tone and feel of the original work by parroting the plot, characters, and themes of the original.” In a blog post, copyright scholar Matthew Sag called the ruling “a fundamental assault on the idea expression distinction,” warning that if a 580-word ChatGPT summary infringes a 694-page novel, “thousands of Wikipedia entries” could find themselves in copyright crosshairs.

News articles present a unique challenge: they’re much shorter than novels—making wholesale copying easier—but contain far more unprotectable factual content. The question isn’t whether plot and characters were copied, but whether the journalist’s particular way of presenting facts—the structural choices, emphasis, and narrative flow that distinguish reporting from a police blotter—was appropriated. When summaries adopt those expressive choices, they may cross into infringement territory. Still, determining when exactly that’s happened isn’t always an easy task.

Why Cohere Matters

Cohere is one of the first major decisions to sustain a text-based output copying claim involving non-verbatim news summaries. For months, AI companies have pointed to Judge Stein’s dismissal of the CIR claims as proof that summary theories were DOA. Judge McMahon just showed the door is still very much open.

More broadly, Cohere continues a shift away from the abstract fight over AI training data and toward the more concrete issue of outputs. Whatever happens with fair use defenses around training, AI companies face real exposure when their outputs too closely mirror protected expression. For developers marketing news-oriented applications or research assistants, the message is clear: “summaries” aren’t a safe harbor but a fact-specific minefield requiring careful navigation.

A Tale of Two McMahon Rulings

It’s worth noting that Judge McMahon has now ruled on both ends of the AI-news spectrum. Last year I wrote about her dismissal of Raw Story Media v. OpenAI, where digital news outlets sued under the DMCA for removal of copyright management information. Without registered copyrights, they couldn’t bring infringement claims, only DMCA claims for CMI removal. And without evidence that ChatGPT actually disseminated their articles, they couldn’t show concrete harm for Article III standing. The case was dismissed.

The Cohere publishers learned from that failure. They came to court with registered copyrights, allowing them to bring full infringement claims. They also brought concrete examples of allegedly infringing outputs, and it was enough to allow their claims to proceed.

The Bottom Line

To be clear, the publishers haven’t won anything on the merits. They’ve simply lived to fight another day. Cohere will have ample opportunity to push back at summary judgment, where substantial similarity becomes an evidentiary question rather than a pleading exercise. Fair use will also loom large, especially for outputs that are shorter, more factual, or meaningfully transformative. But for now, Judge McMahon has made one thing clear: expressive news retellings aren’t immune from copyright scrutiny.

And in an era where AI companies want to position their models as replacements for traditional news consumption and search engines, the question of when a summary becomes a substitute—and when a substitute becomes infringement—is only going to get more pressing.

As always, I’d love to know what you think. Drop a comment below or @copyrightlately on social. And if you prefer verbatim reproductions over “substitutive summaries,” here’s a complete copy of Judge McMahon’s ruling in Cohere.
