Two years ago, a pair of 22-year-old friends who met in high school in Michigan found themselves sitting inside Tsinghua University’s brain lab in Beijing, staring down a multimillion-dollar offer from Elon Musk.
The two had just done something unusual for the moment: they built a small large language model (LLM) trained not on massive internet data dumps, but on a tiny, carefully chosen set of high-quality conversations. And they taught it to improve itself using reinforcement learning (RL), a technique where a model learns the way a person or animal does: by making decisions, receiving feedback, and then refining behavior through rewards and penalties.
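In rough terms, that feedback loop can be sketched in a few lines of code. The toy example below, with made-up replies and rewards, only illustrates the reward-and-penalty idea; it is not OpenChat's actual training pipeline, and every name in it is invented for the illustration.

```python
# Toy sketch of the reward-and-penalty loop behind reinforcement learning.
# Illustrative only: real RL for language models is far more sophisticated.
import random

# A hypothetical "policy": weights for picking each canned reply.
replies = ["helpful answer", "vague answer", "off-topic answer"]
weights = [1.0, 1.0, 1.0]

def reward(reply: str) -> float:
    """Stand-in for human or automated feedback on a reply."""
    return {"helpful answer": 1.0, "vague answer": 0.2, "off-topic answer": -1.0}[reply]

learning_rate = 0.1
for step in range(1000):
    # 1. Make a decision: sample a reply in proportion to current weights.
    total = sum(weights)
    probs = [w / total for w in weights]
    idx = random.choices(range(len(replies)), weights=probs)[0]

    # 2. Receive feedback as a reward or penalty.
    r = reward(replies[idx])

    # 3. Refine behavior: nudge the chosen reply's weight by the reward.
    weights[idx] = max(1e-3, weights[idx] + learning_rate * r)

# After many rounds, "helpful answer" dominates: behavior shaped by rewards.
print({reply: round(w, 2) for reply, w in zip(replies, weights)})
```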
At the time, almost no one was doing this with language models. The only other group exploring RL for LLMs was DeepSeek, the Chinese OpenAI competitor that would later terrify Silicon Valley.
The two students, William Chen and Guan Wang, called their model OpenChat, and they open-sourced it on a whim.
To their shock, OpenChat blew up.
“It got very famous,” Chen told Fortune. Researchers at Berkeley and Stanford pulled the code, built on top of it, and began citing the work. In academic circles, it became one of the earliest examples of how a small model trained on good data, as opposed to more data, could punch above its weight.
Then it landed somewhere Chen never expected: Elon Musk’s inbox.
Musk reached out by email through xAI, at the time his brand-new company, offering to recruit the students with a multimillion-dollar pay package, Chen says. It was the kind of offer young founders dream of.
They hesitated. Then, they turned it down.
“We decided that large language models have their limitations,” Chen said. “We want a new architecture that will overcome the structural limitation of [large-scale machine learning].”
Instead of taking the deal, they left the comfortable momentum of OpenChat behind and pursued something far more ambitious: a “brain-inspired” reasoning system they believed could outperform current AI models.
That decision would lead, two years later, to Sapient Intelligence, and to a model that outperformed some of the world’s biggest AI systems on tests of abstract reasoning. They are confident their model will be the first to achieve “AGI,” or artificial general intelligence, the so-called holy grail of AI research in which a machine’s intelligence matches or surpasses that of a human at any cognitive task.
Between the two worlds of the arms race
Chen’s path to turning down Musk didn’t begin in Beijing, but in Bloomfield Hills, Michigan, and with a childhood obsession that drove his parents crazy.
“When I was young, I would break things apart and never put them back together,” he said. “That’s what got me started.”
Chen was born in China, raised partly in San Diego and Shenzhen, and eventually sent to attend Cranbrook Schools, a prestigious private boarding school in Michigan, around the time he met Wang, a boy his age who attended a different school but had an equally unusual obsession.
On the first day they met, the two fell into a long conversation about what Chen calls their “metagoals,” the ultimate purpose of their lives.
For Wang, that metagoal was AGI, long before the term became popular. In high school, before the terminology existed, he described it as an “algorithm that solves any problem.” Chen’s metagoal was different but complementary: optimizing everything, from engineering problems to real-world systems.
“It was an instant alignment,” Chen said.
Today, the two still ask every single person they hire what their metagoals are.
Chen founded the school’s drone club, petitioned administrators to let students fly quadcopters on campus, and spent hours tinkering in robotics labs. The two were the kids who stayed late, broke hardware, and kept experimenting.
“It was a great time,” Chen said.
When college admissions rolled around, Chen was accepted to Carnegie Mellon and Georgia Tech — the obvious, prestigious paths for a gifted robotics student. Wang, meanwhile, had been admitted to Tsinghua University, China’s elite engineering powerhouse, often described as “China’s MIT.”
Chen visited the Beijing campus, toured the labs, and made a decision few American high schoolers would: He followed Wang to Tsinghua.
The transition wasn’t easy. The coursework was intense, and the two struggled, even flunking some classes.
“Most of the Chinese kids are really — I hate to be stereotypical — but they’re really good at studying,” Chen laughed. “They’re really sharp.”
Still, he was surprised by how supportive his professors were once they learned what he and Wang were building.
“They were like, ‘Hey, I know this thing you’re trying to make — it’s a very good thing. I actually believe in the concept of AGI,’” he said.
By then, nearly everyone in Tsinghua’s Brain Cognition and Brain-Inspired Intelligence Lab knew what the two undergraduates were attempting: a new approach to machine intelligence that challenged the dominant assumptions of the field.
A 3 a.m. breakthrough
It was in Tsinghua’s brain lab that they developed the Hierarchical Reasoning Model (HRM), the architecture they believe can surpass transformers entirely.
If OpenChat was their proof of concept, HRM was the moonshot they had been building towards. And the moment it proved itself came, appropriately, in the dead of night.
Early one morning this June, at 3 a.m., Chen and Wang stared at the benchmark results returned by their small experimental model. Their tiny HRM prototype, just 27 million parameters and microscopic compared with GPT-4 or Claude, was outperforming systems from OpenAI, Anthropic, and DeepSeek on tasks designed specifically to measure reasoning.
It solved Sudoku-Extreme, found optimal paths through 30×30 mazes, and achieved startlingly high performance on the ARC-AGI benchmark, all without chain-of-thought prompting or brute-force scaling.
“It was crazy,” Chen said. “Just with a change in the architecture, it gave the model a lot of what we call reasoning depth.”
Unlike a transformer, which predicts the next word based on statistical patterns, HRM uses a two-part recurrent structure modeled loosely on how the human brain mixes slow, deliberate thought with fast reflexive reactions. The system can plan, dissect problems, and reason using internal logic rather than imitation. “It’s not guessing,” Chen said. “It’s thinking.”
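Chen describes the design only in broad strokes, but the slow-and-fast idea can be sketched schematically. The snippet below pairs a fast recurrent module with a slower one that updates less often and digests the fast module's work; the module names, sizes, and update rules are assumptions made for illustration, not Sapient's published architecture.

```python
# Schematic of a two-timescale recurrent loop: a fast "reflexive" module runs
# many steps, then a slow "deliberate" module updates once and resets it.
# All details here are illustrative assumptions, not Sapient's actual code.
import numpy as np

rng = np.random.default_rng(0)
d = 16                                         # hidden size for both modules
W_fast = rng.normal(scale=0.1, size=(d, d))    # fast, reflexive module
W_slow = rng.normal(scale=0.1, size=(d, d))    # slow, deliberate module
W_in = rng.normal(scale=0.1, size=(d, d))      # shared input mixing

def step(state, context, W):
    """One recurrent update: mix a module's own state with outside context."""
    return np.tanh(W @ state + W_in @ context)

x = rng.normal(size=d)        # encoded puzzle or problem input
slow = np.zeros(d)            # high-level "plan" state
fast = np.zeros(d)            # low-level working state

for cycle in range(4):                # a few slow planning cycles...
    for _ in range(8):                # ...each containing many fast steps
        fast = step(fast, slow + x, W_fast)
    slow = step(slow, fast, W_slow)   # slow module absorbs the fast module's result
    fast = np.zeros(d)                # fast module restarts under the new plan

print(np.round(slow[:4], 3))          # the final plan state would feed an output head
```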
Chen says their models hallucinate far less than traditional LLMs and already match state-of-the-art performance in time-series forecasting tasks like weather prediction, quantitative trading, and medical monitoring.
They are now working on scaling HRM into a general-purpose reasoning engine, with a simple but radical thesis: that AGI won’t come from bigger transformers, but from smaller, more efficient architectures. Today’s frontier models are massive, in some cases hundreds of billions of parameters, yet even their creators admit they struggle with reasoning, planning, and multi-step problem decomposition, Chen said.
He believes that limitation is structural, not temporary.
“You can stack more layers,” he says. “But you’re still hitting the limits of a probability model.”
Sapient is now preparing to open a U.S. office within the next month, raise additional funding, and possibly change its name as it begins deploying the second version of its model. The founders believe continuous learning, the ability for a model to absorb new experiences safely without retraining from scratch, is the next major frontier.
“AGI is the holy grail of AI,” Chen says. And he expects it to emerge in the next decade.
“One day, we’re going to have an AI that’s smarter than humans,” Chen said. “Guan and I always say it’s like Pandora’s box: if we’re not going to make it, someone else will. So we hope that we’re going to be the first one to make that happen.”
