When Nathan Sobo set out to build the Zed code editor, he had a simple but ambitious goal: Make it fast and fun.
Zed Industries (Zed, for short) is building what it expects to be the world’s fastest text editor, designed for hyper-responsive and collaborative coding. It offers native debugging support in multiple languages, agentic editing and edit prediction, with more features planned on the roadmap.
The Zed code editor and its AI capabilities are all open source, including Zeta, the open large language model (LLM) that powers Zed’s edit prediction.
Performance Is Key
“You hit a key or interact with the app in any way, and we have pixels available for you with zero perceptible lag,” Sobo told The New Stack. “Zero perceptible lag means we need it on the next refresh of your monitor.”
That kind of performance seemed impossible in an era when most code editors are built on web technologies. But Sobo, who previously led GitHub’s Atom project, knew exactly why that approach would not work. “You’re not going to do that with JavaScript ever,” he said bluntly.
Yet, three years after founding Zed Industries in 2021, the company has built a Rust-powered IDE that’s attracting tens of thousands of developers with its combination of native performance and real-time collaboration features. But as AI capabilities became “table stakes” for developer tools, Zed faced a new challenge: delivering AI-powered code completion that matched its zero-latency philosophy.
The solution came through a partnership that transformed its inference infrastructure in just one week, Sobo said.
The AI Integration Challenge
By 2024, Zed had built a solid foundation for the company’s performance-first editor, but it knew AI features were becoming essential. “New features coming from language models and the revolution in language models are becoming pretty much table stakes and expected as part of the experience of what it means to write software,” Sobo told The New Stack.
The company developed “edit prediction” — an AI feature powered by its Zeta model that anticipates what developers want to do next and suggests changes in real time. Built on the Qwen series of open source models, the system uses a technique called speculative decoding to optimize for the specific case of code editing.
“The idea of an edit implies most of the text is unchanged, but a few parts might be changed,” Sobo explained. “What speculative decoding does is it says every time we get a run of tokens out of the model that matches something that’s in the input, why don’t we just assume that we’re going to keep matching it a little bit longer, and run ahead.”
However, with its launch set for just a week away, Zed’s existing inference provider was not delivering the performance the company needed, Sobo said.
They were missing critical targets: P90 latency under 500ms and P50 under 200ms. Even worse, the provider offered limited compute capacity, no multiregion support, and what Sobo calls a “black box” approach that conflicted with Zed’s open source values.
“They were very kind of turnkey in nature,” Sobo said.
With open source as one of its core values, Zed wanted more visibility into what was driving its model performance so the team could grow its own expertise and find iterative improvements, he noted.
Engineering-First Partnership
Enter Baseten, an inference platform provider that took a different approach. Instead of offering a hands-off service, it assigned forward-deployed engineers directly to Zed’s problem. Within days, the team had tested over 75 different performance optimizations.
“Baseten showed up with an outstanding level of engagement,” Sobo said. “They led with engineering and led with competence. … Watching them traverse the curve from where they were, which is like, ‘Hey, I love you guys as people, I love how you’re showing up, but the number isn’t good,’ to ‘trust us, we will make this number, we’ll move it where it needs to be in time for your deadline.’ And then seeing them do that, that’s pretty dope.”
The technical solution included TensorRT-LLM in place of vLLM as the inference framework; KV caching and custom-tuned speculative decoding to massively reduce latency; lookahead decoding for higher throughput; and multicloud capacity management with geo-aware routing. The team also custom-tuned autoscaling settings to ensure optimal resource utilization while maintaining low latency.
“Getting their direct, hands-on help actually getting this model that we developed deployed doing speculative decoding was super key,” Sobo noted. “We’re very CPU people. When we touch GPU, it’s to run graphics shaders, right? Getting this model running on hardware — that was not our core competency at all.”
The Results: 2x Performance Improvement
The partnership delivered results that exceeded Zed’s initial goals:
- 45% lower P90 latency
- 3.6x higher throughput
- 100% uptime
- Over 2x faster edit prediction compared to its previous provider
In addition, the migration was seamless, Sobo said. Because Baseten maintained OpenAI compatibility, Zed moved all its traffic over within a single day with no code changes required.
Moreover, the performance improvements did not stop at launch. Baseten’s team continued iterating, eventually shipping a custom “Baseten Lookahead” decoding method that shaved hundreds of additional milliseconds off prediction times.
Beyond Performance: A Philosophy Match
For Sobo, the technical achievements were only part of the story. The partnership was also a lesson in how companies should work together.
“You have this finite amount of time on this planet. And who do I want to actually spend that time interacting with?” he said. “These guys just showed up in a way that made me feel like I like these guys, like I actually want to do business with them. And that was the overriding concern.”
This philosophy extends to how Sobo thinks about the entire developer tools landscape. While Zed competes directly with VS Code for individual developers, Sobo has even bigger ambitions: transforming how software teams collaborate.
“Git is literally a tool that Linus Torvalds designed to manage patches that were being mailed to the Linux kernel mailing list,” he pointed out. “It’s literally a tool designed around email.”
Zed’s vision involves real-time, fine-grained collaboration that goes far beyond traditional commit-based workflows, Sobo said. For instance, imagine highlighting code and instantly connecting with whoever wrote it, or having persistent conversations that survive code changes and refactoring, he added.
“So, the vision is long-term to eat into the software collaboration market, but the beachhead we’re trying to get is just making developers love using our tool once we’ve claimed that real estate,” he added.
The Rust Foundation
None of this would be possible without Zed’s foundational choice to build in Rust. When Sobo started the project in 2018-2019, he saw no alternative for the kind of system-level performance he demanded.
“You need a systems programming language to implement something of the complexity of an editor if you want it to actually be responsive,” he explained.
“So, with Zed, after the experience with Atom and building on web technology, building a web page masquerading as a desktop app — which is what all these Electron apps are — I knew that web tech was never going to get me to the level of responsiveness that I really wanted.
“You’re not going to do that with JavaScript, ever. Like, you need the control over the memory, multithreading, shared memory across multiple threads, and what Rust offers is the ability to do all of that while maintaining a lot of the productivity advantages of some of these more managed languages, like JavaScript — slower to develop than TypeScript, I’m not going to lie, but the result runs at a speed that would not be obtainable. And it’s definitely faster to develop Rust than C++ or C for us, it rules out large categories of things that could go wrong statically.”
Zed has an open source code base with lots of people contributing “all over the place,” Sobo said.
And “Doing that in a world where you can implement a use-after-free or there’s no compiler support for avoiding some of these very difficult-to-debug, potentially very dangerous [bugs] … would have rendered Zed, with a small team, impossible to build, whereas Rust made it possible.”
AI Without Intellectual Laziness
As AI tools become more powerful, Sobo has developed views on how to use them effectively. He said he is a heavy user of AI for coding, often switching between models for different tasks — using lighter models for codebase exploration, then switching to more powerful models for complex implementation work.
But he’s wary of what he calls “intellectual laziness.”
Said Sobo: “If you could literally enter three prompts and do the dishes while this thing’s vomiting this thing out, is it that valuable? Like, do you even want this? Or what can you learn from this and then literally throw it in the trash and start again.”
This philosophy reflects his approach to building Zed itself — use every tool available but never lose sight of the goal of building better systems.
What’s Next?
Meanwhile, with its AI infrastructure optimized and a growing user base, Zed faces the challenge of any developer tool: feature completeness. The editor still lacks Windows support, which has generated constant requests from users. And the company is working to expand the tool’s debugging capabilities and other IDE essentials, he said.
Still, Sobo said he remains focused on the longer-term vision of transforming developer collaboration. The company’s recent work has revived early collaboration features, with plans for persistent code permalinks, real-time shared editing and what Sobo calls “metadata layers” on top of code that can capture human and AI conversations.
“There’s this sense of, like, the code is where we hang out together, kind of like people that are friends on ‘World of Warcraft,’ like they hang out together in their whatever ‘World of Warcraft’ universe,” Sobo said. “I’ve never played it. For me, my ‘World of Warcraft’ universe is the Zed code base, where I hang out with my friends and talk about the code, talk about the changes,” he said, describing how Zed’s distributed team already works. The goal is to scale that experience to the broader developer community.
The Baseten partnership proved that when it comes to performance-critical infrastructure, engineering-first relationships can deliver transformative results.