Gemini Has 650M Users. Now What?
A Conversation with Google DeepMind’s Logan Kilpatrick
The launch of Google’s Gemini 3 Pro last month marked an inflection point in the AI race.
Gemini surpassed 650 million monthly active users and now leads on most performance benchmarks. This progress reportedly triggered a “code red” at OpenAI, a renewed push to improve ChatGPT, and the release of GPT-5.2 last week.
To make sense of this moment, I spoke with someone who’s worked inside both of the companies that now sit at the center of AI’s biggest rivalry.
Logan Kilpatrick is a Product Lead for Google’s AI Studio and the Gemini API, helping translate Google DeepMind’s AI research into tools used by millions of developers worldwide. Before Google, he ran Developer Relations at OpenAI. He is also an active angel investor in AI-native companies, including Cursor and Cognition.
In our conversation, we dig into:
Where Google sees its edge in the model race.
How code generation is unlocking product extensibility – and why every startup now needs to be “code-adjacent.”
Where token economics actually favor startups over big tech.
01 | “We want to be everywhere.”
EO: The latest Gemini release highlights two current advantages for Google – (i) model performance, especially in complex reasoning and multimodal understanding, and (ii) distribution, which is arguably more important.
You all have done a phenomenal job integrating Gemini across existing products like Search, Workspace, and Cloud, plus new ones like your developer studio and API, and the new IDE and agent builder Antigravity. It’s an elegant experience working across them.
Looking at the next 2-3 years, how does Google want to position itself against competitors like OpenAI and Anthropic, especially within developer networks?
Will this remain a direct, head-to-head competition across almost everything? Or do you see some natural specialization over time?
LK: I think there are definitely things we’ll do that are unique, at least from a model perspective. Multimodal is a great example of this. It’s been a focus since the first Gemini release, and it’s where we really have state-of-the-art capability.
For example, with Gemini 3 Flash we now have something called visual thinking. We’ll talk more about it in January. But it lets you combine code execution with multimodal understanding to preprocess images before analyzing them.
So if you upload an image with poor contrast, and the model is struggling to interpret it, the model can automatically write Python code to adjust the hues or lighting, then re-interpret that corrected version.
That’s one example, but there’s a whole set of interesting capabilities that stem from multimodal understanding, which has been a core advantage for Gemini. That’s all now amplified.
We’re also getting really good at code and agentic tool use. And many other things.
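To make that image-preprocessing idea concrete, here is a minimal sketch of the kind of code the model might write and run for itself – the filename and adjustment values are illustrative, and it uses Pillow rather than any Gemini-internal tooling:

```python
# Illustrative only: the sort of contrast/brightness fix a model could
# generate and execute before re-reading a hard-to-parse image.
from PIL import Image, ImageEnhance

img = Image.open("uploaded_image.png")          # hypothetical user upload

# Boost contrast and brightness so faint details become legible.
img = ImageEnhance.Contrast(img).enhance(1.8)   # 1.0 = unchanged
img = ImageEnhance.Brightness(img).enhance(1.2)

img.save("corrected_image.png")                 # re-analyzed in a second pass
```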
But what about competition with the other model labs?
These foundation models are inherently general. Because of that, you will continue to see head-to-head competition.
Companies are maybe carving out some niches as they go. The Anthropic models have historically been good at code. The OpenAI models have been good at chat. The interesting thing for Google is that we have such a wide set of products. The same model we build for Search is the same model we build for Google Cloud, which is the same model powering parts of the Waymo experience.
A wide range of products means a wide range of customers. And my hope is that we lead in generalization.
You’re seeing that today, and I think you’ll continue to see that in the future – especially as we sim-ship these models across more and more products with each new release. That strategy of tight product integration works well for us.
I was talking to Koray [DeepMind’s CTO] earlier today about this… we’re now seeing much deeper collaboration between our products and models. Historically, DeepMind was mostly research. But we’ve turned a corner. There’s now a lot more collaboration between DeepMind and products like search, the Gemini app, etc.
That was the most impressive part of the launch, for me. The intelligence capabilities felt very native. They enhanced the products I use every day, without changing the overall experience. I’m excited to see how that evolves.
But if I’m hearing you right, it sounds like it will be direct competition across the board. You don’t see a narrative where, say, Anthropic starts to own the enterprise segment, and Gemini owns consumer, because of the full-stack integration with Search and G Suite. You don’t see any of that?
Not for us. We want to be everywhere, and show up for everyone. That’s the challenge for Google, given our breadth and the number of use cases we touch.
Just take the developer use cases. AI Studio and the Gemini API have grown 20-30x over the last year. It’s becoming a substantial business.
The same is true with Cloud. Google Cloud is the sixth largest enterprise business in the world. We’re not ceding enterprise to Anthropic.
What I’d love to see – and Nano Banana [Google’s AI image generator and photo editor] is a great example of this – is how that underlying edge in multimodal understanding translates into new, state-of-the-art capabilities like image generation and editing.
There’s this cool interplay as you let an advantage like that play out, and see how it manifests into new capabilities and new products.
02 | “Every startup is now code-adjacent.”
I want to test a hypothesis with you: that we’re shifting from a phase of experimentation to one of inference – from “can AI do this?” to “how best to architect these AI systems?”
For example:
Developers are moving from using AI for supplemental tasks (query-response code generation, single API calls) to systems design (building agents with more persistent context and multi-step planning).
Backends are now optimized for AI agents as users, not just serving human-facing applications.
Autonomous DevOps is creeping into the conversation – engineers want to delegate more configuration and deployment decisions to AI (though still early here).
Does that match what you’re seeing? How would you characterize this moment? Anything in your data on developer trends that is especially counterintuitive or surprising?
One top-level trend – we are in what I call the “LLM 2.0” era. If you look at how early-stage startups were building products a year and a half ago – versus the last six months – it looks fundamentally different.
Historically, to get the models to be useful, you had to do all this scaffolding work to ensure that the model had the right guardrails and configuration.
But now lots of companies – especially startups – are ripping out the things they built a year ago and starting from scratch. The model capability is now so good that what you need, what success looks like, and where you’re eking out the performance gain over the base model – that’s all fundamentally different than before.
So at the meta-level, that’s the largest transformation I’m seeing.
But what specifically is different? What’s driving companies to rip and replace?
Our customer base in AI Studio and the API is predominantly startups. With those companies, everything is now agent-first.
The other big shift is code. Startups outside of developer tools are now making code generation a core capability, a key value driver inside their product. There’s this interesting trend where almost every startup needs to be “code-adjacent” – because the capability of writing code is so foundational, so applicable across every use case.
When you say “code-adjacent,” what do you mean?
I’ll use a random example. A year ago, a product for financial planners wouldn’t involve any code generation. Today, if you’re building that product from scratch, there’s code being generated behind the scenes – agents writing custom scripts for each planner, each workflow.
And consumer is next. You wouldn’t expect consumers to want code. But actually, cutting-edge consumer products are now generating software on demand, based on what the user wants.
You’re already seeing this behavior in the Gemini app, with generative UI. Code is becoming the underlying mechanism for delivering personalized information and experiences across every product category. That wasn’t true 12 months ago.
So you’re saying the extensibility of products is orders of magnitude better now, and code generation is what’s enabling that?
Exactly.
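As a rough illustration of what “code behind the scenes” can look like, here is a minimal sketch using the google-genai Python SDK. The model name, prompt, and workflow are assumptions for illustration, not a description of any particular product:

```python
# Sketch: an app asks the model to generate a bespoke script for one
# user's workflow; the app can then review and run that script itself.
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

prompt = (
    "Write a small Python script that loads a client's monthly transactions "
    "from transactions.csv, groups spending by category, and prints the three "
    "fastest-growing categories quarter over quarter."
)

response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumed model id for illustration
    contents=prompt,
)

print(response.text)  # the generated script, ready for review before running
```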
03 | “We’re going after foundational capabilities.”
Two years ago, people dismissed LLM “wrappers” as too brittle to create real, lasting value.
Today, the consensus is that AI applications actually need proprietary architecture on top of foundation models to work – subsystems for reasoning, perception, memory, and execution.
Across these different subsystems, what are the biggest tooling or middleware gaps? Where are developers hitting walls that suggest the infrastructure just isn’t ready?
For example, I’m still hearing about friction in observability and traceability, weak context management, and limited support for cost control.
Of these, are there any big opportunities for third-party or open-source tooling, things that companies like Google won’t or can’t build themselves?
The observability piece is interesting. There’s been a lot of investment and companies taking a swing at solving that problem. Maybe none of those products have solved the problem yet, but there are a lot of people trying.
Well, the hard thing is that these models are non-deterministic. There’s no easy way to trace why they made a particular choice. It’s inherently hard to see what’s going on under the hood.
That’s true. And for us, the question of where we will and won’t go… we’re trying to raise the floor for everyone. We’re going after foundational capabilities.
RAG is a good example. In 2024, every developer was spending tons of time on RAG. It was top of mind. So about six weeks ago, we launched a tool that lets developers upload files and query them instantly – no need to build a RAG pipeline yourself.
That works for 90% of developers. Of course, there will always be advanced scenarios where you need to customize and turn every knob. But for most use cases now, you don’t.
That’s emblematic of the categories we’ll go after – foundational components, where Google has a unique advantage from a scale perspective, and we can raise the floor for everyone.
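For context on that upload-and-query flow, the Gemini API’s Files API already supports a simple version of it – upload a document, then reference it directly in a prompt. A minimal sketch (the file name and question are placeholders, and this is not necessarily the exact tool Logan is describing):

```python
from google import genai

client = genai.Client()  # assumes GEMINI_API_KEY is set

# Upload the document once; the service stores it and returns a handle.
doc = client.files.upload(file="quarterly_report.pdf")

# Ask questions against the file directly -- no chunking, embeddings,
# or vector database to stand up yourself.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[doc, "What were the main drivers of revenue growth?"],
)
print(response.text)
```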
You also mentioned context management… I think there’s something really interesting there. Deep Research is a good example of how we’re starting to address that.
Part of why I love that product (and we have it in the API now for developers who want to build on top of it) is it handles a lot of context engineering for you.
End users don’t want to think about how to get the right information from Point A to Point B so the model can answer their question. They just want to ask a vaguely formed question and have the model go find the right context to answer it intelligently.
That’s what Deep Research does. It searches the web, but it also connects to Drive and other services. The user doesn’t have to think about framing the question perfectly or making sure the model has enough context. The agent figures that out.
That’s what I’m most excited about, and I think there are a lot of unique opportunities to make that work better across more products.
04 | “There’s a real advantage to being small.”
You’re an active angel investor in AI-native startups. What’s your theory for where value will accrue for new companies (versus incumbents) as the AI stack matures?
Maybe said differently, if you had to focus capital on one or two startup categories you think will outperform the broader early-stage market, what would they be?
I’m most curious about the underlying rationale. What do you think gives a startup a product edge in those domains over big tech?
I’ll note that I invest outside my capacity as a Google employee.
But the underlying truth of startups – and this isn’t a unique perspective – is that value accrues at the frontier. Larger companies are complex systems, focused on many different things. The advantage startups have in the AI era is the ability to move as fast as humanly possible and go after use cases that don’t quite work yet.
I was on a show earlier today, and the guest before me was the CEO of Wabi, a personal software creation platform. Six months ago, you couldn’t build that product. The models weren’t good enough at generating code. Now they are. This company was taking shots on goal, but now, all of a sudden, their product works and they can bring it to the world.
But is there a category that stands out for this moment? Over the next three years, for example, are you most excited about vertical AI, horizontal workflow automation, creative tooling, or something else? What do you think is the frontier right now, given model capability?
I like the Wabi example because this new way of generating code to solve bespoke problems is fascinating.
There’s another point I want to make, which is that in many AI-native companies, where there’s deep product customization, the economics actually favor startups. Can Google reasonably afford to put personal software into every product, for every customer, all at once? Maybe technically. But at our scale, it’s hard to deploy something that token-intensive across hundreds of millions of users.
Startups get to start small. You have 10,000 users, you build something great for them, you build momentum, and keep raising to fuel that growth. For expensive use cases like personal software – where you’re generating a lot of code per user – there’s a real advantage to being small. You get to take shots on goal without needing massive infrastructure from day one.
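A back-of-envelope comparison makes the point, using the interview’s own user counts. The per-user token volume and price below are purely hypothetical assumptions, not Gemini pricing:

```python
# Hypothetical inputs -- chosen only to show how scale changes the bill.
tokens_per_user_per_month = 2_000_000   # heavy per-user code generation
price_per_million_tokens = 0.50         # assumed blended $/1M tokens

def monthly_cost(users: int) -> float:
    return users * tokens_per_user_per_month / 1_000_000 * price_per_million_tokens

print(f"10,000 users:      ${monthly_cost(10_000):,.0f}/month")       # ~$10 thousand
print(f"650 million users: ${monthly_cost(650_000_000):,.0f}/month")  # ~$650 million
```

The same per-user workload that is a rounding error for a 10,000-user startup becomes a nine-figure monthly bill at Gemini’s scale.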
I hadn’t thought about the cost structure advantage that way – that’s interesting.
I’ve been most excited about the moment when non-technical users can build complex software.
There have always been two separate categories – users who know what they want, and the people who can build it for them. Once those categories dissolve, the innovation potential feels very big.
I think it’s happening right now, which is exciting.
We’re on our way! On that note, I’ll let you hop. Thanks for your perspective.
My pleasure.


