Below the Waterline
A Conversation with TextQL Founder Ethan Ding
Product quality used to express itself on the surface, through aspects that were visible to the end user – an intuitive interface, an elegant interaction model, a clean API.
But AI products are like icebergs. Product taste now shows up below the waterline, further down the stack. For example, in how context is structured and retrieved, or how agents are fine-tuned and guardrailed.
That’s why two competing products can feature the exact same interface and tap into the exact same LLM. Yet one feels like magic. The other feels useless.
Ethan Ding thinks a lot about how product needs to be built below the waterline.
Ethan is the CEO of TextQL, an AI-native analytics platform used by large companies like Dropbox and Blackstone.
Traditional business intelligence requires specialized teams to pipe massive amounts of data into a warehouse, then run SQL queries and build dashboards on top. TextQL skips all this. The product connects directly to wherever data already lives, letting any business user ask questions over that data in plain English and get accurate answers in real-time. No migration, no manual SQL, no waiting on the data team.
Ethan also publishes his thinking, with sharp, unvarnished takes on trends in business and technology. He has a knack for connecting the dots between technical architecture and business strategy – how a system’s engineering dictates its unit economics, and how those economics shape competitive positioning.
Ethan and I volleyed about the fate of the app layer, what product excellence looks like at this moment in technology, what enterprise buyers really care about, and what that implies for the demand curve in AI.
01 | “Every AI app is in a race against time.”
EO: Let’s start with your recent essay “Cursor’s Warchest, xAI’s Redemption.” It raises interesting questions that many AI companies will eventually need to wrestle with.
You argue Cursor got squeezed by its model supplier. Anthropic was charging Cursor more per token than it was charging its own users for Claude Code. That put pressure on Cursor’s margins. So their only viable path was to sell to a sponsor with the compute resources to subsidize their inference. In this case, xAI.
I agree with that for Cursor specifically. But then you go further: “the application layer of ai doesn’t get champions. it gets wards. and every ward at scale ends up with a sponsor whose name is on the cap table when you decide independence isn’t underwritable.”
Is the implication that every AI app is beholden to its model provider?
ED: Not universally. This applies to apps where COGS represent a large share of revenue at scale. Coding apps like Cursor or Cognition, for example.
These companies have distribution and user love, but rent their inference. The hyperscalers and model labs have the land and metal to sell that inference, but often lack the intense user pull around their AI products.
Each side has what the other needs, which creates a strong incentive to consolidate.
Can you put your company, TextQL, into that framework? You own the agent harness and execution environment, but you don’t own the LLMs or data warehouses that power your product.
I think margins in AI come from two things – low COGS as a share of revenue and high switching costs.
TextQL sells primarily to very large enterprises, like health insurance providers and publicly traded banks. That means our most valuable asset isn’t necessarily the product. It’s the relationship with the CIO.
CIOs at these mega companies build relationships with two or three new startups a year, and vendor turnover is very low. Our strategy is tilted toward being a P0¹ in these massive companies. Once we’re in, high switching costs protect our margin.
So the fate of an AI app hinges largely on switching costs, how easy it is to substitute out the product?
A bit, yeah. Right now, every AI app is in a race against time.
AI apps with short procurement cycles, like Cursor or Lovable, are racing for scale. They need to get enough revenue, quickly, to access a cost of capital that lets them trade pound-for-pound with the hyperscalers. Anthropic ran that race and cleared it.
Long-procurement players, like us, are racing for mindshare. There are only five hundred Fortune 500 CIOs. Own enough of their attention, and you get bargaining power and durability.
Does AI make it harder to win mindshare? There are so many more products now...
Certain forums are noisier, like LinkedIn, G2, and Product Hunt.
But CIOs aren’t buying off LinkedIn. They go to three vendor dinners a month. Two are with existing 9-figure partners. The third happens off a vibe – someone gets my Substack in front of them, we catch them at a conference with a 15-second pitch that hits a strategic priority, or someone they really trust burned their once-a-decade recommendation card to shill for us.
So what’s your 15-second pitch, something that can fire up a CIO to want your AI?
Something to the effect of… I know you’re in the middle of a nine-figure deployment with Capgemini, or some GSI, to take a legacy system like Teradata or DB2 to the cloud. It’s an insane amount of work. Every day the thing is on fire. I can give you an off-ramp… data analytics without the vendor lock-in.
With AI driving up IT spend, CIOs are now terrified of vendor lock-in. Every CIO is looking at projected inference costs and asking – where will that money come from? They’re thinking about alternatives across every major line item, so they’re not beholden to one inference provider or data provider.
But functionally, your product sits on top of warehouses like Snowflake and Databricks. So how do you reduce reliance on these platforms?
Today, we’re the only product that can load a million rows each, in the same session, from different warehouses like Snowflake, Databricks, and ClickHouse, or directly from systems of record like SAP and Salesforce.
If your data lives in Salesforce or SAP, you don’t need to migrate it to a warehouse to run analytics. We can query those systems directly, run the ETL² on the fly, and return answers. No need to pay a separate data vendor just to utilize that data.
If your data is already in a warehouse, we move the analytical work from their engine to ours. You still pay a Snowflake for storage, but we replace the query and ETL layer, which is where customers end up spending a lot of money.
02 | Keep context simple
Many AI apps pitch the context layer as a defensible part of their product. But under the hood, I think most of the architecture being built is largely undifferentiated, and pretty easy to rip and replace.
Context, what you call ontology, is a component of TextQL's product. Walk me through how you built and think about this layer.
Candidly, I don’t know what most startups mean by “context layer.”
It’s usually either markdown files paired with good agentic search,³ or a knowledge graph built on something like Neo4j.⁴
I think graph databases for memory are a Ponzi scheme. People like them because they visualize memory. But that’s not useful, it’s fake work porn.
Our version of context is a Git-controlled, RBAC directory with two file types – markdown and post-processed JSON.⁵ In other words, just text files, in a version-controlled folder, with role-based permissions dictating who can edit what.
That’s the whole architecture.
Why JSON? Business analytics needs one source of truth for what every metric means – for example, what counts as churn, how to aggregate it, what to exclude. Every BI tool – Tableau, Looker, dbt – has its own format for doing that. We built one that unifies all of them into a single, compact representation our model can read.
RBAC matters because the same metric means different things to different people. Finance defines “revenue” under GAAP. Sales might define it on a consumption basis. The agent has to know whose definition to use for any given query.
We built it this way so agents can natively read, modify, and save the files, encoding what they learned so the next similar query runs faster. Changes are reviewable and reversible by users or the relevant business owner.
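As a rough illustration of the idea (not TextQL's actual file format), a role-scoped metric definition could look like the sketch below. The JSON field names (`default`, `overrides`, `editors`), the column names, and the helpers `resolve_metric` and `can_edit` are all hypothetical, chosen only to show how one file can carry several definitions of "revenue" plus a rule for who may change them.

```python
import json

# Hypothetical post-processed metric file; every field name here is illustrative.
METRICS_JSON = """
{
  "revenue": {
    "default": {"aggregate": "sum", "column": "recognized_amount", "basis": "GAAP"},
    "overrides": {
      "sales": {"aggregate": "sum", "column": "consumed_amount", "basis": "consumption"}
    },
    "editors": ["finance"]
  }
}
"""


def resolve_metric(metrics: dict, name: str, role: str) -> dict:
    """Return the definition of `name` that applies to the asking role."""
    entry = metrics[name]
    # Fall back to the default definition when the role has no override.
    return entry["overrides"].get(role, entry["default"])


def can_edit(metrics: dict, name: str, role: str) -> bool:
    """Role-based check: only listed roles may change a metric's definition."""
    return role in metrics[name]["editors"]


metrics = json.loads(METRICS_JSON)
finance_def = resolve_metric(metrics, "revenue", "finance")
sales_def = resolve_metric(metrics, "revenue", "sales")
print(finance_def["basis"], sales_def["basis"])
```

Because the file is plain text, a change to any definition would show up as an ordinary Git diff that the relevant business owner can review or revert.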
Do you think this context architecture is a source of defensibility for your product, or just table stakes?
What it does is make each query faster and more token-efficient than the last.
We work out-of-the-box, but the first run is slow and burns a lot of tokens. The first time our agent hits a fresh database, it explores from scratch – listing every table and column, then running through them one by one to find the right answer. Our agent might make 30 or 50 tool calls.
Next time a similar question comes in, the relevant tables, columns, and metric definitions are already encoded in our ontology. Over time, that documented knowledge drives down cost per request.
And as cost per request goes down, we tend to see usage volume grow. More users start asking more questions.
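The cost curve described here can be sketched with a toy model. It assumes, purely for illustration, that every schema inspection counts as one tool call and that the saved context is a simple lookup from a question topic to the table that answered it. The `Ontology` class, the `answer` function, and the sample schema are hypothetical stand-ins, not a description of the real agent.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Stand-in for saved context: question topic -> table that answers it."""
    known_tables: dict[str, str] = field(default_factory=dict)


def answer(topic: str, schema: dict[str, list[str]], ontology: Ontology) -> tuple[str, int]:
    """Return (table, tool_calls) for a question about `topic`."""
    if topic in ontology.known_tables:
        # Warm path: the table is already documented, so one call suffices.
        return ontology.known_tables[topic], 1
    calls = 1  # one call to list the tables
    for table, columns in schema.items():
        calls += 1  # one call to inspect this table's columns
        if topic in columns:
            ontology.known_tables[topic] = table  # record what was learned
            return table, calls
    raise LookupError(f"no table has a column named {topic!r}")


schema = {
    "users": ["id", "email"],
    "orders": ["id", "user_id", "total"],
    "subscriptions": ["id", "user_id", "churned_at"],
}
ontology = Ontology()
first = answer("churned_at", schema, ontology)   # cold: explores the schema
second = answer("churned_at", schema, ontology)  # warm: reads saved context
print(first, second)
```

In this toy setup the cold run takes four calls and the warm run takes one. The gap widens with the size of the schema, which is the effect the answer above attributes to the ontology.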
That’s cool. You’re seeing unbounded, infinite demand for business analytics. And a flywheel where more usage builds more context, which lowers cost, which drives more usage.
Right. And I’m dancing around your defensibility question because it depends on whether we hit AGI.
Some people in San Francisco believe we’ll have recursive self-improvement⁶ in a couple years, that soon an agent can one-shot build AWS in an afternoon. In that world, no software system is defensible.
Personally, I’m medium short AGI. In a non-AGI scenario, I think our product has about as much defensibility as Okta, or any SSO provider.
There are many threads to pull here. But let’s take a quick tangent… Why are you short AGI?
The smartest product people I know – Dax at OpenCode, Karri at Linear, David at Sentry – generally agree coding agents aren’t making their products better at a faster rate. And what is engineering productivity, if not improvement to product per unit time, per dollar?
The labs claim recursive self-improvement is three years out. But they have every incentive to say that.
Yes, models are getting better. Token consumption will continue to go up. But it feels like we’re asymptoting to intelligence that’s only as good as the smartest human.
That gets us a lot of productivity gain, but nothing close to the AGI capabilities that the labs are forecasting.
Interesting. I think about it a little differently. Recursive self-improvement needs a way for the system to grade its own work and try again. Today, the tech only works well on verifiable problems, where you can write a function that scores the output automatically.
Coding has a lot of those… Did the code compile? Did the tests pass? Agents can recurse on those sub-workflows, because the success criteria are measurable and internal to the system.
But that’s different from an agent intuiting what a sophisticated product looks like from a spec and building it. There, the criteria for “good” live outside the system – in user feedback, in markets, in the taste of the product owner.
Agents are getting faster at the verifiable parts. But maybe that’s not the real bottleneck.
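A minimal sketch of what "verifiable" means in this exchange: a scoring function that lives entirely inside the system, plus a loop that keeps trying until an attempt passes. The candidate functions below are hypothetical stand-ins for successive agent attempts at writing `add`. No model is involved, and the names `passes_tests` and `first_verified` are assumptions made for illustration.

```python
from typing import Callable, Iterable, Optional

Candidate = Callable[[int, int], int]


def passes_tests(candidate: Candidate) -> bool:
    """Verifier: an automatic check the system can run on its own output."""
    cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
    return all(candidate(*args) == expected for args, expected in cases)


def first_verified(
    candidates: Iterable[Candidate],
    verify: Callable[[Candidate], bool],
) -> tuple[Optional[Candidate], int]:
    """Try candidates in order; return the first that verifies and the attempt count."""
    attempts = 0
    for candidate in candidates:
        attempts += 1
        if verify(candidate):
            return candidate, attempts
    return None, attempts


# Hypothetical attempts an agent might produce for an `add` function.
agent_attempts = [
    lambda a, b: a * b,  # fails: 2 * 3 != 5
    lambda a, b: a - b,  # fails: 2 - 3 != 5
    lambda a, b: a + b,  # passes every case
]
winner, tries = first_verified(agent_attempts, passes_tests)
print(tries, winner(4, 5))
```

Nothing comparable to `passes_tests` can be written for "is this a good product?", which is the distinction being drawn here.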
I agree. Maybe framed differently… I don’t think Jira would be a better product if you gave Atlassian the firepower of 100,000 more engineers.
Exactly.
OK, tangent done. One last question on ontology… How do you balance good system hygiene and efficient retrieval – minimal redundancy, clean structure – as usage scales?
Redundancy is very underrated in agentic software. Our architecture actually rewards redundancy. In our ontology, concepts cluster in the same folders, files overlap, and the same idea often gets described from many different angles.
Our architecture operates a lot like how Google Maps arrives at a source of truth.
Google Maps doesn’t deduplicate or merge reviews that say the same thing. Redundant reviews actually give users richer context to learn about, say, a restaurant and decide if they want to go. And with enough volume, the full distribution of reviews reflects reality.
03 | How to beat an incumbent
Every AI startup has to take obsolescence very seriously, more than ever before. Models keep improving, and incumbents are shipping agents on top of their entrenched data and workflows.
In your category, Snowflake has Cortex and Databricks has Genie. Both answer natural-language questions over data. These products are single-warehouse and still improving on accuracy. But they’ll keep investing to make these products better, and they own the source of truth your product runs on.
Is there an architectural reason Cortex or Genie can’t catch you?
It’s a good question. You won’t see a massive performance difference using TextQL instead of Databricks – if all your data already lives in Databricks. But you won’t have to pay their markup for analytics.
Databricks and Snowflake built their businesses on captive storage. Put your data into their system, every query goes through their engine, and they charge a 9x markup. That gets too expensive when customers are expecting to run 6-7x more analysis over the next couple years.
Our unit economics are more efficient, so agents can hit those databases over and over without blowing out the bill. For Databricks to compete with us, they’d have to cannibalize their own revenue by reducing their take on analytics workloads.
What if they do? How do you win if they start competing on price in order to retain control over the analytics layer?
We have to build up enough critical mass, to win on volume. Because volume buys you the right to compete on price.
Our platform is built to surface low-hanging use cases and operationalize them fast. If our agents have a positive expected value, demand will grow. Similar to how Robinhood, by making it easier to trade stocks, encouraged people to trade more.
We have to get enough volume before the incumbents copy our model. Again, it’s a race against time.
It’s interesting. Every founder has a different theory of how to play the incumbents. You’re in the economies of scale camp…
Progress in tech means whatever you invent today becomes a commodity tomorrow. To stay ahead, I think you have to be the biggest dog in the fight.
What we have going for us – and what any founder taking on an incumbent should think about – is that we’re aligned with the hyperscalers.
Snowflake and Databricks sit between the hyperscalers and the customer and take a tax on the underlying compute. Their business is a markup on AWS. We don’t monetize storage, meaning our model drives more compute and storage to the hyperscalers.
Five years from now, I could see us on a porch, reflecting on how TextQL won. We would have delivered lower unit prices and higher volumes for analytics in a way that is synergistic with the hyperscalers. Those hyperscalers tipped the scales in our favor, obliterating the neoclouds, and we hung on for dear life as AWS dropped us into more accounts.
But that’s just one possibility.
04 | A framework for product excellence
You’ve written about the importance of product taste. The last 5-10 years of SaaS produced some general principles for product builders:
Opinionated defaults over lots of upfront configuration...
Feature and information restraint...
Standardization across the product, even for very different users...
What’s your framework for the AI era? As products are more non-deterministic, malleable, and cheaper to make, what do you think will separate exceptional AI-native products from the rest?
Man, I wish I had a prepared answer for that! I’m trying to think about what the world rewards now that it didn’t before, or vice versa.
Anything MECE or overly structured – where every input has to fit a neat category – will get heavily penalized. Both as a UI pattern and as a data model. Language models parse volume and redundancy just fine. You don’t need to pre-structure the inputs for them. I’m short any product built around forms or approval flows, and long anything that lets users offload content through text dumps or free-form notes.
I’d also look for hyper-personalized search. The same query from different people should return different results – not just based on who you are, but on what your whole team searches and uses, almost like collaborative filtering. The right architecture probably blends something like Cursor’s pre-built embedding index⁷ with Cognition’s on-the-fly parallel search.⁸
And last principle… everything should be configured as code. Every button click should map to something programmatic on the backend, something an agent can read, modify, and write.
Another framework I’ve been using is to look for a high ratio of non-deterministic to deterministic operations inside a product.
Most teams over-index on determinism because it’s safer and more predictable. But too much scaffolding limits what the LLM can do and can create technical debt as models improve.
I like products that get creative about pushing the edge of non-determinism. Where the deterministic parts are there mainly to feed the intelligence, in service of giving the LLM the best possible surface to reason over.
Yeah, I like that. That also made me think – I’m short specialized sub-agents.
Why is that? I think that’s a bit of a contrarian take. I’m seeing so many products use an orchestrator that farms work to sub-agents specialized for particular tasks.
The more customization that goes into a sub-agent, the less I like it.
Claude Code spins up sub-agents for parallel reads – five agents digging into a codebase and coming back with findings. I don’t count that architecture. That’s more of a search method than a sub-agent workflow.
But architectures where one sub-agent is optimized for research, one for analysis, one for writing – that’s losing the plot on the bitter lesson, the idea that general training beats methods that encode specialized human knowledge.
Too much agent customization, and you’re hand-engineering specialization at the exact moment models are getting better at handling everything in one token stream.
05 | Hot takes
Last question. You launched TextQL in 2022, and have had to navigate a lot of fast-changing dynamics.
Two reflections to close on…
One – what’s a thesis you held early in TextQL’s life that you’ve since changed your mind on?
I’ve done a complete 360 on evaluations. When we started TextQL, I thought evals were valuable, that we had to rigorously measure output and optimize against that.
But then we abandoned evals, which let us scale more quickly. Early in the product lifecycle, evals are overhead. The product is changing too fast. Shipping, listening to customers, and iterating is more effective.
As the product and market mature, evals start to matter again. At large token volumes, they help optimize cost-per-query, accuracy, or latency by providing systematic feedback.
Two – what’s something most people believe right now that will look patently absurd in three years?
That demand for AI will keep taking off indefinitely. In three years, I don’t think the big AI labs will still be growing 10x annually.
Why not?
There’s a ceiling on total willingness-to-pay in CIO budgets, and we’re not too far from it. Few people in San Francisco see the ceiling because they don’t spend time with the actual people writing the checks.
Yeah, inference is growing fast right now, but in the grand scheme it’s still too small to register. At some point, it has to translate into real and quickly felt ROI, or it dries up.
That’s right. We’ll see how it all plays out.
¹ P0 is engineering shorthand for priority zero, the highest priority bug, one that has to be fixed before anything else ships.
² ETL (extract, transform, load) is the standard process for moving data from one system to another. You pull it out of the source (extract), reshape it into the right format (transform), and drop it into the destination (load).
³ Rather than building a fancy retrieval system, this method involves writing the important context down in plain text files and letting the model navigate, interpret, and curate them on its own. Markdown files are plain text files with light formatting. Agentic search means letting the AI work through them the way a person would – opening files, searching for keywords, following references, taking notes, and reorganizing as it goes.
⁴ A knowledge graph is a way of storing information as a web of connections rather than as rows in a spreadsheet. Each entity (a customer, a product, a deal) is a dot, and the relationships between them (this customer bought that product, this deal belongs to that account) are lines connecting the dots. Neo4j is a popular open-source tool for building these.
⁵ Git-controlled means every change is tracked and reversible. RBAC (role-based access control) means permissions are tied to a person’s role. A sales rep sees different things than a finance lead. JSON is a structured data format; post-processed means it’s been cleaned and shaped for the model rather than dumped in raw.
⁶ Recursive self-improvement is the idea that an AI system could improve its own code or training, then use that better version to improve itself further, and so on.
⁷ Cursor’s pre-built embedding index works by scanning a codebase ahead of time and converting each chunk of code into a numerical fingerprint of its meaning. When you ask a question, it matches your query against those fingerprints to surface relevant code instantly.
⁸ Cognition’s parallel search skips the upfront index. When you ask a question, it fires off a flurry of keyword searches across the codebase at the same time – the way a developer would grep for terms, except multiple at once – and uses the results to find what matters.
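For readers who want to see the shape of an index-free parallel search, here is a small sketch using only Python's standard library. It is an assumption-laden toy, not Cognition's implementation: the in-memory `FILES` dictionary stands in for a codebase, and `grep` and `parallel_search` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "codebase": filename -> file contents.
FILES = {
    "billing.py": "def charge(customer):\n    invoice = build_invoice(customer)\n",
    "auth.py": "def login(user):\n    return check_password(user)\n",
    "reports.py": "def churn_report():\n    # uses invoice history\n    pass\n",
}


def grep(keyword: str) -> list[tuple[str, int]]:
    """Return sorted (filename, line_number) pairs for lines containing keyword."""
    hits = []
    for name, text in FILES.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if keyword in line:
                hits.append((name, lineno))
    return sorted(hits)


def parallel_search(keywords: list[str]) -> dict[str, list[tuple[str, int]]]:
    """Run one keyword search per worker thread, with no index built up front."""
    with ThreadPoolExecutor(max_workers=len(keywords)) as pool:
        results = pool.map(grep, keywords)  # results come back in keyword order
    return dict(zip(keywords, results))


found = parallel_search(["invoice", "login", "churn"])
for keyword, hits in found.items():
    print(keyword, hits)
```

The trade-off against a pre-built index is that nothing is computed ahead of time, so results always reflect the current files, at the price of doing the scan on every question.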


