A very obvious deal
Shopping for the Data Stack Value Realization methodology.
I sometimes wonder if it’s all Slack’s fault.
In 2012, before Slack existed, I worked for a would-be Slack competitor. We sold software in the way that was trendy at the time: People bought licenses to use it. Pay us $15, and one person can use it for a month. Pay us $30, and two people can. Pay us $15,000, and a thousand people can.1 And just as a landlord doesn’t care how much time someone spends in their apartment every month, we didn’t care what people did with our software, or if they even logged into it at all.2 In both cases, customers buy timed access. What they did with that access was irrelevant.
When Slack launched, they charged their customers in the same way. According to their first pricing page, “adding or removing team members during the term of a subscription will cause a one-time pro-rated credit or charge on your account.” But then, Slack blew up. The product—and this chart—was suddenly everywhere. And Slack, with their “be kind” brand and CrayolaCore aesthetic, decided that this old pricing model was capital-w Wrong:
Most enterprise software pricing is designed to charge you per user regardless of how many people on your team are actively using the software. If you buy 1,000 seats but only use 100, you still get charged for 1,000. We don’t think that’s fair. And it’s also hard to predict how many seats you’ll need in advance.
At Slack, you only get billed for what you use. So you don’t pay for the users that aren’t using Slack. And if someone you’ve already paid for becomes inactive, we’ll even add a pro-rated credit to your account for the unused time. Fair’s fair.
It was a savvy maneuver, for Slack. People were quickly becoming addicted to their product, and it’s unlikely that many of their customers were buying 1,000 licenses and only using 100. Instead, they probably had the opposite problem: More people wanted to use Slack than companies’ IT departments were willing to initially pay for. By charging for only the licenses that people used, Slack could tell those IT departments that they didn’t need to worry about overspending on Slack, because they couldn’t overspend on Slack. If nobody liked it, then they would stop logging into it and nobody would get charged for anything. And if lots of people liked it and used it all the time, wasn’t that worth a few dollars a month?
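To make the arithmetic concrete, here is a minimal sketch of the two billing models. The $12.50 seat price, the function names, and the simplification that an inactive seat costs nothing for the whole month are assumptions for illustration; this is not Slack's actual billing logic.

```python
def per_seat_bill(seats_purchased: int, price_per_seat: float) -> float:
    """Classic licensing: pay for every seat bought, used or not."""
    return seats_purchased * price_per_seat


def active_user_bill(active_users: int, price_per_seat: float) -> float:
    """Simplified 'fair billing': pay only for the seats that were active."""
    return active_users * price_per_seat


PRICE = 12.50  # hypothetical monthly price per seat

# The scenario from Slack's pitch: 1,000 seats bought, 100 actually used.
shelfware_old = per_seat_bill(1000, PRICE)
shelfware_new = active_user_bill(100, PRICE)

# The likelier scenario: IT budgets for 100 seats, but 1,000 people log in.
viral_old = per_seat_bill(100, PRICE)
viral_new = active_user_bill(1000, PRICE)

print(f"Shelfware: ${shelfware_old:,.2f} per seat vs ${shelfware_new:,.2f} per active user")
print(f"Viral:     ${viral_old:,.2f} per seat vs ${viral_new:,.2f} per active user")
```

In the first scenario the customer saves money. In the second, the cap that IT set on spending disappears, which is the scenario that mattered to Slack.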
It was an ethos that ate the internet. Silicon Valley is an industry of trends, and Slack was our generation’s trendsetter. Shortly after they launched their “Fair Billing Policy,” other startups launched their own imitations. At Mode, our customers, many of whom had recently bought Slack, started to ask for the same thing. “Why should I get billed for something if I didn’t want to use it that month?” people said, for the first time. “That’s not how this should work,” they said of how it always worked.
Software pricing is like that though. When you’re selling virtual ephemera on the internet, there is no obvious way to buy it, and there are no physical or economic laws for how it should work. Instead, it’s almost entirely driven by norms and market expectations:
At first, people bought software by paying for a CD-ROM (or worse) in a box, and got to use it as much as they wanted, forever.
Then, we began to rent access. Rather than installing software, we paid monthly fees to use a website.
Next—in part because of companies like Slack—people started expecting to only pay for the days or hours that they used a product.
And now, prices are even more granular, and are defined by what you do on a website. Though usage-based pricing models have been around for some time—cloud computing services like AWS charged people for how much traffic they handle; storage apps like Dropbox charged for how big of a hard drive you want to rent; web tracking products like Mixpanel charged more if you wanted to track more events—they’re starting to show up in SaaS products too.
These trends were especially apparent in data products. People used to buy databases via perpetual licenses that they paid for once and could use forever. Then, Amazon launched Redshift, which was offered as a monthly lease of a dedicated database. Snowflake shortened the term of the lease—instead of renting a database for a month at a time, you leased it by the minute.3 BigQuery then began charging by query, billing customers for every byte they asked BigQuery to process.
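A rough sketch of how the same month of work gets billed under the last three of those models. Every rate and workload number below is made up for illustration, and none of them is a real price from Redshift, Snowflake, or BigQuery; the function names are hypothetical too.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """One month of database usage. All values are invented."""
    days: int
    active_minutes_per_day: float
    bytes_scanned_per_day: float


def monthly_lease(rate_per_month: float, workload: Workload) -> float:
    """Flat rent: the bill ignores what the workload actually did."""
    return rate_per_month


def per_minute(rate_per_minute: float, workload: Workload) -> float:
    """Rent by the minute: pay only while the database is switched on."""
    return rate_per_minute * workload.active_minutes_per_day * workload.days


def per_byte(rate_per_terabyte: float, workload: Workload) -> float:
    """Pay per query: the bill scales with the data each query processes."""
    terabytes = workload.bytes_scanned_per_day * workload.days / 1e12
    return rate_per_terabyte * terabytes


month = Workload(days=30, active_minutes_per_day=60, bytes_scanned_per_day=2e11)

lease_cost = monthly_lease(1000.0, month)   # hypothetical flat monthly rate
minute_cost = per_minute(0.05, month)       # hypothetical rate per active minute
byte_cost = per_byte(5.0, month)            # hypothetical rate per terabyte scanned

print(lease_cost, minute_cost, byte_cost)
```

Only the last function has an input that corresponds to something the customer did, rather than how long something was turned on.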
At first glance, these steps seem like a refinement of the former model. Each progression adds smaller intervals under the demand curve. Pay for exactly what you use—it is pure; efficient; the markets, clearing.
Which, maybe; I don’t know; sir, this is a Substack, not an NBER paper. But if you spend some time selling all this virtual ephemera on the internet, you’ll probably discover at least one thing: Regardless of the economic theory of each pricing model, there is a psychological discontinuity between model three and model four. In the first three models, it is hard to reason about how much some software service should cost. What is the right price for a license to use Slack? What is a Google Docs subscription worth? How do I put a price on Spotify? These are esoteric, almost philosophical questions. What is the value of corporate communication? Of an infinite library of documents? Of listening to Gracie Abrams4 on a loop for a month?5
We don’t know, so we price software via dead reckoning: What’s fair today is what was fair yesterday. Outlook cost $12.50 a month, so Yammer cost $15 a month, so Slack cost $12.50 a month.6
But in the fourth model, this breaks down. When you’re selling a single unit of consumption, people seem much less willing to accept an arbitrary price. And they start asking one question in particular: “How much does this thing I’m buying cost you?”
Consider: A customer could say to Slack, “It costs you a few cents to add another person to your user database and to store the thousand messages that they send every month. Why are you charging me $12.50 for that?” Historically, people don’t ask that though—or at least, don’t complain about it too much—because that’s not how the world works. Arbitrary monthly licensing fees might not be an economically precise pricing model, but, when everyone is used to paying them, it’s a psychologically durable one. But if Slack started charging per message sent, people would start talking about the egregious markup. “It costs Slack a fraction of a penny to send a message, and they’re charging a full penny for it!” There would be righteous online riots about price gouging.
There is nuance here though. If Slack charged incremental fees for storing files, there would probably be protests, but tamer ones. Because, again, we’re used to that. Storage is a thing we long thought of as a scarce resource; we still pay more for computers with more memory; conceptually, storage feels like a fair expense.
We saw all of these dynamics at Mode. When we charged a monthly licensing fee, our first price was $250 for technical users. People were upset by that—not because of the titanic markup we charged on top of a website that cost pennies to provide to an additional user, but because $250 a month for SaaS software was abnormally high. We eventually changed our prices to about $25 for all users. Then, when we added a fee of a few cents to run an additional query—running a query was our equivalent of sending a message on Slack, and also cost us a fraction of a penny—people were outraged by our brazen margins. But later, when we added computational middleware—“an in-memory compute engine”—that made running queries a plausibly expensive operation for us to perform, people still objected, but most customers were ultimately ok with it.
And that’s the lesson, I’d argue. If you can sell subscriptions to your software, you won’t get asked about margins, but you only have so much flexibility about what you can charge. And if you sell consumption, you better charge for something that sounds expensive. Storing stuff works. Doing a bunch of hard math works. Generating an AI image works. But rendering a website, or calling an API, or managing a database of files and messages—well, “we don’t think that’s fair.”
This is why dbt Labs has always been a fascinating company to me (and why it will be good for that HBS case). Because, stylistically, here is the position that it’s always been in:
They provide a service that lots of people want to use. They built a product, the market wants that product, and dbt Labs quite successfully delivered it to them.
But how do you charge for it? Even if you put aside the dilemma about open source, there isn’t a clean pricing mechanic. You can charge for seats, but not that many people use dbt. You could solve that problem by charging a lot per license, but people start to balk at any seat price that’s more than two figures a month.
Or, you can charge for usage—that is the trend, after all, and people use dbt a lot. But that doesn’t really work either, because dbt hasn’t historically done anything that looks like real work. It doesn’t store data; it offloads all the hard computation to a database (which is already charging people for that exact operation). And if you’re not doing the thing that feels like work, people get mad when you charge them for it.
It’s a very unique set of rocks and hard places: A popular product, with no obvious way to sell it. And for years, I assumed the way out was for dbt Labs to attach itself—via acquisition or a series of white-labeled OEM deals—to the databases that had more direct ways to make money from people using dbt:
Databricks solves the riddle of dbt Labs’ business model. Databricks can offer dbt as a free, unmetered service. It wouldn’t care if you use the open-source version or dbt Cloud, nor would it worry about how many seat licenses you buy. This frees up dbt Labs to focus on what it does best—driving adoption of dbt’s core services.
It’s an obvious solution: if you can’t monetize your own service, find someone who can, and get a shared bank account.
Of course, when it happened, people also said that combining dbt with Fivetran was obvious. It was peanut butter and jelly. It was one-third and two-thirds of a three-letter acronym. It just made sense.
Spiritually, absolutely; both companies are from the same generation; of the same religion; they went to the same high school. Logistically, as a merger to make a one-stop shop for data services—the Atlassian for Open Data Infrastructure, the Adobe for data people, the Bean Counter Cloud—I can see that too. Financially, as a way to combine two IPO-ish scale balance sheets into one; makes sense. Defensively, as a means for creating a business big enough to stand its ground against empire-building companies like Databricks and Snowflake; sure, why not?
But, as a solution to the pricing problem—the core dbt problem—that story seems harder to tell. Fivetran can’t make money off of the queries that dbt generates, nor can dbt transform their popularity into more volume for Fivetran. The two businesses are loosely synergistic: They indirectly help one another, because Fivetran brings more raw data to dbt and dbt makes Fivetran’s raw data more useful, but that was true when they were independent.
Which doesn’t mean the deal doesn’t make sense—those other benefits are there, and it gives both companies one less potential competitor in a compacting industry. Still, it’s not quite 1+1=3, and M&A bankers love 1+1=3.
There are a couple obvious options though. One is for the combined company to use its weight and position as the data department store to become the bully in the industry:
fivetran and dbt, the two largest players outside the data warehouses, are merging to flip the script: move value capture back into business logic. what is business logic? it’s the finite set of if/else statements that define your business. sql pipelines are endless conditionals that say “if customer did x, then calculate y, and route to z.” those transformations, rules, definitions of what revenue means and who counts as an active user and how to segment customers - that’s your actual business encoded in code. …
[Snowflake and Databricks convinced everyone] that compute should capture all value, that business logic should be free. … by merging, [Fivetran and dbt Labs are] trying to have more firepower to make compute cheap and commoditized, and move value capture back where it belongs: business logic.
I guess it could work? But, this doesn’t solve the psychological issue: What, exactly, does dbt charge for? You can’t bill for lines of business logic written. I suppose you could charge for a giant platform license—but that is the way we sold software decades ago. The trend is towards charging for consumption, and databases feel like they have more right to charge for that than SaaS applications that use databases.7
A second option is for dbt and Fivetran to fill in the hole in the diagram, and become a database. Which, also, could work? But that’s a big risk. Snowflake and Databricks aren’t huge companies because they figured out that there is money in building a giant enterprise database; everyone has known that for decades. They’re huge because they actually pulled off the very difficult thing of building a giant enterprise database. There is a big difference between a clever idea and a hard idea, and building a database is very much a hard idea.
Still, perhaps there is a third option—stolen, in true Silicon Valley fashion, from our latest trendsetter.
Here is one way to think about how Cursor works:
They built a very popular app for writing code with AI. Initially, most of the work that Cursor did was actually done by Anthropic (or OpenAI, or Google, or whatever): You told Cursor what code you wanted written, Cursor would wrap some prompts around your request, send it to Anthropic (et al), and Anthropic wrote the code—i.e., it did the thing that sounds expensive.
Cursor became very popular. But its strategic position was somewhat unsustainable, because all the money that customers spent to write code went to Anthropic rather than Cursor.
In doing this, Cursor noticed a pattern. A lot of the requests they were sending to Anthropic were relatively simple. They could be solved with simple models, and didn’t require giant state-of-the-art LLMs.
The opportunity, then, is straightforward enough: Raise a bunch of money and make a mini-model that handles the simple stuff. Don’t compete with Anthropic exactly; instead, just use Anthropic for less. Make it about saving customers’ money by choosing more efficient ways to write code, and then charge your customers when they invoke your high-volume, low-cost models.
The analogy is obvious. Cursor is to Anthropic as dbt is to a database. dbt is the interface; the database “does the work.” dbt has people’s attention; the database gets the money.8 And—most notably—most of the queries that people run are small, and don’t require giant databases to execute.
So, you know. Put a database underneath dbt Fusion. But a small one; one that’s not about competing with Databricks and Snowflake, but about saving customers’ money by choosing more efficient ways to run queries. It’s about doing what’s more efficient. It’s about Fair Pricing.
If only there was a company that was about Small Data.9
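For what it's worth, the routing idea is simple enough to sketch. This is a toy under stated assumptions: the 10 GB cutoff, the engine labels, and the idea that you can estimate bytes scanned before running a query are all hypothetical, and nothing here describes how dbt Fusion or Cursor actually work.

```python
from dataclasses import dataclass

# Hypothetical cutoff: anything expected to scan under ~10 GB counts as "small".
SMALL_QUERY_BYTES = 10 * 10**9


@dataclass
class Query:
    name: str
    estimated_bytes_scanned: int


def choose_engine(query: Query, threshold: int = SMALL_QUERY_BYTES) -> str:
    """Route small queries to the cheap engine and the rest to the warehouse."""
    if query.estimated_bytes_scanned <= threshold:
        return "small_engine"
    return "warehouse"


def route_all(queries: list[Query]) -> dict[str, list[str]]:
    """Group query names by the engine they would run on."""
    plan: dict[str, list[str]] = {"small_engine": [], "warehouse": []}
    for query in queries:
        plan[choose_engine(query)].append(query.name)
    return plan


# Made-up queries with made-up size estimates.
queries = [
    Query("daily_signups", 200 * 10**6),
    Query("active_users", 3 * 10**9),
    Query("full_event_history", 4 * 10**12),
]
plan = route_all(queries)
print(plan)
```

The business model lives in the first branch: whoever runs the small engine gets to charge for it.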
Well, no, that’s not quite right. Most SaaS software vendors offer discounts if you buy in bulk, so if you paid us $15,000 a month, you’d probably get something like 1,500 licenses. And if you paid us enough—say, $100,000 a month—we’d probably give you an unlimited number of licenses.
That’s not quite right either. We also cared a lot about how much people used a product, but it wasn’t because we charged for usage; it was because unused licenses were unlikely to get renewed. If we sold 1,000 licenses and only 100 people used it, most customers (though not all! You’d be surprised!) would notice that, and wouldn’t buy 1,000 licenses again next month or year. And the entire economic apparatus of a SaaS business depends on customers buying more services every year, not less.
Strictly speaking, Snowflake doesn’t charge for consumption. They charge for the amount of time you keep the database active; what you do while it’s active doesn’t matter. It’s similar to Slack in this regard—they refined the terms of the subscription, but didn’t charge for actual activity.
How philosophical is SaaS pricing? Slack’s guide to understanding the value of Slack includes the following phrases:
“What is value…?”
“...a value-centric partnership…”
“...we have a dedicated Value Realization team…”
“...the Slack Value Realization methodology…”
“Build a value map”
“...value stories…”
“...value strategy…”
“Start your journey toward becoming a value expert”
And, in a telling coincidence, after all of that, uh, rigorous study, Slack found that the correct price of their revolutionary new communication platform was…exactly the same as what their older competitors charged.
Making compute a faceless commodity doesn’t really solve this either. All that does is drive the margins out of the database; it’s not clear that it pushes those margins back into the software that’s on top. “We own more of the pie by shrinking everyone else’s slices of pie” is great for customers, but not so great for anyone with pie.
Cursor charged their customers and then paid Anthropic to execute their requests, whereas dbt connects to customers’ databases, who then bill customers directly for dbt’s usage of the database. The result is the same though; the money ends up with Anthropic and the database, not with Cursor or dbt Labs.

I mean, just put an invisible duckdb underneath and call it a day.
People don't like to admit it because it doesn't look good on an architecture diagram, but most BI workloads in Tableau, PowerBI and others are using local database extracts because they use super fast analytical databases (Hyper & Vertipaq) and are essentially free with the licensing you already pay your BI vendor. They are faster and cheaper than a metered billing database. OMG, you are copying data from the central repository??? What kind of antipattern animal are you?? This creates uncontrollable data sprawl. You should connect your dashboards live to our expensive database and 50x the cost of running a dashboard. Your users will get used to their analytics taking 30s to load instead of 5!!!
Charging for cheaper compute is the obvious next move here. You hit the nail on the head.