From the time that I started this blog in early 2021 through November of 2022, 48 percent of the posts I wrote were about the modern data stack. Ideally, I’d say that those posts were inspired by a passion for modern software, or data technology, or stacks. That I was, as Richard Feynman advised, studying hard what interested me, in the most undisciplined, irreverent and original manner possible. That I had other work to do, but for several years, the only thing I actually wanted to do was analyze ~~Buffy the Vampire Slayer~~ the economic and technical dynamics of cloud-based data tooling startups.
But none of that would be quite true. The real reason I barfed up predictions and complaints about cleverly branded computational corporate middleware every other week for two years was that other people were doing the same thing. The ecosystem around the modern data stack was a club, and the club was popular.[1] The club was full of good gossip.[2] The club was where people were having fun and getting rich and becoming popular, and I wanted all of those things too.
Today, as lots of people have noticed, that club isn’t what it used to be. On November 30, 2022, ChatGPT came out, Silicon Valley lost its mind, and the tech world’s simmering interest in generative AI became an immediate obsession. The modern data stack—as a collection of tools, as a philosophy, as a brand, as an object of interest on the internet, as an era of launches on Product Hunt—became the millennial’s Hydro Flask to the zoomer’s Stanley: Out of date, and out of demand.
And I, ever the aimless moth to the brightest light, followed the mob. Before November 30, I wrote one post about AI. After, I wrote eighteen. In the last thirteen months, 31 percent of this blog has been about AI—and even more tellingly, only 29 percent of it has been about the modern data stack.
My attention drifted because I, like a lot of the people who were at this party, was never fully here for the right reasons. I wasn't here to find true love in directed acyclic graphs or data observability; the modern data stack was never the one man for my whole life. Instead, writing about the modern data stack was a way to get famous, and the way my manager told me to promote my single. The draw was its popularity, and the audience and attention that came with it.[3]
Now that audience is largely gone. The stars of the Bachelor are off the air, pivoting, and consolidating. And the cool kids are moving on to dating shows about AI: The headline on Snowflake’s website says “AI/ML is easier in the data cloud;” on Databricks’ homepage, it says “Your data. Your AI. Your future.”
For thrill seekers and clout chasers, be they bachelors looking for fame, opportunistic founders looking for a quick payday, or yellow journalists looking for clicks, there’s no reason to be here without the audience. The party was the point.
But for people looking to build durable businesses, the hype has always been a double-edged sword. On one hand, it brought a lot of money and innovative people. Companies were able to raise huge amounts of cash to build a bunch of new stuff, and customers were interested in trying it all out. We built SQL-powered BI, and search-powered BI, and dbt-powered BI, and notebook-powered BI. We tested different flavors and frameworks for orchestration. We tried to replace big OLAP databases with Spark databases, and streaming databases, and data lake databases, and small databases. We did ETL in batch, in real time, and in reverse. More recently, we tried gluing AI onto all of it. Just as the bachelorette can speed run through a lot of dates, a hyped industry can speed run through a lot of experimental products and technologies. It’s not the most financially efficient way to figure out who you want to marry or the best architectural framework for orchestrating an operational data pipeline, but it’s the fastest.[4]
On the other hand, the frenzy is destabilizing. New ideas and standards are churning in and out of vogue; the buzz creates distractions; customers struggle to distinguish between what is valuable and what is hype. The ground on which companies design and sell their products shifts under their feet. Nothing built before 2010 was built for the cloud; nothing built before 2013 was built for cloud data warehouses; nothing built before 2016 was built for dbt; nothing built before 2020 was built for data lakes; nothing built before 2022 was built for AI. Every company and product had to make a bet on which trends would endure—ELT over ETL? Jinja? Open-core products? The popularity of Redshift? Data meshes and contracts?—and hope that they got both their products and their market predictions right. Because of that, hype cycles are great for building, but not great for enduring. Eighty percent of the marriages on the Bachelor and Bachelorette end in divorce, compared to “only” fifty percent of everyday marriages.[5]
Though it can be demoralizing for the air to leave the room, there’s a lot of opportunity in the slowdown. Startups just need to change their tactics. Don’t build something new, or go after major incumbents—the wilderness is too hard to tame and the cities are too hard to conquer without a lot of money. The better targets are the helter-skelter frontier towns, built by frenzied founders who wanted to stake their claim on any piece of open ground they could find. Some of these companies—a number of modern data stack startups, basically—uncovered valuable ideas, but built them for a different economic and technical era. They were founded when startups were encouraged to raise tons of money and when the only metric that mattered was growth. They were founded before dbt was popular, or before DuckDB existed, or before companies were especially vigilant about cloud data warehouse costs. But now, because the energy is gone—because a lot of the Bachelorette contestants have moved on to the next season—things are more stable.
In other words, to build a great data business, today’s startups don’t have to come up with particularly novel ideas or get a bunch of bets about the market right. They just have to rebuild what’s already there, but more deliberately and on more certain ground than the original pioneers could.[6]
Omni and SQLMesh are two companies that seem to be doing exactly that. Omni is, almost literally, a Looker rerun, with several of the same founders and investors. Though there are minor structural differences between the products, Omni’s primary pitch is about improved usability. It’s mostly the same thing, just polished and built for 2022 instead of 2012. Similarly, SQLMesh aggressively markets itself as a renovated dbt that’s fundamentally the same product—in-warehouse SQL-based transformations with a web IDE—but built with the advantages of knowing what people liked about dbt, and what they struggled with.
Of course, there’s no guarantee that these companies will be successful. Looker and dbt are popular, well-resourced, and can do plenty of remodeling themselves. But the strategy—build on a competitor’s land, without the scars that come with settling the territory—seems sound.
There are a lot of other startups out there that are overburdened and under-built. They’re overfunded and underwater; in business but out of date; in debt but out of energy. There are businesses that were created for the modern data stack’s party, and now have to endure the slog of startup life without the excitement of the bright lights that used to come with it.
So, it’s time to build.[7] Not me, not on this blog—this is a tabloid for promoting my mixtape. But for everyone else, this is a real moment of opportunity. In ten years, data teams will have their Notion, their Linear, and their Figma—second and third acts that had the luxury of time and relative tranquility compared to pioneering predecessors. The enduring modern data stack remains to be built. It’ll just happen in the dark.
Clique
Name this company, from the script of their recent rebrand reveal video:
Data unlocks possibilities. It drives strategic decisions. It fuels innovation. It's this promise that propels us to solve challenges with data. Data is our rich history. Data is the opportunity. We've reimagined the future.
X™
We are data-obsessed. And listen closely to create impact. We're setting the stage for the AI-powered enterprise. Moving data from any source to any target. Using analytics for rapid insights to drive informed actions. Turning data into outcomes. We're trusted by 40,000 organizations who understand wherever there is data, there is opportunity, possibility, power.
Wherever there is data.
X™
Ok, I don’t know, sure. But I guess it moves product? Because this idea—that “wherever there’s data, there’s power,” which is the headline on the same company’s new website—is the animating sales pitch of this entire industry. Case in ten points: Last fall, I went to a data conference, and, struck by the mimetic marketing glaze that was painted over the whole thing, took pictures of the taglines that companies used to advertise themselves at their booths. Among them:
Bring life to data
Master your data to accelerate business outcomes
Master data, empower everyone
Set your data free
Data to the people
Unlock enterprise data for modern analytics
Unlocking the value of customer data
Unlock the power of your data
Maximize the value of your data
Helping you make the most of your data
This sort of message has become so ubiquitous that it’s become our water. We scarcely even notice it, much less recognize that there’s a fairly radical claim behind it: That data—What data? Doesn’t matter. All data. Customer data. Enterprise data. Your data.—is full of potential energy. That data, as a matter of faith, is valuable.
But why do we believe that? My point here isn’t to question our religion—it may well be right!—but to question its origins. It can’t be because data is universally useful. Quite famously, a lot of companies struggle to get that much value out of it; much of our industry is also built on selling that challenge. Somehow, though, that gives us more conviction in our creed—that the problem is their atheism, and if only they can be taught our cultural rituals (and, of course, buy our relics), their data can also be unlocked, maximized, set free, mastered, and brought to life. And ironically, some of the biggest advocates for the power of data are inverted doubting Thomases. Their personal experience—at companies with underfunded data programs, recalcitrant execs, and immature data cultures—should lead them to doubt data’s usefulness. Yet, many are our strongest soldiers, fighting the hardest against the false idols of intuition and experience.
So where does this faith come from? Nate Silver? A few legends from Facebook and Target? Moneyball and Brad Pitt and Jonah Hill? Because we saw ourselves as math people? Or have we just been worn down by the same story about data’s inevitable usefulness for long enough that we now accept it as canon?
I don’t know. But perhaps that’s the best way to interpret Qlik’s generic rebrand (Qlik is the company; X is Qlik). Churches would never survive if they promised specific miracles. They survive because people believe that miracles are possible. And hammering everyone with constant messages of faith that are presented as statements of fact—that Jesus died for our sins; that Muhammad is the messenger of god; that the Torah is a divine covenant; that wherever there’s data, there’s power—is the very thing that makes people believe.
[3] benn.substack is step one; benn.buzz is step two. So far, it’s all going according to plan.
I mean, no, this is all kind of a joke. Writing long-form blog posts about databases won’t make you famous; my manager never sent me here to do this, and would probably prefer I just put my head down and did real work. This blog isn’t that cynical. But, the parallels between Bachelorette contestants and tHoUgHt LeAdErShIp are both funny and broadly true. My eighth-grade self didn’t dream of being a “leader in the dashboarding space;” my interest in writing a blog about data tooling is more general—about economics and finance and social dynamics and gossip—than specific to this market; attention is an addictive drug; and the marriage between me and the modern data stack, though ongoing, is unlikely to be ‘til death do us part.
[4] And, for better or for worse, in both cases, some distant third party is paying for most of it anyway.
[5] Two points about these numbers. First, about twenty percent of both bachelors and bachelorettes are still married to contestants on their show. But, while all of the bachelorettes are married to the man they chose at the end, four of the five still-married bachelors are married to a different contestant. They chose one woman on the show, and, shortly after, changed their mind. I’m sure there’s some statement about the shallowness of man in that.
Second, if you google “how many marriages end in divorce,” the top results are all from law firms. So I have no idea if fifty percent is right, or if it’s just Wilkinson & Finkbeiner doing demand gen.
[6] This is a pretty standard development pattern for products, and it probably works for broader markets too.
[7] Software ate American dynamism, but now it’s time to build techno-optimists. (Again, this is a joke, but it also kind of works? Software got very popular and lucrative; it contributed to eroding the prestige of manufacturing. Now, the a16z stance is that we should be more optimistic about what technology can do, and build things like airplanes and medical devices instead of SaaS software and iPhone apps.)
at some point every friday afternoon everyone on my team spends like 10 minutes talking about your latest blog post
Yet another long-time data company preaching the faith:
"We empower businesses to realize transformative outcomes by bringing their data and AI to life. When properly unlocked, data becomes a living and trusted asset that's democratized across the organization."
TO LIFE I SAY! TO LIFE!!!
https://youtube.com/clip/UgkxggSu3VV65rcBkGc8DLgveYluORVgMSiM?si=5wepS-rctZhpH8-X