A strange thing happened, during the opening keynote of the fourth edition of Coalesce, dbt Labs' annual conference that was held this week in San Diego. For ninety minutes, through stories about the past, present and future of dbt; through several product releases and roadmap announcements; through interviews with customers—through all of that, at a tech conference, about data, in the year 2023, there was never a single mention of AI.
There were no AI-powered features. There were no chatbots; no code-assistants; no promises that documentation will one day write itself. There were no lofty predictions about how we’re entering a new technological era, and no dire warnings about how companies that ignore this generational wave will be left behind. There weren’t even any jokes about how AI is all we ever talk about.
Instead, there was a discussion about process. dbt Labs launched an architectural framework called dbt Mesh, which is as much an operational best practice as a product update. They rolled out new capabilities to help data teams standardize their deployment processes. At the end of the keynote, a customer shared a story about how they solved their data governance and reliability problems with a combination of dbt features, organizational updates, and improvements to their collaborative processes.
As an occasional advocate for boring things,1 I’m probably obligated to say that this is a good thing. The problems that data teams deal with aren’t likely to be solved by some fancy flourish of AI, or by a wrapper around ChatGPT. They’ll be solved by a bunch of people doing the tough legwork that’s necessary to gradually push us all forward. And the features that dbt Labs released this week are that sort of legwork.2
People who work in finance periodically joke that a lot of other people who work in finance are making bad decisions right now because they’ve never worked in an era of high interest rates. Though rates aren’t that high by historical standards, they’re nearly double what anyone with even a decade of experience has seen in their professional lives. A 34-year-old trader might know, abstractly, that interest rates could go as high as five percent, or even well higher, but they’ve only read about it in textbooks.3
There’s arguably a reputational version of this phenomenon in the data industry. For the last ten years, we’ve all been told how important our work is. McKinsey and HBR constantly wrote urgent reports about how there weren’t enough of us to satisfy the explosive demand for the skills we had. Our careers had a sense of open-ended potential. We could become the next business leaders, the next CEOs, the next GMs, the next founders of companies that VCs would pour millions of dollars into. If our work was tinged with drudgery, that was ok; our resumes were still sparkling. We were living in New York or San Francisco; it was expensive and cramped, but we were at the center of the world.
It’s been a tough year for that narrative. Our careers, once inevitably important, suddenly seem more expendable. Data teams have been hit by waves of layoffs. The “modern data stack” went from being an essential buzzword to an outdated pariah. Our main-character strut is turning into a slump: At Coalesce, several VCs told me they weren’t very excited about any of the early-stage data startups that they’d met recently. The big opportunities are elsewhere, they said.4 San Diego was fitting: Data was moving to the periphery.
Though we talk a lot about what might happen when the world’s money moves on from our industry, we don’t often reckon with what might happen when the world’s attention moves on as well. As I was sitting in the audience at Coalesce, listening to the necessary-but-dull things that need to be done, a thought occurred to me that I had previously missed: Do I actually want to do any of those things?5 To what extent am I here because of the day-to-day work, and to what extent am I here, working in that darkness, because I believe that it will one day put me in the light? And what happens if the promise of that light drifts away?
The hotel convention center in which Coalesce was held apparently does good business. Over the course of the past week, an association of police chiefs came and went; yesterday, the hotel lobby was being redecorated with banners from the American Society of Emergency Radiology; thousands of fundraising professionals are on their way this weekend.6
I found a few pages of notes from some prior event in a lectern in one of the conference rooms. The document—a “Summary of the eCOA Signature Discussion”—opened with four bullets:
• The expectation is that there is a link between the evidence and the reviewed data. With DCT and new technologies, there should be a change in behavior to ensure review and evaluation of reported data is done in a timely manner and evidence of such review is retained.
• The worst that can happen is that the box is checked, and the investigator has never logged in to read the eCOA data. Or the box is checked but changes are made after.
• In the best practices document, we discourage using the eCRF signature as evidence of the eCOA data review. There is no linkage between the eCOA data and such signature.
• The evidence of oversight on eCOA data seems to be more important when the data has been generated by a site person, delegated by the investigator to create entries.
You could reasonably argue that the data industry maturing into something more rigorous is a good thing. You could reasonably argue that we need rules, regulations, and our own form of GAAP standards.7 You could reasonably argue that the hype and gold rush of the last decade has attracted opportunists and attention-seekers who aren’t here for the right reasons, and it’s well past time to take away the punch bowl. You could reasonably argue that it’s time for us to become professionals, like everyone else.
But if that’s where we end up, I hope that this document—this calcified fossil; this real-life TPS report; this dire warning about the worst thing that can happen: A box on a compliance form is checked in the wrong order—is not indicative of who we have to become, and how we have to talk about what we do.
As we all left Coalesce this year, the prevailing whispers in the hallway were that it was gradually becoming less of a community meetup, and more of a corporate marketing event. There were fewer practitioners and more vendors, people said. There were fewer product releases for open-source enthusiasts, and more for large enterprise buyers. There were fewer technical talks and more sponsored infomercials.8 Nearly everyone I talked to said that, while they’d be back next year, they hoped that there would be more practitioners, more technical content, and more community events.
Bluntly, I hope that dbt Labs doesn’t listen too closely to that feedback. To me, Coalesce has never been about the content; not really. There are other meetups that are less corporate; there have always been other conferences that are more technical; there are other vendors and products that exist exclusively for the practitioners in the trenches.
Coalesce has been singularly unique for a different reason—it was an unabashed celebration of its attendees. The content of the conference was simply a vehicle: It was a lens through which people could see themselves in one another; it was an excuse for all of us to brag a little about the obscure or mundane accomplishments that nobody else would understand; it was a way to create a cheering section. People may have come to Coalesce for tips and technical tricks, but they left with moments—moments of connection, elevation, and pride, created by people who refused to be quiet in their appreciation of one another.9
Technical content alone can’t create these moments. I’m guessing that the eCOA signature discussion was a useful conversation, full of valuable tips and actionable takeaways—and I’m also guessing that nobody left that session feeling anything.
I don’t know if Coalesce was intentionally engineered to be different, or if it simply evolved that way, because of its origins on Slack and as a virtual event. But as data becomes less of a fad and more of a profession; as our work becomes less lofty and more legwork; and as data recedes from being a main character and becomes a supporting part, I hope that future versions of Coalesce make explicit efforts to create as many moments for as many people as past versions have.10 Because nice as they are, parties, and advice, and a bunch of LinkedIn connections aren’t that special—what’s special is giving people who are increasingly working in the dark their moment in the light.
An update from last week
After the post last week, a number of people reached out to me with stories to share about the war in Israel and Gaza. Though my intention was to include them in this week’s post, my obligations for Coalesce ended up taking more time than I thought they would. I still plan on sharing those stories (pending folks’ permission), and sadly, it looks like they’ll be just as relevant next week and in the weeks after as they were two weeks ago.
So, to those of you who sent me messages, thank you. And if I haven’t gotten back to you yet, I will.
1. And for The Process, and for products that make processes better.
2. Will they work? I have no idea. But they’re much more likely to work than a flashy dbt chatbot that serves no purpose other than to be marketable as dbt AI.
3. “Do you think you know what it's like to be Paul Volcker because you read about him in a book?”
4. Yesterday, we felt like dancing; today, we’ll watch the world pass on by. (But seriously, why are you reading this post, vert1go vol. 1 came out like six hours ago, get out of here and go listen to it.)
5. Hahaha no, nobody lets me do any of these things anymore. But do I want to write a blog about organizational architectures for designing durable data governance frameworks? Do you want to read a blog about organizational architectures for designing durable data governance frameworks?
6. A cop, an analytics engineer, an emergency radiologist, and a fundraising professional walk into a hotel bar…and they’re all a joke, because they’re all still wearing lanyards and corporate backpacks.
7. Is “GAAP standards” like “ATM machine?”
8. These things may or may not actually be true. It’s possible that this edition of Coalesce had more community talks than any other, or the highest ratio of practitioners to sponsors. But, for better or for worse, how people feel about these things matters more than what is actually true.
9. You could argue that this is actually bad, and that it oversold an industry that was bound to come back to earth. Which, ok, but, whatever. My point here is that being celebrated is a rare and powerful thing. People can come to their own conclusions as to whether or not what data people do is worth celebrating (though if it’s not, the solution is rather obviously not to do the thing at all, not to not celebrate it).
10. The good news is that I don’t think that the predictable and understandable evolution of Coalesce (“as they add more enterprise clients, community-led product development will give way to dollar-driven roadmaps and exclusive advisory boards”) is incompatible with this.
So let me be more specific. The way I’m seeing Data Factory (a component of Fabric) being used is taking SSIS packages and now running them in the cloud. That’s not the only way to use Data Factory - but as long as you don’t have to rewrite SSIS jobs I imagine many people won’t... but then you have the limitations of the tech that was already in place - with some benefit of now having a cloud runner. Power BI is also part of Fabric - and the reason I hear from others as to why they use it is ALWAYS “because it was cheaper” not better. Plus the cheaper was just in LICENSING cost (not counting other costs).
The bizarre part - Microsoft Excel is a tool people still truly love. I’m sure MS product teams would love to bottle that Excel lovin’ and spread to other MS data products - but IMO that hasn’t happened.
I read this and worry that my concerns are real: that the data world has been hijacked by people who would rather create governance frameworks and controls than find (or allow others to find) insights and understanding in data. I think this is in large part due to its promotion to ‘the new oil’ and the explosion in roles which are often siloed in the data stack. The increase in status leads to greater scrutiny, and the increase in silos increases the likelihood of errors and mistakes due to lack of insight…which leads again to greater scrutiny.
In governance specifically, I think there is a data lifecycle which starts with data being produced by a business process, when that process is entered into an application, and runs via ingestions and transformations into reporting, through influencing and into action/decisions. So far the regulatory governance approaches I’ve seen only inspect the problem of (a small amount of the ingestion,) transformations and models, but still manage to miss the two key ends of the process. I believe this is one reason why it’s boring - it seems like overhead that is added but doesn’t solve the key issue of knowledge gaps in understanding what processes, developer choices, and manager idiosyncrasies help data be productive and valuable.