The end of our purple era
We love to talk about the future of our tools. What about the future of our jobs?
I want it to be, like, messy.
“brutal” / SOUR / Olivia Rodrigo
—
One of the defining articles of the data industry's last half decade was Anna Filippova's piece declaring that data people are purple people, who act as the critical translators between the “red” business teams and “blue” engineers:1
The business people, the actuaries, know what data they need and can define requirements, but typically don’t have the skill set to design a data architecture that gives them the data they need. Technology people typically don’t understand the business requirements, but they can design the data architectures. It’s like the people in IT speak blue, the people in business speak red, but we need people who speak purple in order to create an appropriate solution.
This, Anna argues, is the role of analytics engineers—to “build bridges between disparate entities across the business, connecting technology and people in new and productive ways.”
Her post, published in mid-2021 as the fervor around the modern data stack was arcing steeply upwards, spoke for the generation of data practitioners who came of age in the last several years. Engineers did the bare-metal work of collecting data and keeping things like Kubernetes running. Marketers and operations teams and executives brought us actual business problems. And we, the purple people, turned the former into solutions and insights for the latter. Purple became the unofficial color of a movement—of dbt, of the modern data stack, and of the jobs that analysts, analytics engineers, and business analysts had.
It was messy, creative work, we said, but that was the point. We were at our best when we weren't just building dashboards and mechanically tracking metrics; we were at our best when we were given vague problems, well-sourced data, and the time and tools to go exploring.
The bed we made
I got the things I wanted, it's just not what I imagined.
“making the bed” / GUTS / Olivia Rodrigo
—
A couple months ago, I attended an online conference on corporate accounting. During one of the sessions, the moderator asked a panel how they felt about the current state of accounting. The panelists all agreed that accounting was both important and, unfortunately, doomed to fail inside of most businesses. Most accounting initiatives stall before they deliver anything useful, they said, and today's cost-conscious executives don't have the patience for that.
Ah, hahaha, no, of course not, this is a lie. Nobody's ever said that about accounting.2 When a company needs accounting, they hire accountants who do accounting. Some teams are probably better at it than others, but so long as your books aren't managed by an incompetent and criminal polycule running a Ponzi scheme, most companies' accounting initiatives don't "fail."
The conference was, obviously, about data teams. And the panelists' point—that data projects are prone to failure, and don’t often deliver meaningful insights—wasn’t some bold new take; to the contrary, their point was that, despite all of the ostensible progress in the data industry over the last decade, failure was—still, just as it was in 2017—more common than success.
Six years ago, we blamed our problems on “culture challenges,” and companies’ unwillingness to accept new ways of doing things. There were also plausible stories about the modern data analyst being a new role, and our tools being too immature. Our jobs were hard, we said, because the context in which we had to do them made them hard.
In the intervening years, we’ve gotten all the things we wanted. There is little debate anymore about the importance of data: The world’s biggest corporations are plowing billions into data initiatives; more recently, executives have become obsessed with AI, which, while not exactly synonymous with data, is awfully close. We’ve been inundated with thousands of tools designed for every conceivable problem we might have.3 Though it’s possible that we’re one tool away from finally getting good at this, I’m not betting on it.4 And our roles—led by posts like Anna’s—are becoming clearer and more broadly accepted.
And so, naturally, we’ve come up with new obstacles that are getting in our way. Today, it’s talent. According to a Gartner survey, “less than half of data and analytics leaders (44%) reported that their team is effective in providing value to their organization,” and “the lack of available talent has quickly become a top impediment” to success. Other surveys agree: “60% of data leaders were finding it hard to recruit individuals with the necessary skills.”
Ok, look. It’s 2023. If our issue is a skill shortage, it’s not an issue. It’s an excuse.
As early as 2019, Vicki Boykis said she was overrun with candidates: “I’ve developed an intuition that the number of candidates per any given data science position, particularly at the entry level, has grown from 20 or so per slot, to 100 or more. I was talking to a friend recently who had to go through 500 resumes for a single opening.” Since then, vaguely scammy data science and business analytics masters programs have been cranking out tens of thousands of new analysts, who are often heavily-indebted and desperate to work.5 And over the last eighteen months, hundreds of thousands of tech workers have been laid off. There are droves of capable, smart, driven, and prepped candidates in the market looking for jobs. We wanted data to be popular; we wanted it to be attractive; we got it, did it, it’s done.
But for a job to be mainstream, it has to be able to hire mainstream talent. So if we still think our problem is that the talent we have can’t do the job we want them to do, the problem isn’t them; the problem is the job.
We have no problem recognizing that our tools need to mature, and speculating about their eventual end state. What happens if we extend this tradition to our jobs, and ask what they’ll look like when they grow up?6
Die a hero or live long enough to become an accountant
When am I gonna stop being great for my age and just start being good?
When will it stop being cool to be quietly misunderstood?
“teenage dream” / GUTS / Olivia Rodrigo
—
Here's the thing about growing up: It usually makes us boring. We mellow; we become our parents;7 we become Alteryx.
And in our case, we become accountants.
If a singular foundational belief motivates data teams today, it may well be this one: There’s gold in them thar hills. Data is full of insights, and our job is to get them out—either through careful analysis, through enabling others to explore it, or through automated systems that push those insights into operational systems that immediately take action on them. Much of what we do is built as a means to those ends.
This belief has a bunch of problems, though. First, there may not actually be gold in a lot of those thar hills. Most of us may just have “moderately valuable datasets that can inspire moderate business improvements.” Second, finding meaning in data is very hard, as suggested by our continued insistence that most job candidates who are trained to do it actually can’t. Third, even if we do find something interesting, it’s hard and expensive to make it useful:
While the research work needed to arrive at the insight might have been a lot of work, it’s just the start. That work is vastly eclipsed by what’s needed to turn that seed of an idea into something useful and real. Think of the product vision, infrastructure, engineering, design, and marketing resources needed…How many hours of work does that equal?
Finally, by chasing bespoke insights, we build bespoke systems. We design clever metrics that perfectly map to our businesses, measure our performance in ways that are meant to handle the nuances of how we do things differently from everyone else, and tell ourselves that this—that “it depends”—is the right way to do things.
Perhaps it is not. Over the last few weeks, a slow soap opera has been playing out on Twitter and in financial beat reporting on how much Instacart is spending on Snowflake and Databricks. At its core, the debate is about metrics, and how exactly Snowflake calculates revenue retention. It's an interesting detail, if you're into these sorts of things, but it’s not a conversation that any of the people involved seem to be excited to be having.
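To see how easily this happens, here is a minimal sketch, with entirely made-up numbers and simplified definitions (neither is Snowflake's actual methodology), of how two reasonable ways of computing net revenue retention can diverge on the same customers:

```python
# Two plausible net-revenue-retention (NRR) definitions, applied to the
# same made-up cohort. (Hypothetical numbers; not Snowflake's methodology.)

last_year = {"a": 100, "b": 50, "c": 80}          # revenue a year ago
this_year = {"a": 130, "b": 20, "c": 0, "d": 60}  # "c" churned; "d" is new

# Definition 1: keep churned customers in the cohort, counted at zero.
cohort = last_year.keys()
nrr_with_churn = sum(this_year.get(c, 0) for c in cohort) / sum(last_year.values())

# Definition 2: only count customers who are still active today.
active = [c for c in cohort if this_year.get(c, 0) > 0]
nrr_actives_only = sum(this_year[c] for c in active) / sum(last_year[c] for c in active)

print(f"NRR, churn included: {nrr_with_churn:.0%}")    # 65%
print(f"NRR, actives only:   {nrr_actives_only:.0%}")  # 100%
```

Same customers, same revenue, and one team reports 65 percent while the other reports 100 percent. Both can defend their number in the board meeting.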
Though this is an uncommonly public example, this sort of dispute happens all the time—in other S-1s, in boardrooms, and in weekly business meetings. We do things our way; someone else does it their way; people come to meetings with different numbers; chaos ensues; people lose trust in the entire discipline.
Our typical solution to this problem is a mix of technical solutions (data contracts and semantic layers!), cultural changes (peer review and documentation!), and education (don’t abuse your SaaS metrics!). These things are great—data contracts could work; most of my work is a disaster and definitely needs an editor; every Dave Kellogg talk is a talk worth watching—but I’m not sure they’re forceful enough. To create true institutional trust around the work that data teams do, we might need to do what accountants do: Give our work some rules.
Levers Labs, led by Abhi Sivasailam, is working on exactly this problem. They’ve developed a set of Standard Operating Metrics and Analytics, called SOMA, that is meant to provide universal metrics to measure companies’ operational performance, much in the same way that GAAP standardizes how we measure companies’ financial performance. It’s an ambitious project—not only does it contain hundreds of metrics for B2B SaaS businesses alone, it also includes methods for modeling source data, semantic layer configurations, and templated dashboards and analyses. To me, these adornments aren’t strictly necessary, and I hope they don’t distract from what increasingly seems necessary: A set of understood rules—not best practices, but expected standards—about how to do this job.
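I haven't seen SOMA's actual schemas, so everything in the sketch below is hypothetical (the names, the fields, the rulings), but it illustrates the spirit of a metric as an expected standard rather than a local convention:

```python
# A hypothetical standard-metric spec, sketched as a Python dataclass.
# Nothing here comes from SOMA; it only illustrates the idea that a
# metric's definition, grain, and edge cases are fixed by the standard,
# not renegotiated at every company.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class StandardMetric:
    name: str            # the canonical, non-negotiable name
    definition: str      # the one blessed formula, in prose
    grain: str           # the level it must be computed at
    edge_cases: dict = field(default_factory=dict)  # rulings, not options

NET_REVENUE_RETENTION = StandardMetric(
    name="net_revenue_retention",
    definition="current ARR of last year's cohort / that cohort's ARR a year ago",
    grain="customer, trailing twelve months",
    edge_cases={
        "churned customers": "count at zero; never drop them from the cohort",
        "mid-year upgrades": "use ARR at period end, not peak ARR",
    },
)
```

The format is beside the point; what matters is that the edge cases are rulings, not choices.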
If we never have that, I think we’ll be lost for a long time. People will have to learn every job on the job; trust in data will have to be built project by project, individual by individual. But with something like SOMA, we can lean on our collective wisdom.
The end game
You've been callin' my bluff on all my usual tricks
So here's the truth from my red lips
“End Game” / reputation / Taylor Swift
—
As FTX and Sam Bankman-Fried were imploding last fall, David Roberts explained where he thought effective altruism—a rationalist philosophical and philanthropic movement that was closely associated with SBF, and imploding along with him—went wrong:
It follows that the bigger & more complex the systems you're reasoning about, and the farther out into the future your reasoning extends, the more likely you are to be wrong, & not just wrong, but wrong in ways that flatter your priors & identity. I always feel like this fundamental fact gets underplayed in discussions of [effective altruism] or various other "rationalist" communities. The tendency to bullshit oneself is basically ... undefeated. It gets everyone eventually, even the most self-disciplined of thinkers.
If we humans overcome this at all, it is not through individuals Reasoning Harder or learning lists of common logical fallacies or whatever. If we achieve reason at all (which is rarely), we do so *socially*, together, as communities of inquiry. We grope toward reason & truth together, knowing that no individual is free of various epistemic weaknesses, but perhaps together, reviewing one another's work, pressing & challenging one another, adhering to shared epistemic standards, we can stumble a little closer.
That's what science is, insofar as it works -- not some isolated genius thinking really hard, but a *structured community of inquiry* that collectively zigs & zags its way in the right direction. Any one of us will almost certainly succumb to self-BSing. Together? Sometimes not.
…
In other words, thanks to our epistemic limitations, a "dumb" heuristic that just says "when in doubt, be decent" will probably generate more long-term utility than a bunch of fancy math-like expected-value calculations. We want *resilient* ethics, not *optimized* ethics.
If you replace "ethics" with "metrics," Roberts' point loses its moral imperative, but it becomes relevant to an indulgent blog post about the role of white-collar data professionals. We make progress as a discipline not by reasoning our way through our problems individually—as tempting as that may be, given our rationalist proclivities for thinking from “first principles,” or whatever—but through social inquiry, in which we can all stumble forward together. And resilient standards are better than theoretically perfect ones.
The potential of SOMA isn't in its precise framework of models and expressions. SOMA’s promise is in encouraging—compelling—us to embrace a mindset of reuse over reinvention, and of using a shared standard as an ending point to fit our businesses to, rather than a starting point to adapt from.
One of the first posts on this blog asked why nobody’s been able to build a successful library of analytical templates for common metrics and dashboards.8 My answer was that the similarities between businesses were a mirage: “From a distance, the differences between a 5 iron and a 9 iron are minuscule—just a few degrees of club head tilt and a couple inches of length. But those differences are the very things that define each club as that club.” Companies, I said, are the same—to ignore their differences is to ignore their most important features.
I now think that that mindset is the one we need to change. It’s this refusal to fit ourselves to an imperfect standard that keeps us from moving forward, and sows seeds of distrust in what we do. We’d go further as an industry if we accepted a standard and demanded that our businesses measure themselves against it. Counterintuitively, we’d go further if we didn’t think of ourselves as intrepid explorers searching for novelty, but embraced our role as accountants—reliable, pedantic, and, actually, less boring than us.
If consolidation is the end game for our ecosystem, that feels like the end game for us. We should borrow and steal from those who blazed the trail ahead of us, become more like other business people, the actuaries, and rebrand ourselves from purple to red.
Picking losers
If you start a company that gets big, you will get rich. If you start a company that gets very big, you will get very rich and they might make a movie about you.9 Because of these movies and people like Bill Gates and Elon Musk and Jeff Bezos, most people think that founding a company is the best way to get rich in Silicon Valley.
It is not. The best way to get rich is to become an executive at a big company that's on its way to becoming a very big company. These companies will pay their executives handsome salaries and grant them large stock packages. Though these executives will not make as much as a founder, they take on much less risk and do not have to work as long before their equity is liquid. And if you are an executive at a company that goes public or is bought in a big acquisition, you will have a gold star on your resume that will make other companies want to hire you for even more money.
This is a well-understood phenomenon in Silicon Valley, and people will try to exploit it. If you ladder together a few consecutive jumps—pick a winner, collect the payout from the IPO or acquisition, and, because companies will often hire you at a level above your current role, market that “success” into a bigger job at another winner—you can make a lot of money very quickly.
It’s not a perfect plan though. Picking winners is hard. And employers also don’t like hiring people who change jobs a lot. If you do this too much, the short stints at successful companies will be black marks, not gold stars.
A new company called Prospect is trying to solve the first problem. It “uses the same data VCs use to give you an independent third party projection of what your equity is likely to be worth,” and helps job seekers find companies worth betting on. As someone who owes their career to a lucky break like this, I think it’s a useful service. But for the shrewd mercenaries who want to compound these jumps together, Prospect doesn’t solve the second problem—employers still don’t like job hoppers.
But there is another option. I was recently talking to a friend who took consecutive jobs at two startups that both imploded shortly after she joined. In both cases, they fired her and paid her severance.
Perhaps, then, looking for winners is a fool’s game. They’re hard to find, and their jobs are competitive. What I want is a service for finding companies that are teetering over the edge. Get hired easily at an inflated title because they are desperate; work stress free because, who cares, deck chairs and Titanics and all that; get laid off; collect severance; use my inflated title to get a bigger job at another time bomb. I may not make as much as I would if I picked winners, but I also would barely have to work, and nobody would question why I left Theranos after a six-month layover.10 Plus, if you pick a disaster that’s big enough, you can become a whistleblower and they will still make a movie about you.
This quote is from Anna, who borrows it from Deloitte’s Thomas Davenport, who heard it from a data engineer named Jim Wilson, who was recounting something that his boss Kimberly Holmes told him. Someone please keep this game of telephone going.
Or maybe they have? I have no idea; I’ve never been to a conference on corporate accounting.
Tired: Buy or Build?
Wired: Buy, Build, or Become a design partner for a YC data company that’s trying to find product-market fit and bend their roadmap around your exact need so that you get a bespoke solution for free?
Sure, and it’s also possible that I’m the right tennis racket away from making the U.S. Open. It’s just not that one. Or that one. Or that one. (Also, I’m a casual tennis fan at best, but I’ve probably watched that Alcaraz-Zverev point fifty times. Those last two Alcaraz shots—the pitch of his yell, the crowd gasping on the first shot and starting to cheer before the second because they knew what was coming, Zverev reacting in almost the exact same way, and the realization after, frozen on the face of a man in the crowd with the white shirt and gray hair, that in every other Alcaraz shot, he’s been playing at eighty percent, being conservative, doing only what he needs to do but not what he can do—top ten low-key sports highlight I’ve ever seen.)
Professional master’s and “certificate” programs, like those in business analytics and data science, are notorious for being exploitative cash cows.
Erik Bernhardsson asked a similar question a couple years ago.
Admittedly, anytime something goes horribly, comically wrong inside of a company like Theranos, FTX, or Twitter, part of me wishes that I worked there, for the stories.
"Most of us may just have 'moderately valuable datasets that can inspire moderate business improvements.'"
<<nodding head>>
Once upon a time in a prior century I taught classes called "Strategic Business Analysis" and a modeling class on a combination of ERD, Function Hierarchy Diagram, and CRUD matrices using Oracle Designer. I taught these as "requirements gathering and understanding the business," not designing a database. This worked so well that Oracle Designer could generate 95%+ of a finished application, i.e. all of the grunt work.
ERD in particular was taught as a "thought discipline," with heavy focus on the meaningful wording of entities, attributes, and particularly relationships.
As part of the offerings, I would hold an afternoon or evening session with the business and stakeholders for a group reading of their ERDs, with heavy emphasis on speaking aloud the MAY BE and MUST BE of the relationships (à la Barker et alia).
That was what we called a "Business Analyst"; they may or may not have been database experts.
I got in hot water when working for Oracle on a very early clinical study application. I trained the nurses to read the ERDs, and they provided feedback to the Oracle consultants, who were not happy to have their work questioned.
I don't think such a position is very common these days.