Data is for dashboards

Animating an abstract world is hard, valuable, and worthy of celebration.

Dec 03, 2021

The county I grew up in is known for three things: Fred Durst, serving as the inspiration for Talladega Nights: The Ballad of Ricky Bobby,1 and being the home of several people who stole $17 million from Loomis Fargo in 1997.

The Loomis Fargo heist—which Hollywood also turned into an unflattering portrait of Gaston County2—was one of the largest and most absurd bank robberies in American history. One of the suspects, David Ghantt, was identified nearly immediately: He worked for Loomis Fargo, disappeared right after the money was stolen, and was caught on video loading "cubes of cash" into a van for over an hour.3 Needless to say, he got caught.

The other perpetrators would’ve been harder to find, but within three weeks of the robbery, they moved from a mobile home into a $600,000 house in a gated community twenty minutes from the scene of the crime. They also made a number of large purchases, in cash, of, among other things, a BMW Z3, a velvet Elvis painting, and a statue of a dog dressed like General Patton. One of the final straws came from a tip from a bank teller who alerted the FBI that a woman showed up with a suitcase full of $200,000 bundled in Loomis Fargo wrappers, and asked, "How much can I deposit without the bank reporting the transaction?"

As ridiculous as the story is, its outline is similar to numerous other cases of fraud: People try to present one story, and small inconsistencies (or half-million-dollar homes) put cracks in that reality. Elizabeth Holmes was a wildly successful and transformative CEO—until John Carreyrou became suspicious of her “comically vague” description of Theranos’ technology in a New Yorker profile. A few years later, Theranos collapsed in scandal. Robert Hanssen was a senior FBI agent—until a counterintelligence officer read a transcript of a conversation between an American mole and a KGB agent, and realized that the mole used the same General Patton quotes that Hanssen did.4 A few years later, Hanssen was convicted of being a Russian spy.

When we want to make people believe that something is real, details matter. Inconsistencies in those details, from offhand remarks halfway through a six-thousand-word profile in the New Yorker to a limo ride to a Western Steer buffet, do more than just raise questions about what seems out of place. They raise questions about everything.

"Data is for decisions"

Ask ten analysts a question, and you’ll get eleven opinions—unless your question is about dashboards. On dashboards, the consensus is universal: They’re bad. Dashboards are outdated, brittle, poorly built, and rarely used. Even our defenses of dashboards are lukewarm.

We all know how analysts and data scientists talk about dashboards: People who ask for them are analytical Philistines; businesses that show them off are a primitive company’s idea of a sophisticated company. Enlightened data teams shouldn’t be dashboard factories; they should help people make better decisions. Data is for understanding which market to expand into, what to charge for a new product, how to model stock trades that always make money, or where to put a new headquarters. And dashboards are a distraction from this much more important work.

I’ve said lots of things like this before, and, strictly speaking, I don’t think they’re wrong. But I’ve come to realize that this reflexive dismissal of dashboards—and more generally, the discounting of old-school “reporting”—undersells something that we need data for that’s even more foundational than helping us make decisions: Data and the dashboards that display it create a shared sense of reality.

Unlike you and me, companies don't exist in the physical world. When we walk down the street, we can see what's in front of us, we can feel the sidewalk underneath us, and we can hear the cars driving past us. If we have to decide if it’s safe to cross the road, the data that goes into that decision—the speed of the traffic, the width of the street, and so on—consists of measurements of that physical world, and is generally not controversial. If I choose to cross and you choose to wait, the difference is explained by our analysis of the situation, not because we disagree about the presence of a passing bus.

For companies, there is no physical street. There is no actual bus. These things only exist as abstractions. Data and metrics—revenues, retention rates, product usage patterns—don’t measure a company’s world; they are its world.

Take ARR, for instance. Though ARR is related to the amount of money in a physical bank account, ARR is a construct.5 The code that says that ARR is the sum of the amount field on the Salesforce contract object, that overage fees are excluded, that contracts from partners are included, and that ARR is recognized on the date a contract starts rather than when it closes isn’t measuring ARR; it is ARR. If that code changes, our understanding of ARR itself changes.

In other words, the Salesforce calculations or dbt models that encode an ARR metric are accounting identities: They are true by definition. The code is the concept, and the concept is the code.
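To make that concrete, here is a minimal sketch of the ARR rules described above, written in Python rather than Salesforce or dbt. The `Contract` class, its field names, and the sample contracts are all hypothetical, and the sketch ignores contract end dates for brevity. The second function changes only the recognition rule, from start date to close date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Contract:
    """Hypothetical stand-in for a Salesforce contract record."""
    amount: int                  # annual contract amount, in dollars
    start_date: date             # when the contract term begins
    close_date: date             # when the deal was signed
    is_overage_fee: bool = False
    from_partner: bool = False   # partner contracts are included either way


def arr_by_start_date(contracts, as_of):
    """Sum amounts, skip overage fees, keep partner contracts, and
    count a contract once its start date has arrived."""
    return sum(
        c.amount
        for c in contracts
        if not c.is_overage_fee and c.start_date <= as_of
    )


def arr_by_close_date(contracts, as_of):
    """Same rules, except a contract counts as soon as it closes."""
    return sum(
        c.amount
        for c in contracts
        if not c.is_overage_fee and c.close_date <= as_of
    )


# Made-up sample data, for illustration only.
contracts = [
    Contract(120_000, date(2021, 12, 1), date(2021, 11, 15)),
    Contract(60_000, date(2022, 1, 1), date(2021, 11, 20), from_partner=True),
    Contract(5_000, date(2021, 12, 1), date(2021, 11, 15), is_overage_fee=True),
]

as_of = date(2021, 12, 3)
arr_start = arr_by_start_date(contracts, as_of)
arr_close = arr_by_close_date(contracts, as_of)
print(arr_start, arr_close)
```

On the same three contracts, the two functions report different ARR on the same day. Nothing about the business changed, only the code that defines the number.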

In this way, companies don’t live in a physical world, but in a virtual one, like those in Pixar movies. The landscape around a company is artificial, and exists only as a rendered representation of the calculations that define it. And to make decisions in that world, you have to create it first.

For example, suppose a company is trying to decide which product to build next. They want to make a good choice; metaphorically, they want to safely cross the street. To do this, they’d first define the street by setting a goal, such as increasing product adoption. Next, they identify what represents the traffic, or the various factors that might be obstacles to their goal, like product NPS, customer churn rates, and measures of development cost. Finally, in order to assess their path to the other side of the road, they then have to render this world by turning data into metrics. But like Toy Story, none of this is real; it’s an interpretation of numbers and a representation of code. If we define market share differently, we change the road under our feet. If we compute churn with a different formula, we could conjure an oncoming bus out of thin air.

If this happens, is there really a bus? We can’t actually say, no more than we can say what the true color of a car is in a Pixar movie. If it’s white in half the scenes and black in the other half, all we can do is reconcile the differences in the code that defines it. But until that happens, neither is more true than the other.

You could make the argument that the true color of the car is what’s in the script—or, analogously, the true value of a metric is the GAAP-approved definition of it. But what do we do when the script doesn’t specify the color, and it’s up to the animator writing the code to choose? Most metrics work this way too. Like a car, “daily active users,” for example, has a generally understood shape, but the details are often left to those who actually create it.
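As a rough illustration of how much those details can matter, the sketch below counts “daily active users” two ways over the same tiny event log. Everything here is an assumption made for the example: the event names, the users, the choice of which events count as “active,” and the fixed UTC-5 offset standing in for a reporting time zone.

```python
from datetime import date, datetime, timedelta, timezone

# Hypothetical event log: (user_id, event_type, timestamp in UTC).
EVENTS = [
    ("ana", "login", datetime(2021, 12, 3, 1, 30, tzinfo=timezone.utc)),
    ("ben", "page_view", datetime(2021, 12, 3, 14, 0, tzinfo=timezone.utc)),
    ("cam", "email_open", datetime(2021, 12, 3, 9, 0, tzinfo=timezone.utc)),
    ("dee", "login", datetime(2021, 12, 4, 2, 0, tzinfo=timezone.utc)),
]


def daily_active_users(events, day, active_events, tz=timezone.utc):
    """Count distinct users with a qualifying event on `day`,
    where the day boundary is evaluated in the time zone `tz`."""
    return len({
        user
        for user, kind, ts in events
        if kind in active_events and ts.astimezone(tz).date() == day
    })


day = date(2021, 12, 3)

# Definition A: any event counts, and days are cut in UTC.
dau_a = daily_active_users(
    EVENTS, day, {"login", "page_view", "email_open"}
)

# Definition B: only in-product events count, and days are cut
# at a fixed UTC-5 offset (an assumed reporting time zone).
reporting_tz = timezone(timedelta(hours=-5))
dau_b = daily_active_users(
    EVENTS, day, {"login", "page_view"}, tz=reporting_tz
)

print(dau_a, dau_b)
```

Both definitions are defensible readings of “daily active users,” and they disagree on the count and on which users are in it. Neither is wrong until someone decides which one the company uses.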

As data professionals, this is our first job—to be reliable animators for our companies.

The corporate metaverse

For better or for worse, dashboards are the best tool we have for this. Every dashboard is a window into a Pixar scene, and every metric is a rendering of it. Strategic analysis, for all of its necessary benefits, is more like a magnifying glass: It’s great for uncovering detail and texture, but to get our bearings—to safely cross the road—we also need a wider lens.

A broader view alone, however, isn’t enough. That view also needs to be consistent, across people and time. If a bus is flickering in and out of existence in the distance, or I see a bus and you don’t, nobody will be confident enough to step into the street.

This is why questions like “Why doesn't this dashboard match what I see in Google Analytics?" are both irritating and pernicious.6 Looking at two dashboards that don’t match is like looking out two adjacent windows and not seeing the same thing. Even small or seemingly insignificant discrepancies—one dashboard says 107,102 people visited the homepage this month, and another says 106,988; the car down the street looks white outside of one window and black outside of another—do more than make us suspicious of a small, out-of-place detail. They’re glitches in the Matrix, or hosts off their loops: They make us question the nature of our reality. And for companies, with no physical world to fall back on, the reality that data teams create is the only one they have.

In this light, defining metrics and creating dashboards isn’t banal busywork that gets in the way of “the more rewarding aspects” of analysts’ jobs. It’s one of the most important things that we do. And as an industry, rather than casting dashboarding aside as a necessary evil, we should recognize and celebrate its importance, just as we do for strategic analysis.

The good news is we already have a model for how to do this: analytics engineering.

At its core, analytics engineering should be a mundane role. Analytics engineers spend most of their time tediously maintaining data and managing code, filling an unglamorous gap between the work engineers don’t want to do and the work analysts do want to do. Just as we belittle building dashboards, analytics engineering could easily be tarred as a dull prerequisite to a more interesting job.

And yet, it’s gone in the opposite direction. Rather than running away from analytics engineering, people are eagerly signing up for it.

Why? The attraction, I think, comes from how much the community values the work. Instead of toiling away in the data stack’s lonely salt mines, analytics engineers—largely through the dbt community—can congregate, celebrate, and commiserate together. They can proudly share their accomplishments, and joke about their frustrations. By creating a home and identity for analytics engineers, dbt and its community leaders created a career for analytics engineers.

As data scientists and analysts keen on breaking away from the BI developer roles of yesteryear, we take a lot of shots at dashboards. We implore people to make their jobs about something more. We disparage building dashboards as being beneath us.

It’s not. Building consistent dashboards is hard.7 Creating a reliable rendering of a company’s world is enormously valuable. The only thing it’s not is cool.

We should make it cool. We should embrace our responsibility to create dashboards just as we’ve embraced analytics engineering. We should praise those who are good at it, give those people credit for the impact they have, and encourage them to take pride in doing it well. Because bad dashboards, just like the bad data that analytics engineers valiantly fix, break stuff.

1. I played soccer and Little League at the elementary school where this scene was shot.

2. Which itself was turned into the most incredible engagement photos I’ve ever seen.

3. To hide the robbery, Ghantt removed the tapes from two security cameras. Unfortunately for him, he left the tapes in...sixteen other cameras.

4. If you commit a crime, avoid anything to do with General Patton, apparently.

5. And yes, I get that money is all code now, and it too is just an abstract concept on a Wells Fargo-leased server in an AWS data center somewhere. But whatever, if you want to talk about that, I’m sure there’s some crypto Telegram channel full of bored ape avatars that’ll gladly tell you that U.S. dollars are just a construct too.

6. Substack, what’s up?

7. To create a consistent reality with dashboards, you have to do a lot more than skim a couple Tufte books. You have to be disciplined: The more dashboards and metrics we build, the more opportunities we have to create inconsistent views. You have to be stern: If people ask for a new metric, even as a one-off request to “pull the numbers,” we are better off directing them to one that already exists. You have to be organized: Good metrics are built on smartly architected data models in dbt (and hopefully, clean definitions in soon-to-be metrics layers). And you have to be communicative: Some apparent inconsistencies—like differences in revenue, bookings, and billings—aren’t glitches, but feel like they are. We have to keep people confident in what they’re seeing, despite these quantitative illusions.

melee_warhead

Dec 3, 2021

Agree with everything you stated.

I just also still wish that many dashboards were less focused on "low-code" or "no-code" and embraced code, so that way we would have trackable systems for version control with auditable business logic.

Hopefully with the rise of an "Analytics Engineer" profession, this can hopefully push for "production" dashboards that can provide more complete tracking throughout the process, and easier customization. (I can copy and paste code easily, but Tableau has a lot of "weird" tricks)
