Chasing ghosts

How do you get better at something you can’t see?

At the center of Facebook’s widening gyre, amid investigative reports, 60 Minutes segments, Congressional hearings, sinking stock prices, calls to break Facebook apart, calls to protect Facebook’s valiant and compassionate efforts to build a more connected and communal world, defiant takes from technologists who insist that Facebook is no different than TV or Teen Vogue, and, of course, server cages breached with angle grinders, are a few dozen slides of charts and bullets that, if not global in their scale and cataclysmic in their implications, are no different than the mundane decks many of us routinely pass around in our own jobs. For those of us who want to measure ourselves by the impact of our analyses, the new bar has been set at lopping $50 billion off a company’s market cap in 48 hours.

The spectacle recycles a series of important issues that periodically bubble over in Silicon Valley, most notably about the role of social media in our society and precisely how cataclysmic it is or isn’t. There are plenty of further conversations to be had on these topics—and I’ll leave them to brighter minds than mine.

For me, a much smaller sideshow caught my eye among the circus’ main acts: The rare exposure of internal research, and the debate about how we interpret it.

The call from inside the house

In late 2019, a research team inside of Facebook conducted a study to assess how Instagram affects its users’ mental health. The findings, outlined in two decks that were leaked to the Wall Street Journal, present a nuanced set of results, including several alarming (though not exactly surprising) concerns about Instagram’s corrosive effect on teenage girls’ body image. The Journal’s story emphasized these conclusions; Facebook responded by throwing everyone under the bus, including its own researchers (who, for now, remain anonymous). According to Facebook’s annotations of its own work, the original research used inappropriate and myopic language; its visualizations were misleading; it didn’t acknowledge when results were and weren’t statistically significant; it implied that relationships were causal when they weren’t; and it was based on biased research methodologies.

And thus, we arrived at a rather unusual destination: a public debate about the content of the research itself. At least on the surface, the leak spilled what are typically internal disagreements out into the open.1 The conversation wasn’t about the makeup of the research team or its mandate; it was about its actual work. 

This highlights a hole in the center of many of today’s public discussions about analytics. In the middle of the analytical field, in the eye of our swirling hurricane about titles and team structures and tooling, is analysis itself. It’s the data, the charts, the narratives, and the recommendations that contain the conclusions we uncover. It is, in many cases, the culmination of everything else a data team does—the point of all the other parts of the job we so frequently talk about.

We struggle, though, to look at it directly. The recent conversations about the analytical craft glance in its direction, talking about the skills we need, how we develop intuition, and how we can credential our skills. But the analysis we produce is still only in our peripheral vision, a black hole at the center: unknowable on its own, only detectable by its effects.

The common—and very reasonable—explanation for this is that most analysis, like most Facebook research, is proprietary. To look inside an analytical report is to peer into a company’s soul. Most companies prefer to keep that private.

That, however, makes our job as analysts a lot more difficult, especially for those just entering the field. To extend Randy Au’s woodworking analogy, the veil around analytical work forces us to talk about the saws we prefer, the types of wood we like, and the paths we can take through our apprentice program without talking about the actual chairs we build.

And worse still, analysis isn’t a chair.2 Even if your woodworking classes can only gesture at chairs, you can at least easily judge the ones you build. You can see how a chair looks, or sit in it and decide if it’s comfortable. Can we do the same for analysis? If we can’t see others’ work, can we at least say what makes it good?

Good analysis is...

The challenge isn’t to find an artificially empirical way to measure analytical work. But just because we can’t score it precisely doesn’t mean we don’t score it at all. When we read some bit of work (a report a team member puts together; the Facebook slides; Dataclysm, the book inspired by the OkCupid blog), we make some determination about its quality. How? How do we complete the sentence, “this analysis is good because…”?

“This analysis is good because it uncovers the truth.” This misrepresents the problems many analysts are asked to solve. Though we’ve borrowed the term “science,” we don’t deal in absolute truths; we aren’t physical scientists trying to document the laws of nature. As the Facebook research shows, the truth is nuanced. Nobody disputes that Instagram affects teens; the question is how much and how meaningfully. Even seemingly simple questions like “what is our sales attainment rate?” are littered with subjectivity, with no ground truth underneath.

“This analysis is good because it leads to good outcomes.” This only holds up when decisions are repeated thousands of times—for example, when you’re deciding to send a customer a promotion or to hit on a blackjack hand. Most business decisions, however, aren’t repeated like this, which complicates the usefulness of outcomes as a scoring mechanism. Just because we lose a single blackjack hand doesn’t mean we made the “wrong” call. But, conversely, it’s cold comfort for Seahawks fans to know that, across hundreds of different universes, passing was the “right” decision.3 In neither case does the outcome of the decision feel like an appropriate jury.

“This analysis is good because it’s persuasive.” I’m sympathetic to this view, and I’m not alone. If our job is to influence decisions, our analysis only matters—and is only good—if it accomplishes that goal. But once again, the Facebook research shows the limits of using this as a ruler. On one hand, according to Facebook’s annotations, the research was too bold. It was careless in its conclusions and oversold a story that wasn’t really there. It needed more rigor, more nuance, and more measured language. By trying to be persuasive, the work undercut itself. On the other hand, the research was too muted. Despite trying to sound the alarm about Instagram, the report was mostly ignored. At best, Facebook responded to the concerns by commissioning more endless research.4 By this measure, the original report wasn’t persuasive enough. And more generally, in a world in which Joe Rogan can convince millions of people to forgo their vaccines, we probably shouldn’t say that persuasive analysis is good analysis. Smart people with opinions can make their opinions look smart.

“This analysis is good because the experts say it’s good.” This is riddled with problems. It’s circular; experts are presumably the people who do good analysis, begging the question of what good analysis is. It’s insular, and scores analysis by how much it conforms to the views of those who are already in positions of power. And it’s incomplete, as it provides no direction when, in the Facebook example, the experts disagree. 

Perhaps there are other methods. Good analysis uncovers something unexpected, but unexpected results can also be wildly inaccurate. Good analysis is hard to poke holes in, but that implies that quality varies depending on who’s reacting to it.

It feels to me, then, that our actual answer is an analytical classic: It depends. There is no single axis on which we can measure our work; a combination of factors determines its worth. And ultimately, even that rough rubric can be overridden if we, like a book critic taken by a compelling novel, feel strongly enough.

In some cases, we’re swayed by our own priors about the problem. In response to the Instagram leak, Mike Solana, a VC and self-proclaimed free thinker if there ever was one, predictably sided with Facebook. Alexandria Ocasio-Cortez and Elizabeth Warren read the same research and came to the opposite conclusion.

In other cases, we develop an affinity for the analysis itself. OkTrends, the former OkCupid blog written by Christian Rudder and the precursor to Dataclysm, is revered in analytical circles. Yet, it’s tragically biased, based on the preferences of a particular subset of a particular generation who used a particular dating app at a particular point in time. But it’s nonetheless part of our canon, because, I think, it’s two things that most analysis isn’t: entertaining and accessible.

The good news is that this points to a way forward. OkTrends is heralded because it’s considered to be among the best of what’s available to judge. We need a wider library to choose from, especially if analysis, like so many other crafts, is inherently subjective. While few of us are as entertaining as Rudder, plenty of people may be better analysts. But unless we see their work—unless we figure out how to talk about what’s currently hidden behind NDAs instead of talking about teams and languages and tools and tools and tools and tools and tools—we’ll never know how high the bar could be set. 

1

Obviously, in this case, Facebook’s claims about the research are in part a pretense for discounting the results in the report. While there’s nothing inherently sinister about disputing research findings (a company as large as Facebook produces an enormous number of internal documents, and plenty will be flawed or incomplete), these reports clearly aren’t amateurs’ sloppy drafts. They are, by all appearances, the best efforts of world-class (Facebook’s own term) experts. And not the kind of experts that tech stans put air quotes around, and smear as stiff bureaucratic suits who stifle innovation and cower from creative destruction; no, these are the experts who left academic jobs for the disruption, the Facebook employees who want to move fast, stable infra be damned. Unless, of course, the difference between an expert and an “expert” is whether or not you agree with their opinion.

2

Facebook, of course, is. (If Facebook wanted to discredit the work of their research team, they should’ve just pointed to this ad as proof that highly qualified people sometimes produce, uh, questionable work.)

3

It’s tempting to define good analysis as leading to the right outcomes more often than not. But in cases when decisions are only made once, this becomes tautological. The analysis can’t be wrong if it’s also the instrument that tells us what would’ve happened more often than not. 

4

Zuckerberg’s response to the story talks a lot about the teams Facebook funded and the research it supports; notably absent is any mention of anything that Instagram actually did in response to that research.