Could we really consider all of these consumption modes described here as different data IDEs? They are really just different interfaces to the data, and different tools give you slightly different angles. But where the convergence will actually take place is at the next step: how it's delivered to its audience.
That seems like the place where no one is even fighting yet, because we just assume it's going to end up in Google Docs or Slides or Slack or Confluence or some other static place. And I think it's much bigger than just data, it's about the forum where organizations communicate and strategize.
I think that makes sense in theory, but the products we’ve all built make it unlikely. I see it like this:
Consumption tools as IDEs would be, I think, analogous to code editors. Engineers choose whichever one they like, but they all just ship code to GitHub. If you use VSCode and I use Atom, that’s fine. If you squint, we could say data work is the same.
Buuut, data tools do a lot more than ship code. They’re really complicated - they have visualization engines, runtime environments, dashboard features, and on and on. There’s not an easy thing to “ship” out of Looker or Tableau or Mode or whatever else that’s not more or less the entire product.
In practice, I think data work is more akin to Google Docs vs Notion vs Office vs whatever else. Each product has parallels with the others, but organizations ultimately have to choose which one to use. If I use Google Docs and you use Notion, there’s not really a GitHub-type thing that centralizes that. The same, I think, is true for dashboards, reports, and so on.
Yeah, that's true. The tools we've built are both the development environment and the delivery mechanism. Perhaps catalogs could make the argument that they could be used as "consolidators" of different tools, but I haven't seen that happen well in practice personally.
Yeah, that seems to be the ambition of that category, though it feels really tough to pull off.
You could also imagine a different timeline in which all data dev tools consolidated around something like Jupyter, where you can just ship code. But it seems like that (complete, with a front) ship has sailed.
Thanks, as always, for the great article. It definitely resonates. Re the comment on "Helping people use data effectively is the hard problem", I think that in many cases you can abstract it one additional step: people don't know what questions to ask. Why is this relevant? Because when people do see data, they try to rationalize it based on their modus operandi (e.g. one healthcare provider we work with said that they almost never change patient orders, but once they were confronted with the data, the line changed to "oh, those are meaningless or common changes"). Don't rock the KPIs or, to paraphrase another famous TLA, NIH (Not Ingested Here)...
Thanks! And that’s reasonable, though I’d rephrase “people don’t know what to ask” a bit. I think it’s more that they don’t know what to ask in the same language that analysts speak. In my experience, they very much understand the domain and the problem they want solved; the gap is they don’t know how to describe it in the way a data person would.
Agreed. “People don’t know what to ask” is a bit too polarized. Part of it is indeed that they don’t speak in the same terms, but part is also not having the context to ask the questions (e.g. pre-iPhone, the consensus was that the Internet wasn’t really for mobile, and that was also the result of the WAP experience, amongst other things).
Nicely put, as always
Thank you for another great article! With so many categories flourishing, perhaps the future toolkit will just be an assorted mess?
Love your kitchen analogy. I grew up in a Chinese household, where the kitchen was small and primarily consisted of chopsticks, woks, and 5 sauces. But now, the kitchen has to handle Chinese cooking, as well as Italian pasta, French pastries, and English toast. So the strainer, oven, and toaster are here to stay. But in this open world, everyone consumes at least a few cuisines as well as some exploratory fusion, and the diversity of tools in the kitchen will keep growing. No two kitchens look the same.
That means the market will keep expanding (great!). But it also implies that no cook will be good at using all the tools (challenging...).
That seems possible; it just seems hard to imagine everything staying this chaotic for that long. Your point is also something I hadn’t thought about - people have to learn these tools too. Making it easy for users to understand how stuff works also puts pressure on the industry to develop some sense of consistency.
Agree with your point that there will be consolidation around a few essentials. Most of the consumption tools are too nichey to establish sustaining categories. The market will shake them out through natural selection.
But it seems that, like entropy, the number of consumption categories only increases over time.
> we’ll never have a standard API into people’s heads.
This tickled neurons from an article I read a few months back https://unchartedterritories.tomaspueyo.com/p/the-tree-of-knowledge
Added to my Jillian-suggested reading list with this article https://futureofcoding.org/notes/alan-kay-lunch.html
Great post, Benn. I'm in the camp that you need to support the tools that users already know and love. In my experience, I haven't been successful in forcing users to change their habits. We want as many people using data to make decisions as possible and that means supporting as many tools as possible. This means supporting a variety of inbound query languages and protocols (SQL, Python, XMLA, REST) into a common semantic (or metric) layer. It's not easy but it works brilliantly if done right.
So in effect, is that the WordPress model (host plugins), or the Notion/Coda model (build all the interaction paradigms, like a SQL client, notebooks, etc., natively)?
I think of it more like an "impersonation" model. In other words, the semantic layer supports a Postgres protocol for SQL, XMLA (SSAS) for Excel & Power BI, Python for Notebooks, etc. This approach has the added benefit of not requiring additional client-side software dependencies, since these drivers will most likely already be installed.
Yeah, but that only works for languages and code. Take visualizations - is there a way to apply this same approach to visualizations, or do we all just build our own visualization tools?
It works for visualization tools, too, but in some instances there's more work to do. For example, by supporting Power BI's DAX (XMLA) protocol, the semantic layer is automatically shared within the Power BI UX. Same for MDX and Excel. However, for Tableau, it's 2 parts: (1) support Tableau's SQL generation and (2) inject the semantic layer into Tableau using Tableau's TDS. Like I said, not easy, but doable.
Yeah, the Tableau thing is the sort of thing that seems technically possible, but really hard to do in practice, even if Tableau wanted to do it.
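For readers trying to picture the "impersonation" model from the thread above, here is a minimal Python sketch of the idea: one metric definition, several inbound front ends, one compiled query. Everything in it is hypothetical and for illustration only. The `Metric` and `SemanticLayer` classes, the toy `SELECT <metric> BY <dims>` dialect, and the `orders`/`revenue` names are assumptions, not any vendor's API, and a real layer would speak actual wire protocols (Postgres, XMLA) rather than parse strings like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A single shared metric definition (hypothetical shape)."""
    name: str
    expression: str  # aggregate expression, e.g. "SUM(amount)"
    table: str       # source table in the warehouse


class SemanticLayer:
    """Holds metric definitions and compiles requests to warehouse SQL."""

    def __init__(self):
        self._metrics = {}

    def register(self, metric):
        self._metrics[metric.name] = metric

    def compile(self, metric_name, group_by=()):
        metric = self._metrics[metric_name]  # KeyError if the metric is unknown
        dims = ", ".join(group_by)
        select = f"{dims}, " if dims else ""
        sql = f"SELECT {select}{metric.expression} AS {metric.name} FROM {metric.table}"
        if dims:
            sql += f" GROUP BY {dims}"
        return sql


def from_sql_like(layer, text):
    """Front end 1: a toy dialect, 'SELECT <metric> BY <dim>, <dim>'."""
    body = text.strip()
    if not body.upper().startswith("SELECT "):
        raise ValueError("expected a query starting with SELECT")
    metric, _, dims = body[len("SELECT "):].partition(" BY ")
    group_by = tuple(d.strip() for d in dims.split(",") if d.strip())
    return layer.compile(metric.strip(), group_by)


def from_rest(layer, params):
    """Front end 2: REST-style parameters, e.g. parsed from a query string."""
    dims = params.get("group_by", "")
    group_by = tuple(d.strip() for d in dims.split(",") if d.strip())
    return layer.compile(params["metric"], group_by)


def from_python(layer, metric, by=()):
    """Front end 3: a direct call from a notebook."""
    return layer.compile(metric, tuple(by))


layer = SemanticLayer()
layer.register(Metric(name="revenue", expression="SUM(amount)", table="orders"))

# Three different "inbound languages", one shared definition of revenue.
queries = [
    from_sql_like(layer, "SELECT revenue BY region"),
    from_rest(layer, {"metric": "revenue", "group_by": "region"}),
    from_python(layer, "revenue", by=["region"]),
]
```

All three entries in `queries` come out as the same warehouse SQL, which is the point of the model: the client keeps its familiar interface, and the definition of the metric lives in one place. The visualization case discussed in the thread (Power BI, Tableau) is harder than this sketch suggests, because the layer also has to surface its model inside the tool's own UX.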