6 Comments

I feel like the ultimate elephant in the room is having a unified understanding of data across all of its stakeholders (so that you can better communicate, operate, etc.). This includes business-facing people, data scientists, data engineers, etc. I agree that a unified pane will provide a much-needed management console for these people. I further believe this management console is one potential solution to this bigger knowledge issue. However, many data products tackle this issue by selling to the top (i.e. the head of data), who then enforces the product on their subordinates. I'm curious about your thoughts on the reverse: create products that individuals like data engineers use regularly in their workflows, then aggregate all of this into one cohesive console, creating a sort of hive-mind understanding of data at the end. Not sure if there are any examples of this out there.


The best example of that I can think of is Quora / Wikipedia / Stack Overflow, where the approach is more inductive than deductive. You learn from the examples in the field, and generalize them into some sort of rule.

I’m sure there are tons of challenges with that though, not least of all that the examples are probably inconsistent? It’s hard to use induction to pick the right economic policies in the US, because nobody agrees on what those should be. I guess you could rank popularity in some way, though that’s not quite the same as what’s true.


Yeah makes a lot of sense. I feel like this is the age-old question: what is the right balance between free play and structured play? Structured play allows for more consistency but less flexibility (i.e. you have to do things a certain way). Free play allows for more flexibility at the cost of less consistency. It's hard to find an optimal solution (if one even exists in the first place) but I feel like it should contain a mix of both free and structured play.


I think you have to support both. Ideally, most people would stay on the roads (which is where they want to be, tbh), but there are ways to drive off road as well. The hard part of that (which nobody’s really solved) is that those two things don’t overlap at all. It’s really hard to take the road somewhere, and then offroad from exactly that point. You usually have to start from the beginning again when you want to “free play.” That creates a lot of inconsistency and frustration, because you duplicate things and all that, just to get to the starting line.


I see external factors also influencing the 2032 reality, or the possibility of a single view of data. For example: 1) Source data (telemetry) is to a large extent not yet standardized. Do we see more source problems in the 2032 meta world, or more standardization? 2) Would enterprises substitute a large part of their microservices with SaaS tooling (e.g. a world where companies always use Stripe and its competitors for payments, with similar export schemas, a.k.a. a data cloud), or would they continue building their own in-house services (maybe they wish to avoid the overhead of managing too many vendors, along with legal, privacy, and procurement issues)?


I suspect it remains a mess. People might consolidate around fewer standards and defaults, but it's hard to imagine there being an overwhelming consolidation around one or two tools. Salesforce, for instance, seems as dominant as any vendor could reasonably be, and it's 1) got 25% market share, and 2) requires so much customization that people's schemas end up not being the same.
