11 Comments

I agree that the cache concept is part of what makes things challenging. On top of that, we effectively have to cache what the state of the world was at almost any point in the history of the company, and maintain that robustly as the world changes around us.

While it is good enough for the billing system to know "who should I bill for a subscription today," the data side needs to know "who should have been billed on this day 2 years ago" (and at any point in between), as well as "who did I actually bill 2 years ago, given the billing rules in place 2 years ago, and what happened to that bill." Most production applications do not need that, outside of perhaps an audit or transaction log that someone can inspect when needed.
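To make the "who should have been billed on this day" question concrete, here is a minimal Python sketch of an as-of lookup over versioned subscription records. The `SubscriptionVersion` type, its field names, and the sample rows are all hypothetical, made up for illustration; a real warehouse would more likely do this in SQL over a slowly changing dimension table.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class SubscriptionVersion:
    """One version of a subscription (hypothetical schema)."""
    customer: str
    plan: str
    valid_from: date           # first day this version applies
    valid_to: Optional[date]   # exclusive end date; None means still current


def as_of(history: List[SubscriptionVersion], day: date) -> List[SubscriptionVersion]:
    """Return the versions that were in effect on the given day."""
    return [
        v for v in history
        if v.valid_from <= day and (v.valid_to is None or day < v.valid_to)
    ]


def billable_customers(history: List[SubscriptionVersion], day: date) -> List[str]:
    """Who should have been billed on `day`, according to the recorded history."""
    return sorted({v.customer for v in as_of(history, day)})


# Made-up sample data: acme upgrades in mid-2022, globex signs up in 2023.
history = [
    SubscriptionVersion("acme", "basic", date(2021, 1, 1), date(2022, 6, 1)),
    SubscriptionVersion("acme", "pro", date(2022, 6, 1), None),
    SubscriptionVersion("globex", "basic", date(2023, 3, 1), None),
]

print(billable_customers(history, date(2022, 1, 15)))
print([v.plan for v in as_of(history, date(2022, 6, 1))])
```

This only covers "who should have been billed." Answering "who did I actually bill, and what happened to that bill" would need a second record of the bills that were issued at the time, which this sketch leaves out.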


Yeah, that's an additional fun complication I suppose, which is that, not only do we have a bunch of caches, but we have to, very roughly, cache the caches so that we can recreate old states.


Turtles all the way down...


I once had a project manager on a data warehousing project who said to me one day, "You know what, 90% of I.T. is just Copy and Paste". And he was right.

Data Hubs, Data Lakes, Data Warehouses, and lots of application integration are just copying data from one place to another. Data Warehousing may step it up with a Copy, Paste Special, Values variety.

The root of all these problems is actually the app-centricity of IT. Why is it that the analyst is left to translate some stressed-out, over-caffeinated software developer's data model to something interpretable by the business? Then we multiply up the difficulty by having to merge multiple application data models into one unified view for the business to understand their own data.

That is why we have the caches and copy data around: we are trying to solve that problem in a multi-step, manageable way.

Surely, an organisation should first develop the data model that supports the data needed to make the decisions it needs to make. Then applications update the central model as and if required. It's the responsibility of the app to put its worthwhile data in the right place for the business.

In that alternate reality, no caches are required, and neither is copying and pasting data around for analytics. The data in the middle is always up to date and in the right shape for the business to interpret. Call it data-centric, data-first, or maybe even business-centric! They are paying the bill after all.

It'll never happen, mainly because of misaligned incentives, but one can dream.


I think that coooould make sense, though for most apps, I suspect they'd argue (and not be wrong in saying) that the data they create is secondary to what the app itself is supposed to do. Like, if you're a sales admin running a CRM, the first thing you care about is helping the sales team do their jobs. I suspect they'd rather trade down on the data model that the data team uses than trade down on how they help the sales team.

I guess that's kind of a form of misaligned incentives, though I'm not sure they're so much misaligned as just, like, competing. But maybe that's overly cynical, and there's actually a way to build a CRM that does both. And honestly, to your point, I bet that's true - if you set the data model, I bet you could find ways to make the sales side of the CRM just as good. It might be harder, but I bet you could get there.


This is terrific and I've shared it with many folks over the last few days.


Thanks! I really appreciate that.


Not a fan of analogies. Apps and dashboards solve different problems here. As a user, most apps I use solve things where I just need the current state (my Insta feed, bank balance, navigation, etc.), whereas a dashboard is almost always a comparison of states (plus some explainability).

Faster horses wouldn't solve the problem here. Even with quantum computers, ownership of an ETL would be unclear; maybe the data engineer tested the change with a stakeholder who didn't own the change.


Sure, I hear you on those differences, but I'd argue that's kind of semantic sleight of hand. For dashboards, you still want to know the "current state"; it's just that the current state includes historical information. I don't want to know the prior version the dashboard was in. Likewise, with your bank balance, you also want to see your historical transactions and prior balances. To me, that's not a different state of the app; that's just part of what "current state" needs to include.

And I mostly disagree about the faster horses thing. Yes, problems of ownership and all of that would remain initially, but if we had very fast ways to do things, I think we'd be able to figure out solutions to those problems much more easily than we could today.


Enjoyed reading, man.


Thanks, I appreciate that.
