We can’t build something unless we know what it is.
Yes Benn! We haven't worked together at a company, but as a former CFO working with IT/developers on the reports/dashboards/queries you have described above, it is like you just described 10 years of our separate-but-joint life! 🤣😭🤘
We all wonder how many decisions at a company are being informed by inaccurate reports from 3 years ago built for a one-off purpose... Ugh!
As I was reading through this post (which I really enjoyed, thank you for writing it!) I was trying to relate the scenarios you describe to my current and past work. Upon reflection, it seems that the model described is more applicable to reactive data teams (which is many if not most of them!).
If the data team is the answer-producing black box that business stakeholders come to every time they need an answer, the observations you make about things being one-off or in that muddled we-built-it-as-a-one-off-but-people-want-it-updated area make total sense. But creating products (and taking them into production) should be more proactive and collaborative than that, or at least aspire to be, no?
In my last team, we had a very explicit set of products we set out to build. They weren't in production from day one, but the vision was a joint product of the commercial leadership team and our department. Instead of waiting for different business teams to come with their queries (almost all of which would end up being answered with one-off projects), we set out to create reusable, extensible data assets that standard models could be built on (and that were then also used for some one-off projects, or given to local teams who wanted to do one-off work). What was production grade (reliable, refreshable, quality-assured etc.) was the underlying asset (standard data models, standard ML models), which was where the team spent most of its time.
Of course, many of the challenges you describe crept in anyway, and the products weren't perfectly... product-like from the start (I wrote about this last month, if you're interested: https://bit.ly/productisation), but they definitely weren't a continuous and recurring pain.
Going from reactive to proactive is hard, don't get me wrong. It requires business partners to change how they work, and it requires analytics translators and data product managers to occupy the chasm between data and business successfully, and to proactively identify and explore areas where these sorts of product opportunities might exist.
I started off writing this comment thinking I'd be outlining where I disagree with you, or at least where I don't think the observations you've made apply. Thinking it through, what I think I've done above is actually just fill in some blanks: If a data team is stricter about defining production, and as a result commits more but less often, then it has the luxury (and probably the prerequisites too) to be much more proactive.
Upvote for self-destructing or aggressive janitors (see https://www.linkedin.com/feed/update/urn:li:activity:6966069137496293376?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A6966069137496293376%2C6966258477782433792%29)... combined with telemetry and lineage (very much easier said than done).
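To make the janitor idea concrete, here is a minimal sketch, assuming telemetry can supply a last-accessed timestamp per asset. The `Asset` class, the `pinned` flag, and the 90-day default are all hypothetical, not taken from any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Asset:
    name: str
    last_accessed: datetime  # assumed to come from usage telemetry
    pinned: bool = False     # hypothetical flag: explicitly promoted to "production"


def sweep(assets, now, ttl=timedelta(days=90)):
    """Split assets into those to keep and those to expire.

    An asset survives if it is pinned or was accessed within `ttl`.
    """
    keep, expire = [], []
    for asset in assets:
        if asset.pinned or now - asset.last_accessed <= ttl:
            keep.append(asset)
        else:
            expire.append(asset)
    return keep, expire
```

The point of the sketch is that everything is ephemeral by default, and staying alive requires either recent use or a deliberate decision to pin.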
Not sure if we can find inspiration from the design world, but I certainly see very similar challenges. Lots of ephemeral assets and loose boundaries around "production". Perhaps Figma (aka Adobe) has an answer for us.
(1) This reminds me of the problems associated with providing a centralized feature store (managing SCD type II data) to make it easy for people to build machine learning models. Without proper versioning constructs, and SLA tiering, it's hard to set expectations for everyone involved and it's easy for it to devolve into a mess as you describe...
(2) Is part of the problem that we don't have the lineage & telemetry insights to determine what's actually running and whether it's being used or not? What if you could get column-level lineage from source to dashboard and then display which "paths" are being used and by whom? Would that help catch where "production" expectations are misaligned?
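As a rough illustration of the second question, here is a minimal sketch, assuming lineage is available as upstream-to-downstream edges and telemetry as (dashboard, viewer) pairs. All table, column, and dashboard names are made up for the example.

```python
from collections import defaultdict

# Hypothetical column-level lineage: each edge points from an upstream
# node to a downstream node that reads from it.
LINEAGE = [
    ("raw.payments.amount", "marts.revenue.mrr"),
    ("raw.payments.status", "marts.churn.failed_payment"),
    ("marts.revenue.mrr", "dashboard:exec_kpis"),
    ("marts.churn.failed_payment", "dashboard:churn_2019"),
]

# Hypothetical telemetry: (dashboard, viewer) pairs from a recent window.
VIEWS = [
    ("dashboard:exec_kpis", "cfo"),
    ("dashboard:exec_kpis", "ceo"),
]


def upstream_of(node, lineage):
    """Return every node that feeds `node`, directly or transitively."""
    parents = defaultdict(set)
    for src, dst in lineage:
        parents[dst].add(src)
    seen, stack = set(), [node]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def usage_by_node(lineage, views):
    """Map each lineage node to the set of people who depend on it."""
    users = defaultdict(set)
    for dashboard, viewer in views:
        users[dashboard].add(viewer)
        for node in upstream_of(dashboard, lineage):
            users[node].add(viewer)
    return users


def unused_nodes(lineage, views):
    """Nodes that no recently viewed dashboard depends on."""
    all_nodes = {n for edge in lineage for n in edge}
    return all_nodes - set(usage_by_node(lineage, views))
```

Here `usage_by_node` answers "who relies on this column?", and `unused_nodes` lists the paths nobody has looked at, which is where "production" expectations are most likely to have drifted.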
In our company the term "certified reports" was coined sometime in 2000, so the product idea is not very new. It would be new if product thinking were applied to data entry (taking that seriously at last) and to each of the intermediate steps before the report comes up. Which also is not completely new if you consider that the data mart was supposed to be 'a disposable product'.
Hey Benn, this looks incomplete. I mean, deprecation is easier said than done since, I bet, data teams mostly won't be able to find out what to deprecate (unless they run costly catalog and governance projects defining ownership, stewardship, etc.).
Also, "data team" itself is difficult to define: is it meant to be central data orgs, embedded BI teams in different business units, the CEO's office, consultants, data mesh enthusiasts, etc.? How would they now explain "production" internally, on top of the existing jargon of trusted/gold data assets, tables, reports, etc.?
I think since Kimball's days the thinking around production, conformed dimensions, etc. was always there. It was just painfully slow and costly to implement.
Data teams' products, in my opinion, are more or less like MVPs that do not reach production status. Maybe that is because the value of the MVP diminishes quickly (if a report tells me that payment issues are the reason for churn, the report's value drops drastically after the discovery).
(To quote your earlier blog, it's actually not the web-services world that data products should mimic, but the journalistic one.)