Discussion about this post

Pietro Casella

@bennstancil I love this idea of future apps having schemas (and more generally architectures) that optimize for #llm (not human) convenience. #schema-for-bots or #schema-on-bits or something 🤓

Jorrit Posor

Thought-provoking post, thanks Benn!

Do you have thoughts regarding the following:

The fully joined event tables (FJ) that you describe as being good for AI are, in my experience, at the very end of the DAG. In our dbt environment we have around 1800 models and use the FJ-like models for exposure to BI tools.

Running generated SQL on these FJ models is kinda trivial, because all you do with FJ models is filter and aggregate. All the complex joins that might require business process knowledge have already been done for the AI.

So querying FJ models is not hard, and it is also the smallest fraction of what our data department (at a scaleup) does.

The big chunk of work (analytics engineering) goes into constructing the roughly 1700 models that in the end land in many different FJ tables. This chunk of work would be interesting to automate. But here AI is missing a crucial piece of information.

What’s the missing piece for AI? An understanding of the business processes. The data landscape is extremely fragmented: fetching data from 80 SaaS tools, internal APIs, public APIs, data from the same sources being interpreted in different ways (business processes) depending on region ... chaos in terms of data integration.

So the hard part for the human is mapping the fragmented, ever-changing, always under-documented business processes onto the data these processes create. This is so hard that people need to sit in meetings and exchange business process knowledge from brain to brain.

Without this business process knowledge, whose most up-to-date version sits in people's brains, data modeling cannot be done. Hence an AI that does not somehow acquire this business process knowledge cannot produce meaningful data models.

Maybe all AEs should become some kind of documentation / config file maintainers who create a standardized mapping between business processes and data that is efficient to maintain and interpretable by AI.
