14 Comments
Dec 15, 2023 · Liked by Benn Stancil

"In other words, had Databricks acquired an ETL or BI tool...". Databricks did acquire an ETL tool? https://www.databricks.com/company/newsroom/press-releases/databricks-agrees-acquire-arcion-leading-provider-real-time

author

That's fair, though I'd definitely consider that 1) eating a smaller fish, and 2) not adding a new product to sell. It seems like that's more about just making it easy for enterprise customers who have DB2 or whatever to migrate to Databricks. Had they gone after something like Airbyte, that would feel like a new line of business.

Dec 12, 2023 · edited Dec 12, 2023 · Liked by Benn Stancil

I think I have mentioned this before - but I struggle to be fair with Microsoft about any of their data products. I think it’s probably from the weeks of my life I can’t get back that I spent adjusting formatting in SSRS reports. Or maybe it’s the promises of something new that looks exactly like something old... say SSIS vs Azure Data Factory. All of this aside - Microsoft is brilliant at creating what the enterprise wants, or at least at selling to the enterprise and telling them what they want? They consistently amaze me with complex solutions for simple problems. But selling to integrators is the exact reason for complexity, plus people’s ambiguous desire for customization and control. So... at this point I’m just anxious to see a truly net new product using AI. Maybe I missed that announcement... but haven’t seen anything yet.

author

On Microsoft, yeah, I get that feeling - and yet, they know how to sell it. I've never been able to figure out if that's because people are Stockholm Syndrome'd into the Microsoft aesthetic (it's what they've always had, and feels like home), or if Microsoft just understands what people actually want better than the rest of us do.

The "will AI make something truly new?" question is also interesting to me. It does feel like we've spent a lot of time talking about its potential, and people have spent a lot of time creating the picks and shovels for the impending gold rush, but there's not a lot of gold out there yet.

Ya - and I don’t see a net new data tool from Microsoft yet. But I think that makes sense and they won’t be the ones to create it first (as far as an AI tool). They need to watch others learn from their mistakes and watch lawsuits play out first...

author

Which, yeah, if you can play the wait-and-see game, and then go out and either buy the winner or just copy it and give it to a 100,000-person sales team, that seems like the right strategy.

"we still have to figure out if AI is useful, after all"

It seems we are in the Microsoft vs Apple computer 1990s era; I would be curious to hear your idea of what's next?

Like, ok, those systems make great hallucinations, and they can understand text, video, audio, etc. (the Gemini announcement is thrilling). But what about the real problems?

With those models, it sometimes sounds like the whole "data governance, lineage, catalog, contract" thing is nipped in the bud. We just killed those challenges, as now the computer "understands" the data.

Maybe the question is better worded: what will be the next big problem? IMO it's not AI regulation or things related directly to those models, but more the consequences of their use on markets, business, and human cognition. But there is maybe a step in between 🤔

author

I'm with you that the most likely problem is how the world changes when these sorts of models are all over the place. It's not really regulation (or, like, AI killing us all), but more about the weird ways in which a lot of our assumptions about the world change. It's most akin to social media to me, where people being able to talk to each other, that quickly and at that scale, created all of these unexpected consequences.

I do still think it's a bit of an open question how *useful* it is though. My bet would be yes, but so far, most of the things it's being used for are somewhere between making nice demos and nifty conveniences that speed up some tedious tasks. But is it being used to do stuff that we couldn't have done at all before? I don't actually know of any examples of this.

Where do you think data quality falls in all of this? It seems like an obvious small fish, but could turn out to be a time bomb. It reminds me of the 35W bridge that collapsed in Minneapolis. A design flaw that put far too much stress on just a couple of bolts got ignored for years because... the bridge was still bridge-ing. Until it wasn't.

author

This is maybe a weird opinion, but I'd say it doesn't matter?

If you're doing data quality for traditional business data and reporting, and you want observability or contracts or whatever the term of the day is, that's fine; I think we'll keep slowly building stuff for that, and we're in a constant two steps forward and two steps back pattern with it.

But I think the arguments that AI models make data quality more urgent are actually backwards. For most AI models (and LLMs in particular), you essentially stuff huge amounts of data into them and they kind of spit their averages back out, with some randomness around it. So long as you don't have huge systemic issues with data quality, messy or missing data doesn't really matter there that much. Like Andrej Karpathy said on Twitter, the point of these models isn't precision, but to be creative: https://twitter.com/karpathy/status/1733299213503787018

If we get bitten by data quality problems from using these models, I don't think that's a data quality problem; it's a problem with how we're using them.

I'll give you this - it is a weird opinion. If the intent is to use these models creatively and not precisely, we need to do better at stopping people from taking the output as "should-be gospel". And I think we've already passed that point - LLMs are already getting baked into products in areas where precision is sought after at an alarming rate.

author

Yeah, I very much agree with that - I'm not an expert at all on this stuff, but it feels like we're trying to use them as agents that do exactly what we want them to, when in practice, they're not very good at following directions? Like, we shouldn't think of them as robots but as kind of petulant interns who have a lot of raw talent but think they're a little too good for the job and will do it the way they want to.

Benn, I'm interested in your viewpoint regarding Fabric's likely influence on Snowflake. They seem to have comparable visions and value propositions, especially with Microsoft's AI/Copilot being a part of the equation. What are your thoughts?

author

Eeeh, I don't think it matters that much? To me, things like Copilot or Snowflake's LLM SQL thing are mostly just gloss. For enterprise buyers, I suspect the decision to buy Fabric or Snowflake (or Databricks, or whatever) is mostly driven by how well the tools integrate with whatever other systems you have, and what sort of workloads they can handle.

In that sense, every vendor has been moving toward two things: data lakes that can connect to data where it sits rather than being classic databases with internal tables and all that; and being able to use databases for more than analytical work, like operational uses, or AI development, or whatever.

Fabric pushes Microsoft more in those directions. None of that is new to the market, but it does solidify the trend. So to the extent that Fabric affects Snowflake, I think it does it by making these sorts of evolutions more of the expectation.

(One place where there is some difference between Databricks (and now Fabric) vs Snowflake is that the former two seem to be betting that databases will do analytics + AI infrastructure, and the latter is betting that databases will do analytics + operational/transactional work. The AI bet is the hot one now, though who knows which wins.)
