benn.substack

That's fair, though I think those two things are intertwined. The mistake wasn't strictly failing to market a product that could've been a big, dumb database; it was not seeing that they were a couple steps away from building a big, dumb database that would've been really valuable to build. It seems like, in both the product and marketing, they were tied to this grander vision.

That said, to your point about Snowpark, that's Databricks' opportunity now. If they sort this stuff out, I think they have a higher ceiling than Snowflake, but to get there, they've got to beat Snowflake on the meat-and-potatoes "I need a database" deals that Snowflake seems extremely proficient at selling.

Amit

May 27, 2022

This argument feels akin to claiming Airbnb should have just been another OTA marketing hotels. Databricks is helping increase the TAM by making AI/ML easier and more accessible while also cementing their position well ahead of others in the area. They are helping enable new use cases rather than just replacing vendors for existing ones. Similar to how Airbnb is expanding into hotels, Databricks is now (for the past 2 years) expanding into general analytics and "boring" database stuff. If you [Databricks] are in this for the long run, it seems like the smart approach to me. Do you disagree?

May 29, 2022

I do. If you've got the ability to be 1) a much better version of the thing that people already have, or 2) something entirely new that people don't quite understand, I think 1) is a better path. Solving new problems is a tougher sell than something that people already know how to use, assess, measure, and implement.

On the Airbnb analogy, I'd see that differently. Airbnb did market directly to people who want hotels. It wasn't a new use case; it was a new form for the same problem. Had Airbnb done what Databricks did, they would've started with something like Experiences, which would be closer to a "new use case rather than replacing an existing one."

Andre

Oct 12, 2022

DeFi made it possible to move across the blockchain more quickly, to build in multiple containers and multiple languages.

Ian Thomas

Apr 12, 2022

As someone who has been in the data industry for a long time, and who spent the years between 2012 and about 2018 feeling vaguely stupid much of the time for my inability to mentally stitch together the myriad Big Data technologies that were constantly emerging, merging and disappearing during that time, I find this post to be extremely soothing. Perhaps there is some kind of entropic data tech law that dictates that, eventually, all data tech becomes databases?

Apr 12, 2022

Thanks! And I suspect there's something to that, where most products eventually collapse down into a handful of things. At the end of the day, we're all either building a database, a data pipeline, or a BI tool, no matter how much we say our thing is different.

Jillian Corkin

did someone say tarot? 🔮

Kai

May 27, 2022 (Edited)

Love the thoughts and agreed with the structure of relevant prior art.

2 thoughts:

1) Doesn't it make sense for Databricks in 2015 to be "a better Hadoop" for companies with Uber or Pinterest-sized data, and Snowflake to be "a better Redshift" for companies with smaller-sized data? In that Venn Diagram, there are some companies that cross over, but many won't for dozens of years.

2) What are your thoughts on what role these tools will play in the next shift to a better architectural pattern (aka Data Mesh)? This arch evolution is being driven not by tooling but by internal org structure / drift in knowledge management. It's why imo data catalogs haven't worked; the organizations haven't rly iterated to produce a novel org structure capable of maintaining data.

May 29, 2022

On 1), that would make sense if Databricks could actually scale better than Snowflake, but I don't think that was the case, at least not in a meaningful way. So Snowflake works for people with both small and big data. Plus, if your market is Uber-sized data, you can't sell to that many people. The boring masses are a much bigger market than a few cutting-edge companies (that are also inclined to build internal solutions for their very specific use cases).

On 2), I think both Databricks and Snowflake help there, because they make data centralization actually possible. That doesn't mean the whole data stack should be centralized, but starting from a centralized core and fanning out is almost certainly easier to manage than some loose network of departmental data tools.

Vamsi Krishna B

May 27, 2022

Clickhouse says Hi

Gareth W.

Apr 13, 2022

You missed out Azure Synapse as Microsoft's potential alternative to Databricks. It's (currently) still behind Databricks in terms of some key features, and the cost for a dedicated SQL Pool in Synapse is still a bit hard to swallow, but MS is moving fast. The Synapse team is working hard to make it super easy to use for young / small Analytics teams. It will be interesting to watch how the Azure Synapse / Databricks relationship evolves over the next year or two.

Stephen Pace

Jun 10, 2022

Synapse V3 has been "coming soon" for 3 years now. Customers are getting angry.

Jun 10, 2022

they mean soon in a geological sense

Apr 14, 2022

Yeah, I imagine a lot of the partnerships in the space start to become a lot more standoffish. That's already happened some with Snowflake and AWS, and I could see it happening with Databricks and Microsoft.

Joe Reis

Apr 9, 2022 (Edited)

Great read. Today I was just chatting with one of these companies you listed, and mentioned a few things that echo your points. First, communication and marketing are everything. From day one, Snowflake knew how to sell to the enterprise. This cannot be overstated. Their growth is directly related to knowing what enterprises want, and delivering it in a way that's stupidly simple to understand. The "high IQ" vendors somehow struggle with this. As my old boss said, "when the customer wants to buy, shut up and take the sale." Second, Big Data died many years ago, and the companies still pitching it are like the zombies in The Walking Dead that are getting brained left and right. Third, the dark horse the big incumbent DW/DLH vendors need to watch out for is the "live data stack", where applications, real-time, next-gen OLAP, and ML have a seamless feedback loop that basically nullifies the existing MDS paradigm. That's coming...

P.S. Longtime Spark and DB user since 2014, so very familiar with its evolution

Your last point is why I think Databricks could win this whole thing, if they figure out your first point. They have more capacity for being high-ceiling data science/ML/application infrastructure, but they have to make sure that doesn't get in the way of making the simple sale.

Peter McNally

Apr 8, 2022

As usual, I enjoyed reading the article.

However, the main point I am getting from this article is that the mistake Databricks made is around sales and marketing. That has never been an issue for me. The initial hype from the demo drew me in when I attended Strata back in 2015. I set up a POC immediately and thought it was amazing, but didn't touch it again for a couple years. Fast forward two jobs and many Hadoop headaches later and I gladly jumped back into it.

I hate empty 'solutions-oriented' pitches as much as anyone, but I do like the unified analytics platform they promote. I currently work in an organization with a small data staff. Having data science and data engineering in the same platform works really well. I also just really like working with Databricks. The notebook structure is great. I like being able to switch from SQL to Python and (rarely) R/Scala. Scheduling ETL jobs is simple (it's just a notebook!). Being able to develop machine learning models on the same platform is key for us too. Databricks support has also been great, especially considering we do not spend much with them.

Full disclosure, I have never used Snowflake, or dbt for that matter. I know those are quite popular right now. I am definitely curious, but I just don't have an opportunity to use them. I also don't see a need. Is there any reason other than the sales/marketing pitch that you prefer Snowflake/dbt? Cost? Simplicity? Functionality?

Thanks
