bro, your survey is missing some key people and/or an "other" field that isn't the monkeys; what good is bad data?!
The monkeys / other!
(The list was from the HBO site for the show, which seemed like the most fair way to do it? I was surprised the guy from Seasons 1 and 2 wasn't there, but I can't be putting my own suspicions on the scale https://www.hbo.com/the-white-lotus/cast-and-crew)
That link is organized by “introduced in” !!! There’s foul play afoot
But Belinda!
>In other words, the whole warehouse-native idea might’ve been a bad one, but not entirely. If the world gradually coalesces around common data storage frameworks, people might not rebuild applications on top of databases but on top of data formats. A BI tool could run a DuckDB engine and connect it directly to SAP’s Iceberg bucket. A marketing tool could read from Stripe’s bucket. The next AI BDR could read from a CRM’s bucket. It’s not quite one customer list to rule them all, but it’s a bit closer.
Before warehouse-native was a big thing in MDS, this was already in place with Snowflake Data Sharing (and, to some degree, Databricks Delta Sharing). The only thing that's changed now is the popularity of Iceberg and open table formats in general.
Warehouse-native became popular because it helped reduce COGS for the service provider - "we'll use your storage and compute instead of maintaining ours"; however, there were some financial disadvantages to this approach as well (i.e., less revenue). This model also aligned quite well with data platform sales reps' incentives.
From what I can tell in the SAP PR (not the DBX one), all SAP is doing is building their "new" embedded analytics tool on top of Databricks and then allowing "bi-directional" sharing of data between joint customer accounts. Companies have been doing this for years on Snowflake, just not explicitly with Iceberg.
There's a problem with this approach, though, if it's not implemented correctly. For it to work at scale, SAP will need compute and storage in effectively every region their customers are in; otherwise egress costs (for both parties) get in the way. It's absolutely possible, but if I'm a customer, I'm not buying it if data is being queried from a different CSP or region.
Yeah, I'm with you that "sharing data via files" isn't exactly a revolutionary idea. People have done this before with basic FTP, or CSVs, or whatever. Some standard framework is what makes it work though, to the extent that it could (which I'm not at all sure it can).
I think that's right about SAP, but even if it is, that's sort of the point? If they are building something underneath Databricks, they're doing it by using a storage format that other engines could use too. Sure, they have all the UI hooks with Databricks, but a giant bucket of Iceberg data could just as easily be hooked up to Snowflake or anything else. Which is why it seems more interesting than just "we are sharing your SAP data with Snowflake through their old sharing features" - because that version supports one engine, and this version supports lots of them. (Or, cynically, that's also the other reason they might've done it with Delta: Delta is compatible with other data formats, but Snowflake et al. aren't compatible with Delta, so it has the appearances of being open but isn't quite.)
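(And to be concrete about the "any engine can read the bucket" part, the DuckDB version of that is already close to a one-liner. A rough sketch - the bucket path and table name are made up, and you'd still need to wire up S3 credentials separately:)

```python
import duckdb

con = duckdb.connect()

# The iceberg extension reads Iceberg metadata + Parquet files directly;
# httpfs handles the s3:// path. Credentials (a DuckDB secret or env config)
# are omitted here.
con.execute("INSTALL iceberg; LOAD iceberg;")
con.execute("INSTALL httpfs; LOAD httpfs;")

# Hypothetical path to an Iceberg table sitting in an SAP-managed bucket.
rows = con.execute("""
    SELECT *
    FROM iceberg_scan('s3://some-sap-bucket/warehouse/sales_orders')
    LIMIT 10
""").fetchall()
print(rows)
```

No warehouse in the middle - the "BI tool" here is just whatever process is running that script.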
One of the blog posts said that they can deploy it across different clouds, so I'm assuming they will likely support other regions too. But also, to your point, that complicates any sort of cross-application sharing.
Databricks' entire marketing strategy is "the appearances of being open but isn't quite."
Databricks' UniForm feature means that they could use either Delta or Iceberg (or both), but I guess they'd use their own Delta format to lock customers in. Delta is 'open' but has more/better features if used through Databricks' own tooling.
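(If I'm remembering the docs right, turning UniForm on is just a couple of table properties, so the Delta table also exposes Iceberg metadata that other engines can read. A rough sketch from inside a Databricks notebook - the catalog/schema/table names are made up:)

```python
# Sketch only: `spark` is the session a Databricks notebook provides.
# The TBLPROPERTIES below are the UniForm switches as I recall them from
# the docs; the table itself is hypothetical.
spark.sql("""
    CREATE TABLE main.crm.accounts (
        account_id BIGINT,
        name       STRING
    )
    USING DELTA
    TBLPROPERTIES (
        'delta.enableIcebergCompatV2' = 'true',
        'delta.universalFormat.enabledFormats' = 'iceberg'
    )
""")
```

Which is sort of the point about "open but not quite": the write path stays Delta-and-Databricks, and Iceberg readers get a view of it.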
Fits a bit into the story that SaaS might get less attractive due to a GenAI-driven decrease in software development cost, e.g. https://www.forbes.com/sites/josipamajic/2024/09/30/the-end-of-the-saas-era-rethinking-softwares-role-in-business/
At some point, integration and customisation might become more expensive than developing it yourself (with some prompting). So why not then build it directly on top of your own data stack, which itself sits on top of (LLM-friendly, well-documented) open-source data formats?