Climate change comes for us all.
The rise and fall of Lotus Notes in the late '90s/early '00s offers an enduring cautionary tale about the dangers of doing too many things and being caught out by what we might call a "disaggregation wave" - the point in the tech cycle where vendors have bundled too much together and the mood shifts to favor componentized approaches. By the mid-'90s, Notes was an interesting and fairly distinctive app/database platform that combined a semi-structured document DB with a formula-language-based programming environment. It enabled a class of document-centric business apps (from discussion boards to expense-approval workflows) that were damned useful, and it spawned its own ecosystem of third-party app vendors and custom developers. It was, and I say this without any shame, pretty cool.
However, as the late '90s wore on, two things happened: email became incredibly important, and so did the web. So Lotus bundled email into Notes, and also added an HTTP server and a JVM to the Notes server, renaming it Domino (don't ask me why; I don't know either). These were all obvious enhancements to make at the time, but they had two consequences. First, they put Notes in the crosshairs of Microsoft, which was trying to get folks to adopt Exchange. More relevantly for this discussion, they put Domino in competition with the (then) very disaggregated tech stack around web applications. Companies building their cool web startups in 1999 were not about to deploy a whole bunch of Domino servers - they were bolting together Apache HTTP servers with some kind of SQL backend and spinning up their own server-side code-execution tech. The bundling of stuff inside Domino just didn't sell, and it didn't help that Domino didn't do any of these new things tremendously well. By the early-to-mid 2000s, these twin forces had more or less killed off Notes.
So every time I see a software company starting to bundle more and more into their platform, I do wonder what that tipping point might be - the point at which, almost out of perversity, customers say, "no, I don't want to buy this all-singing all-dancing solution, because I want the satisfaction of putting it together myself".
Not so long ago I worked at a company whose data warehouse ran on a dedicated four-node SQL Server cluster hosted on AWS, costing north of a quarter of a million dollars per year in Microsoft licensing alone before we had run a single query. When people tell me they think Snowflake is expensive, I ask, "relative to what?" and then walk away laughing.
I would not be that bullish on any particular outcome for Snowflake. Here is why:
- Yes, they spent lavishly on customer acquisition. We all know this: the way they overpaid for sales talent, the way they treated prospects to VIP dinner events. Naturally, you would expect this to eventually eat into margins.
- But they no doubt understand this, and they are making investments and acquisitions toward a platform play. Streamlit was the first major move in this direction; there is also an investment in Hex (and even an investment in dbt).
With the platform play, the economics can change. In the ideal scenario, they acquire a tool with a decent audience but poor monetization. The challenge right now is valuations. Things haven't changed much, but eventually they will, and 1-2 years from now we might see >100 acquisitions by Snowflake. It would be a sensible thing to do. Does that put them into competition with AWS and GCP? It is possible. But maybe that was never the competition. Maybe the real question was always:
Who is more likely to have a future: Snowflake Cloud or Oracle Cloud? Which brings me to: Oracle's market cap is ~$200B, Snowflake's is ~$55B. Still plenty of space to compete for...
"Last fall, Erik Bernhardsson made the case that AWS and other cloud providers might be happy to sell core compute services like EC2, and let other vendors—Snowflake, for example—do the hard work of building, marketing, and distributing applications on top of it."
I don't agree with this argument. When both margins and the market are large, the incentives to stay on the sidelines and provide only the hardware layer just aren't there. AWS consistently copies available open-source competitors in the data-products layer, and so does GCP.
I do believe there is a lot of room for innovation by new entrants, but they have to follow the advice of Frank Slootman himself: start amping up sales only once you check the majority of the boxes on product features. See https://stassajin.medium.com/review-of-amp-it-up-f433ae2bbb3e.
Yeah, the consumption model can be a nasty surprise when the bill shows up, but CXOs will figure that out and put guardrails in place. Snowflake's genius is its data cloud and marketplace model, which is easy for citizen data scientists and business people to understand and spin up in ways they could not with earlier-generation data warehouses and analytics tools. My guess is that most businesses leverage only a fraction of the data that flows through their systems. So what is Snowflake's ROI? If it costs 2x but you get 4x the value, the business case looks good. I recently wrote a blog post on the last days of the legacy data warehouse. The question is how quickly IT teams can get their data from an old DW into a new data cloud. It's not fast or easy.
Hi Benn, where would you put the data catalog category among products that are easy to buy? I have seen data catalog evaluations with inflated criteria, driven by a we-want-Confluence-cum-social-cum-data-governance-cum-observability-cum-integrations-cum-consumption-layer-cum-data-discovery-cum-search set of demands, with some of the blame on vendors for positioning their offerings as a platform for everything. Is there any way to even be called innovative in this slugfest?
I have an interesting anecdote about dbt along the same lines. We love the product, but we find dbt's pricing and value communication a bit of a hassle. For one info-sec-crazy enterprise use case, the options for dbt were either build (host your own) or the enterprise offering, and the pricing is confusing ($50K a year on the AWS Marketplace for 10 licenses versus $50 monthly for standard). dbt is not yet a product so much as a way of working. The combination of I-wanna-avoid-sales, price approvals, and our Databricks-heavy base made us use Databricks' Delta Live Tables (a jumbled half of what dbt and Great Expectations do together, with Python added to the mix) instead of dbt.
Hot take: the second pillar is maybe neater as a feature than as a standalone tool. Might be an answer to your perplexing first footnote.
e.g., the recently repurposed https://cloud.google.com/dataform
More than possible. I’m pretty sure I can merge this with CloudFormation somehow to do exactly that in YAML
> This principle—be the warehouse for the modern data stack—could be extended to more fundamental characteristics of the database
Dang it, you went and told everyone my new business model: becoming the thin integration layer over commoditized compute and storage that makes it trivial for end users to build their own SaaS-like apps.
Fortunately, as with all great ideas, I am confident nobody will believe you (or me).
Upper case is the SQL-standard default: unquoted identifiers fold to upper case unless they are qualified in double quotes. Most RDBMSs follow this, SQL Server being a notable exception in not following the standard.
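To make that concrete, here is a minimal sketch of how standard identifier folding plays out (Snowflake- and Oracle-style behavior; exact rules vary by vendor):

```sql
-- Unquoted identifiers fold to upper case, so these all refer to the same table:
CREATE TABLE orders (id INT);
SELECT COUNT(*) FROM Orders;     -- resolves to ORDERS
SELECT COUNT(*) FROM ORDERS;     -- same table

-- Double-quoted identifiers keep their exact case, creating a distinct name:
CREATE TABLE "orders" (id INT);  -- lower-case "orders", separate from ORDERS
SELECT COUNT(*) FROM "orders";   -- must be quoted to match
```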
The relational database is the most overleveraged, overexposed services-sector tool of all time, and the claim that a relational database with hand-tuned schemas and queries is superior to automated, machine-generated schemas and queries is one of the biggest shell games of all time - it's gone on for years.
I'd argue it defines IT and most operational roles these days.
The relational database and the relational data schema were developed in the early 1970s in order to service and capture a share of the transactions created by the US economy's shift to full-on services and a debt-instrument-based economy.
Some really interesting reading if you want to get really heady.
IBM's memoriam of the inventor of the relational database:
WTF Happened in 1971:
What is really funny about the IBM memoriam is that the same arguments were being had in the early 1970s about query optimization, what to store, etc. It sounds like a lot of the discourse of the last 2-3 months around Snowflake, among others...
Remember, Unix timestamp 0 is 00:00:00 UTC on 1 January 1970.
The entire concept of the relational database is that it was developed to support the debt-serviced, equities-laddered, decline-of-Bretton-Woods growth of services businesses servicing other services businesses after the early 1970s - when the US and Western economies as a whole shifted away from manufacturing to full-on services with complex financing.
As such, the database and any derivative products are overexposed to the services market as a whole.
When you have services companies parading as tech companies, and getting valued like them with cash thrown at them for equity, we end up in the state we are in today: databases servicing other databases, made by unprofitable companies servicing other unprofitable companies, with add-ons that fan out cloud compute - other services businesses building point solutions off yet more databases under the hood of their data products.
The database has always been rent-seeking, and in the cloud that is turned up to 11.