“ChatGPT is having an iPhone moment.”
When ChatGPT came out late last year, people immediately began comparing its launch to that of the iPhone. It’s a tempting analogy. AI could be the biggest technological breakthrough since the mobile revolution. Both launches immediately captivated the public’s attention. And just as the NBA is constantly looking for the next Michael Jordan, the tech industry is always hunting for its next Steve Jobs.
Just yesterday, OpenAI, the maker of ChatGPT, took another apparent step towards the iPhone: They launched an app store. ChatGPT now supports plugins, which are apps that run directly inside of the chatbot and allow it to interact with other services on the internet, like OpenTable and Instacart. With these apps, people can use ChatGPT to make reservations, order ingredients for a given recipe, or do a handful of other similar tasks.
It’s a bold step—but it feels like either a mistake or misdirection. Because public AI providers like OpenAI aren’t destined to become the next iPhone, but the next—and maybe, much bigger—AWS.
The internet’s hardware store
Cloud computing wasn’t one revolution, but two.
The first was architectural. Prior to “the cloud,” most software was bought off shelves, installed on computers, and run entirely on customers’ own hardware. It was Microsoft Word for Windows 95: Sold at CompUSA, in a box, on a CD. Cloud software, by contrast, is delivered over the internet. Rather than installing an entire program on your computer, it runs elsewhere, and users interact with it remotely. Though you sometimes still have to install software too—like the Dropbox widget on your computer or an app on your phone—the majority of the service runs in some data center somewhere. Instead of Word, it’s Google Docs: Accessible on a website, no download required.
Revolutionary as this concept is, it’s not actually all that transformative on its own. Because, in addition to developing the applications they wanted to sell, cloud software vendors also had to run them. They had to buy servers. They had to hire people who knew how to manage those servers. They had to run software on those servers that ran the software that they sold to customers. They had to keep the servers up, 24/7. They had to figure out contingency plans for when a server fried itself, or the power went out, or someone accidentally ran a command that caused those servers to stop announcing their DNS prefix routes through BGP. For many companies, these messy realities made the theoretical promise of cloud software impractical and expensive.
But in these problems, Amazon saw an opportunity, and an incomprehensibly large pile of money. Amazon—and later Google, Microsoft, and a few others, who are collectively now known as cloud providers—began offering ways for companies to lease servers. The cloud providers would make the upfront investment to buy a bunch of computers, and would do the work to make sure they were always up and running. Companies could then rent them, by the minute, for a fee.
To make the offer more appealing, cloud providers started selling utility services as well. In addition to renting hardware, people could also lease a file storage system, a database, a tool that runs simple programs on demand, and hundreds of other similar products. All of these services were designed to be a kind of middleware that sits somewhere between bare metal and the sort of software that most people use every day. The utilities are building materials, and cloud providers are the internet's hardware stores—they sell pre-cut lumber and boxes of nails and sandpaper of dozens of different grains, but don’t offer birdhouses or lawn furniture or two-story houses. They leave it to other people to build, market, and sell the thousands of finished products that their raw materials can create.
It has been, in what's still probably an understatement, a staggering success. The combination of cloud architectures and the affordability and convenience of AWS and its utility services launched a revolution. Tens of thousands of companies were created on the platform. Hundreds of thousands of new products got launched. Millions of engineers experimented with cloud technologies and stretched the limits of what they could do. And roughly a trillion dollars ended up in the bank accounts of the major cloud providers.1
Yes, there are skeptics and holdouts—some companies don’t like the idea of running sensitive applications on another company’s hardware; for very big companies, the fees that cloud providers charge can end up costing more than buying their own servers. But these exceptions are uncommon. For many companies, their AWS (or GCP or Azure) bill is an unavoidable tax for doing business on the internet, a universal line item on our income statements, the toll to drive on the information superhighway.
For end users, cloud providers are the internet’s invisible backbone. Nearly all of us rely on them, daily, and in countless ways. Their reach is often only appreciated when they go down, and take half the internet with them.2 They are, true to their name, an ever-present cloud over modern society.
The generative cloud
So here's an obvious prediction: AI will follow a nearly identical trajectory. In ten years, a new type of cloud—a generative one, a commercial Skynet, a public imagination3—will undergird nearly every piece of technology we use.
In the same way that cloud architectures predated the cloud providers, deep learning and neural networks have been around far longer than AI applications like ChatGPT. However, for most companies, these technologies are too impractical to use widely. They have to be developed by expensive experts, they’re hard to integrate into software applications and business processes, and they don’t deliver clear enough benefits over more basic techniques—like division—to justify the cost. For years, the AI-powered organization has been coming; we just have to figure out how to use AI first.
But a million companies’ problem is one company’s opportunity (and another very large pile of money). For better and for worse, OpenAI—and specifically, its APIs—will finally take AI mainstream. Rather than training their own models, companies can now use generalized large language models offered by OpenAI.4 The explosion of GPT integrations—all developed in a few months—speaks to how broadly useful universal LLMs are, and to how easy they are to build on.
Just as cloud providers built out hundreds of utilities that are all underpinned by core services like EC2, I'd expect OpenAI to do the same thing on top of GPT and other foundational models. They already offer a chatbot, a speech-to-text service, and a text-to-image service. Surely, more utilities like these are coming: Text-to-video, video-to-text, text-to-audio, text-to-code, image-to-text, code-to-documentation, detection services to figure out if something was created or altered by an LLM, music generation, software generation, pipes between these services, and dozens more.
These products won’t be end-user applications, but developer tools. If you want to build on top of them, it's a simple API call. Ask the ChatGPT API a question, and it’ll talk back to you. Send an image to it, and it’ll describe what it sees. Pass it a codebase and a desired change, and it’ll send you a new codebase with the requested feature. And give all of these models temperature parameters, content moderation settings, or other simple tuning dials. We’ll manage them with Terraform, and, if history is any guide, spend a lot less time on model development and a lot more time trying to figure out how to configure OpenAI’s API Gateway and IAM services.5
If this happens, public AI providers like OpenAI would become another backbone for the internet. Nearly every piece of technology will rely on their models. Outlook will need them to summarize our emails. Github will use them to automate code reviews. DoorDash will need them to help guide you through your order. Delta will depend on them for booking flights. Facebook might not be able to open doors without them. But, as is the case for cloud providers, this critical infrastructure will be invisible to most people. Customers won’t know or care which products use GPT, just as they don’t care which ones use DynamoDB or Spanner or Azure Functions. They’ll just come to expect that the products they buy to do the things at AI can do.
The race, then, is to be a dominant AI provider, since—again, as is true for the cloud—dominance is self-reinforcing. The bigger a provider becomes, the deeper its moat gets through an entrenched ecosystem, better models, and, likely, lower prices. And because training and running LLMs is very expensive (like building data centers is expensive), once a few AI providers separate themselves from the rest of the market, nobody else can catch up.
The final equilibrium is the same as it for the cloud providers: A few companies win the market, and the rest of us come to accept their bills as the cost of doing business.
Of course, there will also be skeptics. Some companies will resist using public AI providers because of concerns about security or privacy. Other companies will get big enough that it’ll be cheaper for them to develop their own models than it is to rent one from OpenAI or Google. And there will probably be “multi-cloud” approaches, where companies let their customers choose which LLM they prefer.
We’ll also have to grapple with one very messy issue that cloud computing can ignore: AI is opinionated. Though today’s cloud providers have tremendous power, it’s almost entirely economic. Adam Selipsky and Thomas Kurian can extract rents, but EC2 and Google Compute Engine can’t outright manipulate us
Public AI providers can do both. If nudging Facebook users towards more positive or negative content can change their emotions, imagine the effect of public AI providers turning up the temperature on their core models. That single parameter could control how polite or rude we are to each other in billions of emails and text messages. Other parameters could turn every company’s support staff into agents of chaos, or embed political bias in every generated piece of text.
It’s a terrifying amount of power—far bigger than Elon Musk controlling our Twitter feeds, far more direct than TikTok putting its thumb on its algorithmic scales, and far more precise than Russia’s disinformation campaigns. And I have no idea what to do about it.6
…or not
With all that said, ChatGPT’s plugins feel like a step in a different direction. On one hand, everyone got very excited about them, so maybe they’re a great idea. Plus, in the last twenty years, there are only two tech products that have been more successful than AWS—the iPhone and Google search—and OpenAI seems to be chasing both of them.
On the other hand, it strikes me as a risky bet for OpenAI. Plugins—and ChatGPT itself, for that matter—position OpenAI’s products as apps that people should log into and use directly. ChatGPT’s staggering user numbers have already become its public benchmark. The deafening buzz around everything OpenAI does—every new release is a revolution; every blog post is a revelation—could become an addiction. Google going DEFCON 1 over ChatGPT could further bait OpenAI into more fights for user attention.7
People’s attention, however, is a scarce and competitive commodity. In order for that business to get anywhere near the scale of Google or Apple, OpenAI needs to become the front page of the internet for billions of people. Though that’s not impossible, there are a lot of big companies vying for the same screen time.8
The more lucrative opportunity for OpenAI, it seems, is to sit behind the apps that are fighting for our attention. In that scenario, whoever wins, so does OpenAI.9 Moreover, if AI can replace service jobs, public AI providers could be much bigger businesses than the cloud providers. For OpenAI to be truly ubiquitous and to truly “benefit all of humanity,” ignoring how many people use it directly may be the most important thing they can do. The real war isn’t for users, but for the public imagination.
Over the last ten years, AWS has collected about $290 billion in revenue. AWS has consistently represented about a third of the cloud provider market, implying that the cumulative spend on cloud services over the last decade is about $1 trillion.
Speaking of things that are too important to fail, what would happen if Amazon went bankrupt and had to shut down AWS? Or if they just decided this wasn’t worth it anymore, and turned it off? If the banking system is too big to fail, the same is almost certainly true for the public cloud—not least of all because the banking system would probably fail without it.
I’m sure we’ll end up calling this something dull, like the AI cloud, or the generative cloud, or the public mind, or the public brain. But my vote is for the public imagination, because it captures the expansive potential of AI and the dystopian possibility that it actually replaces human imagination.
Yes, this conflates things a bit. A lot of existing AI models are things like bespoke fraud detection tools, which LLMs can’t (yet) replace. However, in ten years, I’d expect AI to be in far more places than it is today, powering a much wider range of applications than AI does today. And most of that infrastructure will be backed by companies like OpenAI.
As a longer aside, a new role recently emerged in the AI froth: LLMOps. Some people say that this is just a buzzy new name for DevOps. I disagree, at least in the short term. One of the weirdest properties of LLMs is that they can’t actually be directly engineered the way software can. I tend to think of any computer program as having both a user interface and a hood that an engineer can pop to precisely control that interface. If you want an LLM to respond in certain ways, for example, can’t you program it to do that? The answer, it seems, is not really. The only way to get it to take the actions you want it to take is to talk to it. In this way, it is kind of human—there are no dials that will reliably control exactly what it does. If we want to do something, we have to persuade it to.
That means that prompt engineering isn’t some hacky way for non-engineers to control an LLM; it’s the only way to control an LLM. Given that, LLMOps—which involves developing new techniques for getting LLMs to respond in reliable ways—seems both necessary and very different from today’s DevOps roles.
Over time, however, I’d expect OpenAI to provide utilities to make different methods of prompt engineering easier (e.g., rather than having to chain prompts together manually, OpenAI offers a service that does it for you). If that happens, LLMOps would probably start to look a lot more like a specialized subfield of DevOps, instead of some bizarro engineering role that’s responsible for finding new conversational tricks to socially engineer a computer.
Or, maybe OpenAI is baiting Google to defend search and not GCP. Either way, it’s curious to me that Google responded so aggressively to ChatGPT and Amazon didn’t.
Although, those of us in the United States may have a lot more free time soon, particularly in bed between midnight and 3 a.m.
Though it’s possible to be both AWS and the iPhone, that’s a very tall order. As Steve Yegge suggested in his famous memo about Google and Amazon, you can be a great platform or a great prodcut. Even companies as promising as OpenAI can get captured and pulled apart by their customers.
Really enjoyed this post. I was wondering, as I read it, "Why not both?" AWS's origins were serving some of Amazon's consumer-facing business lines. Amazon was (is?) AWS's first and best customer, which ensured immediate demand for its most advanced features.
What would prevent a similar framing for OpenAI? Ie consumer-facing offerings in the short term push the backend forward as fast as possible, then rolling those features out at the platform level makes them widely available for recombination by developers. (And in the short term, keeping things more closed allows OpenAI to get a better handle on safety.)
This is a great read, Benn. But the logic breaks down when training and running an LLM stops being as expensive as building data centers :).
I am curious to hear what you think about OSS LLMs like Databrick’s Dolly that can do instruction following, brainstorming, and summarization just like ChatGPT.
BUT….
can be created for $30 using one server for 3 hours on a small dataset using a 2 year old open source base LLM as opposed to training for 100,000 GPU hours at ~$10 million price point.