You hit the nail on the head on the commoditization of AI, and the importance of non-technical specs to how companies succeed. Apple users pay a lot more for less computing power because they want the ease of an Apple product. Likewise, if there were a "better" search engine, would people stop using Google? Probably not. OpenAI has the first-mover advantage, with "I chat-gpt'd something" becoming a verb like "I googled something." That being said, these days, when you Google something, you get an AI summarizing the answer for you. That's their way of getting people to use their AI tools and keeping Google Search relevant. The key problem these days (it seems) is taking all these LLMs and building them directly into a workflow, without the chat interface, which is what Cursor/Windsurf do (and why people pay for them). The hard part is understanding the use cases and how to integrate the LLM into them: so-called "agentic AI."
Yeah, I assume that's the dilemma Google's going through now, and it's both very simple and very hard. They have a thing that everyone uses all the time, which they could pretty easily use to eat into a ton of ChatGPT usage. But how do you make something that can be both search and chat at the same time? How do you make a Google search more like a Google...inquiry, something that's sort of search and sort of conversation?
Years ago my youngest was struggling in kindergarten. I paid a fortune to all kinds of doctors, and while the diagnosis in retrospect was obvious (ADHD), the recommendation cracked me up - the child needed 2 squares of carpet at story time, not 1 - to give space for their body to wiggle and not disturb others. I always thought of that extra reading carpet real estate as the most expensive in all of the Bay Area. The bigger text box suggestion is #GOLD and should go down as an innovation breakthrough - give this concept a name, update your Wikipedia and trademark it right away.
if google wants to save themselves, all they have to do is pay a few million dollars to benn's big ol' text box™
Useful is not the same thing as usable. You hit the nail on the head, baby!
That’s why the LLM leaderboard halo effect doesn’t make it past those who already bookmarked Hugging Face. People want to use tools that feel good to use daily, and that doesn’t map cleanly to overtraining a model to pass hyper-specific benchmarks. Same story with why Snowflake blew up. It’s not going to win the benchmark game. That was not the game worth winning. It had the best UX and story for old- and new-school data people.
And I think it's even tougher for these sorts of models, because it's so hard to even know what's good. There's so much subjectivity. I'd be really curious how much people's expectations of different models differ from how they rate them using tools like the Hugging Face leaderboard where they choose responses blind.
The title has me dead! Great article as always. Interesting to think of ChatGPT as a wrapper around the actual IP.
I think about that a lot. And, which part is the IP? Is the model amazing and better than everyone else's, and the website is just a way to distribute it, or is the website (and domain, brand, etc) amazing, and the model is more or less the same as a half dozen others, including open source ones?
Not that you asked, but I use ChatGPT because 1) it was the first to market and so the first I used, 2) the UI is very intuitive, and 3) the memory is interesting; I'm not sure it alters the responses too much, but it warms me to know that ChatGPT has context of past conversations. So I'd rather not switch to Gemini or Claude. +1 in favor of brand and UI.
FWIW, I think Claude is probably a better model on the margin and Gemini would likely be more valuable to me based on my allegiance to the Google Suite (email, docs, etc). And yet I don't switch... at least not yet.
Going back to your article, I think it's true that some GPT wrappers can get wiped out quickly (think old PDF parsers), but wrappers that are niche to a business segment (dentists, accountants, etc.) seem very viable.
Someone once made the joke that all of these wrapper companies are the forward deployed engineers of the three or four foundational model companies, and that seemed very right to me.