It seems that "high-quality text" will almost certainly need a clear quantification of "quality" before accurate monetization can happen.
My 2024 brain has zero ideas for quantifying the quality of text across the web. I could start by saying, heuristically, that the average NYT article would probably be at least a 7 or 8 out of 10, while The New York Post couldn't be more than a 4 or 5 out of 10. But then, beyond major news outlets, you get into weird data sources like Reddit, where r/wallstreetbets would score a 0/10 but a subreddit like r/personalfinance could score an 8/10, and scores would of course also vary by user and context.
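(To make that concrete, here's a toy sketch of what a per-source score lookup might look like. Every source, number, and adjustment here is made up for illustration; it's just the shape of the problem, not a real scheme.)

```python
# A toy sketch of the heuristic source scoring described above.
# All sources, scores, and adjustments are invented for illustration;
# nothing here reflects how a real licensing or quality scheme would work.

BASE_SCORES = {
    "nytimes.com": 7.5,                     # "at least a 7 or 8 out of 10"
    "nypost.com": 4.5,
    "reddit.com/r/wallstreetbets": 0.0,
    "reddit.com/r/personalfinance": 8.0,
}

def quality_score(source: str, author_reputation: float = 0.0) -> float:
    """Return a 0-10 quality score for a source, nudged by author reputation.

    author_reputation is a hypothetical per-user adjustment standing in for
    the "varies by user and context" problem.
    """
    base = BASE_SCORES.get(source, 3.0)  # unknown sources get a middling default
    return max(0.0, min(10.0, base + author_reputation))

print(quality_score("reddit.com/r/personalfinance", author_reputation=1.0))  # 9.0
print(quality_score("some-random-blog.example"))                             # 3.0
```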
As I'm writing this, it seems that only large organizations like the NYT would have the legal and lobbying power to ever get their text monetized. And even then, how do you monetize it?
Interesting, thought-provoking read. I originally thought the court case was a bit silly on the Times' part, but now I agree with your point that they are just looking to "make money from that tide."
It seems like OpenAI et al. basically have that kind of scoring mechanism in their training weights already. NYT is high, Wikipedia is high, Reddit and the rest of the internet are low, etc. I could see that getting more precise over time, though, especially if the Times wins and they have to be more particular about how they're sourcing training data.
That said, I'm sure if that happens, the temptation will be to use LLMs to assess the training data. Which will turn the whole thing into an ouroboros eating itself twice: LLMs will get trained on LLM-generated content, and they'll score the quality of that content using LLMs built from roughly the same material. Which, I dunno, maybe that works out fine?
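(Something like the sketch below is roughly what that LLM-as-judge filtering could look like. The llm() callable, the prompt, and the threshold are all hypothetical stand-ins, not any real API.)

```python
# A rough sketch of the "LLMs grading LLM training data" loop described above.
# The llm() callable is a stand-in for whatever judge model you'd actually use;
# the prompt, scoring scale, and threshold are invented for illustration.

from typing import Callable, Iterable

def filter_training_docs(
    docs: Iterable[str],
    llm: Callable[[str], str],
    threshold: float = 6.0,
) -> list[str]:
    """Keep only documents the judge model scores above a quality threshold."""
    kept = []
    for doc in docs:
        reply = llm(
            "Rate the quality of this text from 0 to 10. "
            f"Answer with just the number.\n\n{doc[:2000]}"
        )
        try:
            score = float(reply.strip())
        except ValueError:
            continue  # unparseable judgment: skip rather than guess
        if score >= threshold:
            kept.append(doc)
    return kept

# The ouroboros: if `docs` already contains model-generated text and `llm`
# was trained on similar text, the filter's notion of "quality" starts to
# reflect the model's own output distribution.
```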
That makes sense that they might use an LLM to assess the LLM training data. Hah yeah that could get messy fast!
I was just having a conversation the other day with someone who thought of an LLM's usefulness as being just about what it "knows". We talked about an example of an LLM as a company historian that knows what decisions were made at your company and why. Almost as soon as this idea came up, we got into the problems with a historian. Granted, our problems were mostly unrelated to licensing, although privacy/HR was a potential one. The biggest consideration was how companies would want to redact certain information, restrict other information, and completely forget a third set of information. Those would be pretty difficult things to achieve with 100% accuracy, and mistakes could be costly, dangerous, harmful, etc. Plus, what company (or person) actually wants a perfect memory of what happened? We choose to forget stuff all the time. :-)

Anyways, all that to say I'm in the camp of "doing" being more useful than "knowing". However, I think part of making the doing useful is referencing external data through databases, APIs, and web crawling (like a search engine would). So the knowing is less oracle-like and more search-engine-like. But if the oracle route were chosen, I agree: there's no way the thing would work without lots of copyrighted material.
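(A minimal sketch of that search-engine-style approach, where redaction and "forgetting" live in the retrieval layer rather than in the model's weights. The data model and keyword scoring here are invented for illustration.)

```python
# Minimal sketch: the model doesn't memorize company history; it retrieves
# documents at answer time, and redaction/restriction happen in the retrieval
# layer where they can be audited. All names and rules here are hypothetical.

from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    visibility: str  # "public", "restricted", or "forgotten"

def retrieve(query: str, docs: list[Doc], user_clearance: str = "public") -> list[str]:
    """Return candidate passages, honoring redaction rules before the LLM sees anything."""
    allowed = {"public"} if user_clearance == "public" else {"public", "restricted"}
    candidates = [d for d in docs if d.visibility in allowed]  # "forgotten" never passes
    terms = query.lower().split()
    scored = sorted(
        candidates,
        key=lambda d: sum(t in d.text.lower() for t in terms),
        reverse=True,
    )
    return [d.text for d in scored[:3]]

corpus = [
    Doc("We chose vendor A in 2019 because of pricing.", "public"),
    Doc("The 2021 reorg was driven by the failed product launch.", "restricted"),
    Doc("Details of the HR settlement.", "forgotten"),
]
print(retrieve("why did we pick vendor A", corpus))
```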
Yeah, that gets to a point I didn't really talk about here, which is that there's definitely a fuzzy line between the two. Like, does a lawyer know things or do things? Both?
But I think that's ultimately where the internal LLMs end up: we treat them like lawyers. We'll still have to teach them a bunch of stuff - train them on internal data, jargon, whatever - but the primary purpose of that is to complete tasks for us. Which (and I'm totally guessing here) seems like it makes the precision of their recall a little less important. We don't need people to be perfect historians of everything they know; they just need to know enough to understand the task. A sea of LLM agents could probably do something similar, where the point isn't to retrieve an exact fact, but to approximate some facts well enough to do something that's generally productive. If they hallucinate 1 out of every 10 facts, eh, that's probably ok.
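(A quick back-of-the-envelope check on that 1-in-10 guess: if a task touches several facts and each one is wrong independently with probability 0.1, the odds of getting through the whole task cleanly drop off fairly fast. Independence is an assumption; real errors probably correlate.)

```python
# Back-of-the-envelope: probability an agent gets every fact right in a task,
# assuming a 10% per-fact hallucination rate and independent errors.

p_correct = 0.9
for n in (1, 3, 5, 10, 20):
    print(f"{n:>2} facts -> all correct {p_correct ** n:.0%} of the time")

#  1 facts -> all correct 90% of the time
#  3 facts -> all correct 73% of the time
#  5 facts -> all correct 59% of the time
# 10 facts -> all correct 35% of the time
# 20 facts -> all correct 12% of the time
```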
Well, and it's not as easily measurable, but people "hallucinate" all the time, in that they get the facts wrong. The AI doesn't need to be perfect, just right more often than an average human at the task. Of course, some fields and jobs require higher levels of precision than others - so I'm speaking broadly of jobs in an average business.
For sure. (And that's always been one of the things that's kinda seemed weird to me about people's perceptions of AI. They see it as broken if it ever gets anything wrong, even though humans get stuff wrong all the time. Which I guess sorta makes sense, because it's a computer, and there's something very strange about a computer getting something wrong, like if Excel just sometimes messed up the math.)