4 Comments

I like the iterative approach that Viable uses: dividing out tasks and having an instance of the LLM focused on each one. I have seen some really good results taking this a step further and building out personas. So you have Olivia Data Organizer, Tommy Theme Finder, and Simon Summarizer, each with focused skills that are relevant for the task at hand. I know it seems ridiculous, but giving the model tons of context as to its skills, personality, abilities, etc. seems to be a very effective way of focusing it and getting better results.
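
For anyone who wants to try the persona idea, here is a rough sketch. It assumes a hypothetical `complete(system, user)` callable standing in for whatever LLM client you use, and the persona prompts are made up for illustration; none of this is Viable's actual setup.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical signature: (system_prompt, user_message) -> model reply.
Complete = Callable[[str, str], str]


@dataclass(frozen=True)
class Persona:
    name: str
    role: str    # the single task this persona owns
    skills: str  # extra context about abilities and personality

    def system_prompt(self) -> str:
        return f"You are {self.name}. Your only job: {self.role}. {self.skills}"


# Illustrative personas; the wording is an assumption, not a tested prompt.
PIPELINE = (
    Persona("Olivia Data Organizer", "group raw feedback into clean records",
            "You are meticulous and never drop a record."),
    Persona("Tommy Theme Finder", "name the recurring themes in the records",
            "You are curious and good at spotting patterns."),
    Persona("Simon Summarizer", "write a short summary of the themes",
            "You are concise and avoid speculation."),
)


def run_pipeline(text: str, complete: Complete,
                 personas: Sequence[Persona] = PIPELINE) -> str:
    """Feed each persona's output to the next persona, in order."""
    for persona in personas:
        text = complete(persona.system_prompt(), text)
    return text
```

Each persona only ever sees its own system prompt plus the previous step's output, which is the "one focused instance per task" part.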

Yeah, I think that's basically how they think of it too. They create what are almost employees, with very specialized tasks, and can hire as many as they want. It seems wild and not how we're supposed to think about using computers, but seems like how we're gonna have to start thinking about using computers.

Generally agree that the SQL stuff is sketchy at best. I hate having to build up the mental model of someone else's SQL, CTE by CTE, .SQL by .SQL. If that person is as deranged and convincing as GPT, it can only be net negative, since it's optimising for correct-looking rather than correct.

The best use cases are seemingly creative rather than correct. Perhaps early data modelling is a good opportunity here. You could describe your business and existing data and systems to a few LLMs, they each develop a data model and then discuss the benefits of each. This isn't operationally integral so not really going to get much attention, but I'm wary of much else at this stage.

Beyond that, I think it is interesting to think about ongoing investigations, i.e. "can you dig into this unexpected anomaly", but giving the bot the capability to persist that instruction as an agent over time. Bots running the legwork on the exploratory angle, and ideally sense-checking each other before presenting, would be quite interesting.

(To your analyst point: I was also a junior analyst, and I would put a lot of effort into making correct-looking work. Incentives!)

Yeah, I agree on the creative side. And up until seeing the smol thing, that was basically what I thought was the limit of what we'd be able to get out of it. (As a question-answering bot, that is. I'd imagine there are lots of small things it could help with, to clean up ambiguous questions or whatever, but mostly as conveniences rather than some revolution.)

The point about having them gut-check each other is interesting though. This is basically how that Viable product I mentioned works, where they (try to at least) keep errors from compounding by constantly having some other, lower-temperature model evaluate each thing and see if it sounds reasonable. It's not a perfect system, obviously, but it really does just mirror how people do it as a group. And apparently, it works pretty well?
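
A minimal sketch of that gut-check loop, as I understand it. It assumes a hypothetical `complete(prompt, temperature)` callable; the temperatures, the "OK" convention, and the retry count are my own placeholders, not anything Viable has described.

```python
from typing import Callable

# Hypothetical signature: (prompt, temperature) -> model reply.
Complete = Callable[[str, float], str]


def checked_step(task: str, complete: Complete, max_tries: int = 3) -> str:
    """Draft an answer, then have a lower-temperature pass sanity-check it."""
    feedback = ""
    for _ in range(max_tries):
        # Higher temperature for the draft (0.8 is an arbitrary choice).
        draft = complete(f"{task}\n{feedback}".strip(), 0.8)
        # Temperature 0.0 for the reviewer, so its verdict is more stable.
        verdict = complete(
            "Reply OK if this answer sounds reasonable for the task, "
            f"otherwise explain the problem.\nTask: {task}\nAnswer: {draft}",
            0.0,
        )
        if verdict.strip().upper().startswith("OK"):
            return draft
        # Pass the objection back so the next draft can address it.
        feedback = f"A reviewer rejected the last attempt: {verdict}"
    raise ValueError(f"no draft passed review after {max_tries} tries")
```

Running every step through something like this is what keeps one bad answer from compounding into the next step, at the cost of at least two model calls per step.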
