You can also pretend to be a customer. In other words: the Renato Rosaldo method of 'deep hanging out' (Geertz 1998)
Barely related, but I once heard a story (which I wanted to include in this, but couldn't find any actual evidence of) that some luxury vodka (I think Belvedere) got popular because the company sent a bunch of models out to bars in LA to order it. So very attractive people would walk into bars, loudly ask for a Belvedere and soda or something, and then leave when the bartender said they didn't have it. So bars were like, these are the kinds of people we need in our bar, I guess we need to start carrying Belvedere. And other customers were like, I'll have what the models are having. And then it became popular.
Or so I've been told.
The underlying premise here is that "the answer" can be found in simply passively observing or collecting the status quo (through different means and with different analysis tools). Wouldn't another option (that does directly apply outside of the save-the-pub context) be to make a list of things that you could change that *might* have a positive impact, and then make those changes carefully and observe the results: change up the music selection, change up the sound level, change up the lighting, etc.?
Through that lens, generative AI can become a useful brainstorming companion: "I have a bar that is [describe "failing bar" characteristics] and would like to make some changes to make the bar more successful. Give me a list of 20 things that are reasonable changes to try..."
That kicks into more of a scientific-method approach: cull through that list and treat it as a set of hypotheses, refine them and see what other ones the list sparks, prioritize them (possibly by applying human judgment to how quickly you would expect to see an impact and what the expense would be to try each change), figure out how you are going to determine if each one "worked," and start rolling them out.
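For what it's worth, that triage step is simple enough to sketch in a few lines. Everything below is made up for illustration: the candidate changes, the impact/cost/speed numbers, and the scoring rule are all hypothetical, just one way to encode "prefer cheap changes with fast, big payoffs":

```python
# Toy sketch: rank brainstormed changes as testable hypotheses.
# All names and numbers are invented for illustration.

def prioritize(hypotheses):
    """Sort candidate changes by expected impact per unit of cost,
    favoring ones whose results would show up quickly."""
    def score(h):
        # Higher impact, lower cost, faster feedback -> higher score.
        return h["impact"] / (h["cost"] * h["weeks_to_signal"])
    return sorted(hypotheses, key=score, reverse=True)

candidates = [
    {"change": "new music playlist", "impact": 3, "cost": 1,  "weeks_to_signal": 2},
    {"change": "dim the lighting",   "impact": 2, "cost": 1,  "weeks_to_signal": 1},
    {"change": "renovate the patio", "impact": 8, "cost": 20, "weeks_to_signal": 12},
]

for h in prioritize(candidates):
    print(h["change"])
```

Under these made-up numbers, the cheap, fast experiments (lighting, music) come out ahead of the expensive, slow one (the patio), which matches the intuition of trying quick wins first.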
What's happening with the status quo doesn't always hold the answer as to how to drive improvements (if you're concerned that your storefront is not getting sufficient foot traffic, studying the movements of the people who *are* coming to the store is like the drunk looking for his keys under the streetlight because that's where the light is, even if it's not where he dropped them).
So sure, I think that's true, though that feels a little bit like cheating, because you're creating a bunch of interventions. To me, the question was more like, "How can we diagnose the problem?" If I went to a doctor and they said they could diagnose it by trying 20 surgeries and seeing which one fixed me, I think that 1) they'd probably figure out how to fix me, but 2) I'd definitely not let them do it. So part of the premise to me is how do you solve the problem without all the surgeries.
But yeah, to your point, for sure, trying new things will probably uncover a lot more than just watching; I agree with that. But, as far as methods for diagnosing the problem goes, I'd generally say that watching people will uncover a lot more than looking at data. Which isn't to say that watching is the best. I just think it's often *better.*
That's interesting to think about it through a medical lens. The counter-perspective would be to think about how pharma companies and non-drug protocols get developed—exactly *through* interventions that are tried either as controlled experiments or, if there are logistical or ethical reasons that that's not feasible, then through longitudinal analysis, right?
Businesses are "doing stuff" all the time, so why not do stuff with more discipline, so that you're looking ahead with two goals in mind: 1) do the stuff and hope it has a positive impact, AND 2) do the stuff in a way that ensures you're able to quantify the impact with some confidence.
It's that second part that's missing—the hope that just "doing stuff" naturally introduces sufficient variation in the data that some analytical technique (if we can just find the right one!) will be able to tease out causation seems dangerous.
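To make that second goal concrete: it doesn't take heavy machinery. Here's a minimal sketch, with invented nightly-takings numbers, using the standard normal-approximation confidence interval on a difference in means; a real before/after comparison would also need to worry about seasonality and other confounders this ignores:

```python
# Toy sketch: quantify the impact of one change with some confidence.
# The revenue figures are fabricated; the math is the usual
# normal-approximation 95% CI on a difference of two sample means.
import math
import statistics as st

before = [410, 395, 430, 402, 388, 415, 420]  # nightly takings pre-change
after  = [455, 440, 470, 448, 462, 451, 439]  # nightly takings post-change

diff = st.mean(after) - st.mean(before)
se = math.sqrt(st.variance(before) / len(before) + st.variance(after) / len(after))
low, high = diff - 1.96 * se, diff + 1.96 * se

print(f"estimated lift: {diff:.1f} (95% CI roughly {low:.1f} to {high:.1f})")
```

If the whole interval sits above zero, the change plausibly helped; if it straddles zero, you've learned that "doing stuff" alone didn't generate enough signal to tell.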
I hope the pub sticks around for future posts. I absolutely agree that "watching" (and thinking... and that thinking, perhaps, can be AI-assisted) can feel anecdotal and, as such, get dismissed too quickly. I'd love to see where your mind takes a scenario where the pub is actually unsave-able—there is something structurally involved that *no* amount of looking at the data (including running a bunch of interventions) is going to find a way to get it back on track. If we went back 100 years and brought AI and ML and a wealth of detailed data to the buggy whip industry, what would the analyst have been able to bring to the CEO of Big Buggy Whip to help him turn around his business?
Yeah, that's all fair - my point definitely wasn't that interventions don't work or that observational analysis is better. But, to your later point, I think we tend to discount observation, for reasons that are similar to discounting unstructured data, which is that "observation" is hard to manage and work with. But I think that a bunch of researchers conducting interviews and perfectly aggregating their opinions would probably be more valuable than most analysis, and I'd probably say the same thing about a bunch of people observing stuff, and then aggregating up the interesting things they saw.
And I have no idea if the bar will last, but if there was ever a doomed business, it would be a bar I run. (Though I agree that your question is interesting. Like, could data have saved Kodak? We sort of implicitly argue that it could in how we sell data today - "you need it to be competitive!" - but I wonder if it actually would've.)
I really like your perspectives on data in the real world. I was wondering if you think there will be an industry shift when actualized AI tools hit the market? Easier access to "findings" can expose the failings of extrapolation without context, and also the lack of actionable steps that really make the revenue differences they are hoping for (data is def a Ponzi scheme at least 60% of the time). As someone whose bills get paid depending on the analytics cost center being in good vibes, do you think that data is still a smart thing to go into?
Thanks! And I don't think it's a great bet, to be honest. Though that's less about AI to me, and more about data having been in a bubble over the last 5-10 years, sold as this hugely transformative thing. That doesn't mean it's bad or that data teams will go away or whatever, but it seems hard for it not to regress from where it's been.
And I'd guess that AI doesn't help, though not really by replacing analysts or whatever, but by taking more air out of the room. People thought the way to solve problems and find a competitive edge was data, and thought it was going to be like a 9/10 in importance. That bubble burst some, so now it's a 6/10. But also, people now think AI is the way to do that, which will take some more energy from traditional data stuff and make it a 4/10 or something.
I appreciate your response! Yes, that checks out for sure. It's funny to see AI washing and the silver-bullet/doomsday narratives take root so strongly, with very few people digging in to find its actual usefulness. With your understanding of the industry, I was wondering what tech/tech-adjacent paths seem the most resilient to you (over 5-10 years) for someone with a general STEM background?
Maybe this is too simple of an answer, but it's hard for me to imagine engineering roles not remaining really valuable. So far, doing anything with AI takes a lot of manual work to implement, and even if AI agents get to be really good engineers, someone's going to need to make sense of what they create. Plus, whatever the latest capabilities of AI are, engineers can always build more stuff on top. So until we get to the point that every AI agent is basically indistinguishable from some brilliant human - at which point all bets are off - it seems like there will always be space for more engineers.
More information on the Observers from Fringe that nobody asked for:
"Observers were evolved humans from one possible future of mankind. In an attempt to ensure their existence and brain evolution, they used their time period's technology, which allowed them to travel through time and space. Because of that technology, they existed quite literally "outside" of time. In their own future, the world is damaged beyond repair and unsustainable. Their endgame was to rise to a position of totalitarian power in the past, which they assumed in 2015"
https://fringe.fandom.com/wiki/Observers#Endgame
Not the kind of beings you want paying any special attention to your bar. For sure.
I mean, I've been to a few bars in which time pretty much lost all meaning, so they might've already found some.
Good point.
So 1 is largely top-down/hypothesis driven and 3 is truly bottom-up/exploratory (more than any techniques today allow)? ... if 3 doesn't work or isn't possible for a while, is there any way to shift the way 1 works to be truly more bottom up? Perhaps something like AI driven anomaly detection/trend identification en mass?
Yeah, I think that's mostly right (though 2 is probably even more exploratory, actually). I'm not sure I have a good answer for how to do 1 in a bottoms-up way, to be honest. I think it either has to come from 3 (or 2), or from just, like, coming up with some guesses in your head. In theory, you could do some kind of AI "find me some insights" thing, but so far nothing like that works very well.
(One potential caveat to that is that AI might not be bad at hypothesis generation? Not based on data, necessarily, but just brainstorming. I.e., "give me 100 reasons why my bar might not be working." It's not bad at that sort of thing, tbh.)
Yeah that makes sense. I had a similar line of thinking that "AI insights" will more importantly require the ability to ask the right questions (or generate the "right" hypotheses in this example) over just being able to answer questions (the text2sql approach). I've prototyped something that takes this approach - let me know if you'd be interested to take a look?
As in, something that helps come up with ideas to explore, rather than trying to write SQL to answer questions? Yeah, that'd be cool to check out.
Right when you published this article, I saw a link to Google Research's foundational time series model: https://github.com/google-research/timesfm?tab=readme-ov-file . Here's the introduction blog: https://research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting/ , which was published around the same time. From what I understand, they are both decoder-only time series foundation models. It seems that Google Research has a few other models they are benchmarking against. Motif seems very focused on a practical analytics implementation, where Google Research, well, for now, researches. That said, more is being done in this space. Maybe smarter folks than me can explain the difference between the two approaches.
This is kinda wild. As I understand it, they've basically built a foundational model that just takes generic time series inputs and forecasts what happens next, without any knowledge whatsoever about what those time series points are. Though there are other versions of that (like this one, which was built by one of the founders of Motif, actually), there's something weirder about how the Google one was explicitly trained on a bunch of old time series. Because the rough implication there is that your time series will follow some sort of pattern that can be predicted by the patterns other, completely unrelated time series have followed. Which, I guess is true, but still seems wild.
https://facebook.github.io/prophet/
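A toy way to see why that implication isn't as wild as it sounds: a forecaster that only knows windows from unrelated series can still extend a new one, by finding its most similar stored pattern and borrowing that pattern's next step. This nearest-neighbor sketch is purely an illustration of the idea, nothing like TimesFM's actual decoder architecture, and the series in it are made up:

```python
# Toy sketch: forecast a new series using only patterns learned from
# *other*, unrelated series (a crude nearest-neighbor stand-in for the
# "trained on a bunch of old time series" idea; not TimesFM's method).
import math

def windows(series, w):
    # All (window, next value) pairs of length w from one series.
    return [(series[i:i + w], series[i + w]) for i in range(len(series) - w)]

def forecast(history, library, w=4):
    query = history[-w:]
    # Find the stored window closest (Euclidean) to our recent history.
    best_window, best_next = min(library, key=lambda p: math.dist(p[0], query))
    # Carry the matched pattern's next step over to our series' level.
    return best_next + (query[-1] - best_window[-1])

# "Unrelated" series the model has seen: a ramp and a flat line.
library = windows([1, 2, 3, 4, 5, 6, 7, 8], 4) + windows([5, 5, 5, 5, 5, 5], 4)

print(forecast([10, 11, 12, 13], library))  # matches the ramp -> predicts 14
```

The new series was never seen in "training," but its shape matches the ramp, so the ramp's continuation transfers. That transfer of shape across unrelated series is, very roughly, the bet these foundation models are making at scale.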
Talking to and watching people is also data, though. 🤔
Yeah, in a way, but not necessarily in the formal sense. I meant more like, if you watch a basketball game and say "this team is playing with a lot of energy," you'd probably come to that conclusion without ever looking at a stat sheet. There are no tables or numbers behind that; it's just a vibe thing.