A data intelligence platform doesn't turn natural language into math. It runs math on natural language.
Ahaha, a few years ago I owned a minor stake in an ice cream shop, and the people actually doing the hard work asked me to help them w/ my "data skillz". But since the other owners were ex-accountants, they could already manage their books and costs, and there was honestly nothing data-wise worth doing.
A lot of my success as an analyst probably comes from my social science background: out of my colleagues, I'm often the first to roll up my sleeves and sit down to hand-code 200+ open-ended text feedback messages in a single afternoon until my brain rots. And it's the bridging of qual+quant that people like, and stuff works out. Except a lot of people don't like the tedious work involved =\
For a moment I thought you might touch on the mystical and necessary element of belief to direct inquiry.
On a jog the metaphor of water came to me: in the Oceans of Data, there is a Sea of Information that contains Observable Truth. We can observe by putting out rain buckets (experiments) to collect water for our Data Lake. The circumstances for collecting data are an exercise in belief. In science it is the belief in a model from which we generate the hypotheses. That is an exercise in judgement. The bronze water we put into the Data Lake carries contaminant artifacts of the experiment, including elements of bias in the observer's willingness to perceive.
In the context of your story, the unstructured video data is the qualitative data, which leads us closer to the Truth, or Sea of Information, from which we can also glean quantitative data once we have some basis for a model.
And then there are the red herrings, such as Survivorship Bias (https://en.wikipedia.org/wiki/Survivorship_bias). Rather than patching and reinforcing the bullet holes on the planes that returned (instead of asking about the ones that didn't), perhaps interviewing the target demographics who aren't there would have been more useful, especially if the pool of people interviewed was small or not representative. This is baked into our belief about what can be True and our ability to observe.
Others have said this more completely and eloquently. Thank you for the article.
typo here?: "unlike quantitative observations, which can only be seen one at a time" should probably be "qualitative"
Quantitative is just aggregate qualitative. If you have a red apple and an orange you have two different qualities; but if you have different quantities of apples and oranges you can start to compare them--or at least make a nice pie chart ;-)
The encoding and "sample rate" chosen is what determines what categories (or bins) are in the survey. That's why it is important to have both open-ended surveys and validate/verify hypotheses by encoding qualitative findings into a quantitative survey.
Yes - I think lots of people would have an appetite for Gong without guardrails - more control with a SQL-like language, functions, etc. Did we just come up with a warehouse-native Gong competitor? 😁
It would be awesome to say - highlight the most unusual review - or show me a clustering around key topics. Like a k-means cluster style.
It’s also interesting to give the LLM the full context on things like this and see what it could do. Like - I’m trying to make X product better. Here is what it does now - go listen to all these interviews and make it better based on your knowledge and these interviews.
Your "dropbox" application is a very exciting application of LLMs for me. While maybe they don't use LLMs - yet - my favorite similar example of this is Gong.io. I think they do an excellent job of helping you mine useful info from recorded calls.
However - taking this straight to SQL via Databricks, BigQuery, Snowflake, etc. would be awesome. I could actually see analysts using functions like sentiment(), summarize(), etc. I hope this promotes more customer interviews and a better cycle of incorporating customer feedback into products, resulting in better products. However, I wonder what nuances these systems will miss that humans would have picked up on via manual review of interviews. Back to the Gong.io example - searching keywords and manually reviewing is my favorite Gong.io workflow, but maybe LLMs will take this to the next level.
So well said, a typical trap to fall into. Thank you for the story!
This runs into a question of categorization. Let's say you have 1000 surveys and decide to summarize. In that case you can imagine "summarize" as something like "apply a topic model to the surveys" (https://en.wikipedia.org/wiki/Topic_model). (Using this because it's literally something that already exists, so it's easier to know the strengths and flaws than if we just say "LLM" and assume it is magic.)
The problem, though, is that unless these categories are fixed in advance (in which case this is already getting pretty close to a categorical metric), the results could shift wildly if you add 200 more surveys and run the model again. The category structure can just continually veer in a wildly different direction. (And we may expect that with an LLM as well.)
And then, you may start to have natural questions if there are cohort differences. So, you might say "summarize (first 1000)" & "summarize (last 200)", and then that will provide a 3rd type of summary. And it becomes a judgment call whether you need 1, 2, or 3 summaries to really make sense of the problem.
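To make that concrete, here's a toy sketch in Python. The summarize function below is a deliberately naive stand-in for a topic model or LLM (just top-word counts), and the survey corpus is invented, but it shows how "summarize (first 1000)", "summarize (last 200)", and "summarize (everything)" give genuinely different answers - and how the combined summary can bury a new cohort entirely:

```python
from collections import Counter

def summarize(surveys, k=3):
    """Toy stand-in for a topic model / LLM summary:
    the k most common words across a batch of surveys."""
    counts = Counter(w for s in surveys for w in s.lower().split())
    return [word for word, _ in counts.most_common(k)]

# Hypothetical corpus: the first 1000 surveys are about pricing,
# the last 200 (say, after a buggy release) are about crashes.
first_1000 = ["pricing too high"] * 600 + ["great pricing tier"] * 400
last_200 = ["app crashes login"] * 200

print(summarize(first_1000))              # pricing-dominated summary
print(summarize(last_200))                # crash-dominated summary
print(summarize(first_1000 + last_200))   # "crashes" never surfaces at all
```

Whether you report one, two, or all three of those summaries is exactly the judgment call described above - no aggregate function can make it for you.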
And hopefully this clarifies where the problem goes. The problem with "Data" has never really been tools or numbers, but complexity management. The analyst is SUPPOSED to help the organization manage cognitive load by organizing information hierarchically and highlighting it in the right contexts. (Whether this happens appropriately or not is a separate question.)
And if you've ever presented a data-backed story to an executive, one of the pain points that really comes up is that this person needs you to pre-process the complexity for them - to act as a tour guide for how to approach the world. And I don't think this is new; even the forerunners of analytics, like the consultants and industrial engineers, were just digging into a finer grain of problem to help optimize a situation.
This is a lovely shift of perspective. Technically, I think we can solve it with just a few lines of code, up to a reasonably sized text input. Let's assume you found a way to dump your raw text into a data warehouse; then you can define a user-defined function that calls an LLM for a summary. (This is supported by most major data warehouses.)
Then write a query like this:
select d1, d2, user_defined_summarize_function(array_agg(text_column)) as summary_column
from feedback -- wherever the raw text landed
group by d1, d2
If this needed to scale, summarize the summaries by gradually increasing the aggregation level, e.g. via a few CTEs. This could look as follows:
with summaries_3d as (
  select d1, d2, d3, user_defined_summarize_function(array_agg(text_column)) as summary_column
  from feedback
  group by d1, d2, d3
)
select d1, user_defined_summarize_function(array_agg(summary_column)) as summary_column
from summaries_3d
group by d1
Of course this does not feel as native as LLM-based aggregate functions, but I would assume and hope it's just a matter of time until those become available.
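For what it's worth, the same map-reduce shape can be sketched outside the warehouse in plain Python. Everything here is invented for illustration - llm_summarize is a stub standing in for the UDF's real LLM call, and the rows mimic a (d1, d2, d3, text_column) table:

```python
from itertools import groupby
from operator import itemgetter

def llm_summarize(texts):
    # Stub for the warehouse UDF's LLM call: just concatenate.
    return " | ".join(texts)

rows = [  # (d1, d2, d3, text_column)
    ("US", "web", "v1", "slow checkout"),
    ("US", "web", "v2", "love the redesign"),
    ("US", "app", "v1", "crashes on login"),
    ("EU", "web", "v1", "pricing unclear"),
]

# The "summaries_3d" CTE: one summary per (d1, d2, d3) group.
rows.sort(key=itemgetter(0, 1, 2))  # groupby needs sorted input
summaries_3d = [
    (d1, d2, d3, llm_summarize([r[3] for r in grp]))
    for (d1, d2, d3), grp in groupby(rows, key=itemgetter(0, 1, 2))
]

# The outer query: re-summarize the summaries, grouped by d1 alone.
final = {
    d1: llm_summarize([s[3] for s in grp])
    for d1, grp in groupby(summaries_3d, key=itemgetter(0))
}
print(final)
```

The nice property of this shape is that each LLM call only ever sees one group's worth of text, so the hierarchy of summaries can keep every call under the model's context limit.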
Yes in retrospect, I've learned Halloween is for kids
“Michigan University”? As an MSU grad, I like it.
They are doing this against saved verbal conversations, at scale, developing their own algorithms to interpret them, then feeding suggestions for improvement back to clients. I loved what their CMO said: "Surveys only capture those who love you or hate you, not much in between."
PS. Running a bar is starting to sound a lot more satisfying than doing data work.
Really enjoyed this one. If AI can eventually do that, it would be really valuable.
🔥 the Daniel Plainview reference 🔥
I dressed as him once for Halloween, caught a few fans with the bloody bowling pin