I also sometimes wonder whether the increased reliance on consultancy firms is a net benefit. I'm from one myself, and I'm not speaking against our own company. We do good work and our clients see results. But as you are questioning how data teams work and thinking about how to improve this, it's worth taking into consideration. Especially now that tooling has become affordable enough for smaller and mid-tier companies to set up their own data stack.
As consulting analysts and analytics engineers, the greatest perk is that you get to work on many different problems. You can learn fast and re-apply what you learn across accounts. If you're waiting on access or approval for one account, you can switch to a different one and keep going. Another benefit is that you (often) work with a larger group of varied talents that you can rely on as a sounding board or for direct support. For some companies it would not be possible to hire the same expertise full-time. And given the average tenure of people in the space, is investment in in-house staff even warranted?
There is, however, a loss for companies as well. Does working with external teams allow for strategic focus? Long-term thinking? How does it impact focus on the most important problems? How does it impact the feedback loops and iteration that are so vital for developing healthy data products? We've helped multiple companies hire and train their own teams during engagements. But striking the right balance is no easy feat, and it should be part of the discussion as we look to improve as a field.
I took over managing Yammer's Vertica cluster in 2013, and I figured out why nodes were going down. The spread daemon, which is basically the control plane for the cluster, was competing with queries for network and CPU, and a few seconds of starvation during a heavy query would trigger a partition event. I found that nice'ing spreadd up to realtime priority eliminated node failures.
Vertica originally recommended separate switches for this traffic, and this type of hiccup made it hard to simply switch to cloud or generic servers with one network interface. The cloud at that time also lacked data-intensive instance types -- I used to test Azure instances every few months to see if they could even get near our I/O specs, and I never found one. It took years for cloud data servers to become available.
Maybe I'm a little grizzled, but the problem I'm having nowadays is how disappointing the new offerings are. I've always gravitated towards high-value, performance-critical stuff like analytics-based web apps, and I still haven't found anything that beats Vertica (which is now available as SaaS, by the way).
I tend to agree with most of the points. I could draw parallels to the "Analytics / AI / ML" world too. Essentially, people are becoming more tool-centric and attribute their success (or failure) to the tool's features (or lack of them). When we say "data science", we often overlook the "science" part. Usually, it starts with a good hypothesis or with asking the right question. That is an art which is badly missing these days. I think the last paragraph of your article captures the importance of asking the right questions.
Chuck, a soccer buddy of mine, was a professional still photographer working in the movie industry 20 years ago. When digital photography came out he scoffed and said that a better word processor doesn’t make you a poet so the industry would always need experts like him. Then the consumption of photos moved suddenly from glossy print magazine ads and 27 x 40 inch posters to 468 x 60 pixel online ads. No poetry needed in 468 x 60 pixels. Chuck lost his photographer role and instead became an expert in organizing thousands of digital photos and managing the distribution of ads to online networks.
Caveat: I’m an evangelist for a data/AI/machine learning platform vendor. (If the following is too salesish let me know.) IMHO some data teams’ impact hasn’t changed much because they’re still doing old tasks that are now less valuable, like taking still photos. The cheese moved. Instead of central data teams being “honorary members of the marketing and customer success leadership,” those departments often now have actual team members who are data experts. A new, valuable role for the data team is to empower those data experts at the edges of the business by providing them with powerful, self-service platforms and best practices.
We have hundreds of customer success stories to back it up, including Standard Chartered Bank’s 3,000% productivity increase, a multinational telecom’s 16,000% productivity increase, and Unilever’s data-driven evaluation of new product ideas that’s 100 million times faster than their previous method.
Hi Ben!! Maybe the perceived value is a function of investment as well. Years back, the cost of data infrastructure was 4-6x the cost of data professionals' salaries; now it's a fraction of it. Back then it was more about "let's respect these hard-to-find folks who can get some value out of the large investment we have made", versus today's "we have invested so much in these folks, they'd better deliver X times".
I think so generally.
"Why has data technology advanced so much further than value a data team provides?" We need to stop using the term "data team" -- instead use "team".
Having lived through the era you describe, I'd say there were more generalists in the field because they had to be, in order to juggle the tooling at their disposal. So far it appears to be a zero-sum game between tool advancement and skill capacity.
Welcome back. I’m glad you survived the election and the Twitter apocalypse. :-)