Analytics commoditised
We're entering a key period of progress in data access and interoperability
Until recently, I thought there were only three ways to work in data: internally as part of a data team, externally as a consultant, or as a software provider.
What if there were a fourth way? Consider a two-sided marketplace model like Uber, Lyft or Deliveroo: instead of drivers and customers, the two sides could be data professionals and data consumers (this already happens to a degree with Upwork). The reason this works today in those industries is that the full context for a task can be passed from one side of the marketplace to the other: the driver knows the passenger wants to go from A to B at a specific time, and is given navigational instructions to collect the passenger and drive to the destination.

As it stands today, this is not possible for analytics. There is simply too much context, and it takes a long time to acquire; even for experienced data professionals, it often takes a month to understand an organisation's data processes and structures and how they relate to its operations.
This complexity is why there are so many MDS consultancies, and why engagements are long and require ongoing support. I think the complexity of the data itself, much more than any complexity around an unbundled data stack, is the reason these specialists are needed. I'm certain that, at least for the rest of my career, there will continue to be a need for these consultancies, as there will always be orgs that can't manage data internally. There will also be orgs that focus on their core service or product and outsource everything else, including Legal, Finance, Tech and Data; this is a valid operating model that many orgs globally use successfully.
Standardisation in Data Engineering has made good progress, with lots of money invested in the space to help it happen. Analytics engineering is now an accepted discipline with well-defined tooling, but standardisation of how metrics and entities are defined has not yet been widely adopted - maybe 2022 is the start of this penultimate leg of the data relay race.
Does the advent of metrics layers, on top of well-defined data models in a standard format, drive a paradigm shift? What you get with a metrics layer is similar to a self-documenting API: rather than having to figure out how to calculate any given metric for an organisation, it is already defined up front. If all or most organisations adopt metrics layers of a similar format, whether created through analytics engineering by internal or consultancy data professionals, then interacting with them becomes easier and requires less context gathering.
If there is even further standardisation, including common definitions for the same metric across organisations (Gross Merchandise Value (GMV), Orders, Lead Generation Rate (LGR), Session Conversion Rate…), then experienced professionals in Growth, Product, Revenue, Finance or any other data domain could pick up any organisation's data and apply their skills. Metrics layers can (or already do) allow you to apply organisation-specific filters to metrics, e.g. Orders excludes test orders and Revenue excludes chargebacks. By abstracting the data model, this could also allow more non-technical people to "work like an analyst": they would only need to know how to make simple metrics layer queries, more akin to writing an Excel formula. Instead of arguing about how a metric was calculated, we could actually make decisions or have deeper research conducted.
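To make the idea concrete, here is a minimal sketch in Python (the data, names and structure are entirely illustrative; real metrics layers expose SQL- or API-based interfaces instead). The point is that the organisation-specific filters are baked into the metric definitions once, so a consumer can ask for "orders" or "revenue" without knowing the underlying rules:

```python
# Toy order records standing in for a warehouse table.
ORDERS = [
    {"id": 1, "amount": 100.0, "is_test": False, "chargeback": False},
    {"id": 2, "amount": 250.0, "is_test": False, "chargeback": True},
    {"id": 3, "amount": 40.0,  "is_test": True,  "chargeback": False},
]

# Metric definitions: name -> (row filter, aggregation).
# The org's rules live here, centrally, not in each consumer's head.
METRICS = {
    # "Orders doesn't include test" is part of the definition.
    "orders": (lambda r: not r["is_test"],
               lambda rows: len(rows)),
    # "Revenue doesn't include chargebacks" likewise.
    "revenue": (lambda r: not r["is_test"] and not r["chargeback"],
                lambda rows: sum(r["amount"] for r in rows)),
}

def query(metric: str, rows=ORDERS):
    """Evaluate a named metric; the caller never sees the filter logic."""
    keep, aggregate = METRICS[metric]
    return aggregate([r for r in rows if keep(r)])

print(query("orders"))   # 2: the test order is excluded by definition
print(query("revenue"))  # 100.0: test order and chargeback excluded
```

The consumer-facing call, `query("revenue")`, is about as simple as an Excel formula; all the contested logic sits in one agreed place.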
Just as it enables data professionals to more easily operate, it also allows for better integration with any tool that uses or displays metrics and their definitions:
analytical research - Hex
data app - Streamlit
ML platform - Continual
self-serve metrics - Lightdash
discovery - Atlan
business observability - Avora
With a hosted way to access the metrics layer, thus abstracting data access credentials, these tools could use pre-defined metrics in seconds. It could be like an app marketplace based on the metrics layer.
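A rough sketch of what that integration could look like: if a metric definition lives in a common serialised format, any of the tools above could consume the same definition without bespoke context gathering. The schema below is purely hypothetical, not any vendor's actual format:

```python
import json

# A hypothetical shared metric definition in a common serialised format.
# Field names are illustrative only.
metric_spec = json.loads("""
{
  "name": "session_conversion_rate",
  "label": "Session Conversion Rate",
  "type": "ratio",
  "numerator": "converted_sessions",
  "denominator": "total_sessions",
  "format": "percent"
}
""")

def describe(spec):
    """What a discovery/documentation tool might render from the spec."""
    return f"{spec['label']}: {spec['numerator']} / {spec['denominator']}"

def evaluate(spec, values):
    """What a BI or observability tool might compute from warehouse aggregates."""
    return values[spec["numerator"]] / values[spec["denominator"]]

print(describe(metric_spec))
print(evaluate(metric_spec, {"converted_sessions": 30, "total_sessions": 1200}))  # 0.025
```

Two very different "apps" consume one definition here; that shared contract is what would make a metrics-layer app marketplace possible.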
There is still a lot of work needed to get us to this place, but I think the wave is big enough to push us there. A dichotomy will emerge between orgs that have adopted a metrics layer and those that haven't: the former will have faster, cheaper access to professionals and apps, and will produce higher-quality output sooner; the latter will struggle to manage, disseminate and use data as effectively.
The two-sided marketplace is a great analogy. I know that when I switched jobs, being able to "plug in" and have opinions about the new data platform I was working in was a game changer, and it was only because they had implemented dbt. You might be onto something that the metrics layer could lead to even more pluggability.