The metrics are inconsistent... again!
Showing consistent and correct numbers for important KPIs should be a solved problem by now. Yet it is surprisingly difficult to get right in the modern data stack once your team has grown past the point where one pizza 🍕 feeds everyone at lunch.
To achieve metric consistency at scale, Airbnb, a company with one of the strongest analytics organizations in the industry, invented the metric store. They call it Minerva.
Why do we need consistent metrics?
Decision-makers often have access to several sources of reports and analyses. If the numbers don’t match, they lose trust in the data and gain an excuse not to make data-driven decisions.
People start to accept that numbers don’t match. Our much-coveted data culture goes down the drain. Important data quality errors hide behind the excuse that the number is probably just different because the formula is different.
How can decision-makers trust our cool black-box AI if we cannot even show the same revenue numbers across two Tableau reports?
What is currently done?
Today, metrics live in three places:
- BI tools (e.g. Tableau)
- Data warehouses (Snowflake, of course 🙂)
- Operational systems (e.g. Salesforce)
1. Everything, so Nothing
Teams that have not established a clear analytics process often end up with metric definitions scattered across all three, and none of the numbers agree.
This situation often arises when teams adopt too much self-service too early.
2. Pre-computed Results
A first iteration toward agreeing on at least the most important KPIs is persisting metrics as tables. This helps, especially if there is also a process around the tables that ensures they are documented, tested, and discoverable by all stakeholders.
One big limitation of this approach is that we need to precompute every dimensional aggregation in advance. Revenue and similar metrics like Revenue Retention Rate can share one table for financial metrics, but only along one static dimension, e.g. per month. To look at revenue per week, per product, per country, or by any combination of these, a lot of tables or views need to be created.
The number of pre-computed tables you need to cover every combination of dimensions grows exponentially: 2^n - 1 non-empty subsets for n dimensions.
What a sentence... Let’s look at an example.
To cover the dimensions month, territory and product, you need 2^3 - 1 = 7 persisted tables.
Add year as a fourth dimension and you need 2^4 - 1 = 15 persisted tables.
This exponential growth means the approach is always limited to a couple of dimensions and a few selected metrics.
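To make the combinatorics concrete, here is a small Python sketch (the dimension names are just illustrative) that enumerates every grouping combination a pre-computed approach would have to persist:

```python
from itertools import combinations

dimensions = ["month", "territory", "product"]  # illustrative dimension names

# Every non-empty subset of dimensions is one table or view we would have to persist.
groupings = [
    combo
    for size in range(1, len(dimensions) + 1)
    for combo in combinations(dimensions, size)
]

for combo in groupings:
    print("revenue_by_" + "_".join(combo))

print(len(groupings))  # 2^3 - 1 = 7; a fourth dimension already pushes this to 15
```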
There is also the recently hotly debated OLAP cube model, which unrolls and persists all the dimensional combinations in one big, long table with an implicit hierarchy in the resulting columns. This has the same limitations as splitting the results into several tables, only that this time the table gets even harder to understand and consume.
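For illustration, here is a rough sketch of what such an unrolled table looks like, using pandas and made-up column names and numbers. Dimensions that were rolled up in a given row simply show up as empty cells:

```python
from itertools import combinations

import pandas as pd

# Toy fact table; all names and values are made up for illustration.
facts = pd.DataFrame({
    "month":     ["2023-01", "2023-01", "2023-02"],
    "territory": ["EMEA", "NA", "EMEA"],
    "product":   ["basic", "pro", "pro"],
    "revenue":   [100, 250, 180],
})

dimensions = ["month", "territory", "product"]

# One aggregate per dimension combination, stacked into a single long table.
pieces = [
    facts.groupby(list(combo), as_index=False)["revenue"].sum()
    for size in range(1, len(dimensions) + 1)
    for combo in combinations(dimensions, size)
]

# Rolled-up dimensions appear as NaN: the implicit hierarchy mentioned above.
cube = pd.concat(pieces, ignore_index=True)[dimensions + ["revenue"]]
print(cube)
```

Even on this toy data the result mixes several different grains in one schema, which is exactly what makes it hard to consume.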
3. One tool, one Truth
BI tools have had to deal with this challenge for a long time. People don’t just want to look at one dimension at a time; they want to dynamically slice and dice the data.
The solution is calculated fields in Tableau, calculated columns in Power BI, and LookML in Looker.
This approach works well as long as a metric is only reported in one workbook or report. Sharing these computed fields across workbooks or teams is hard and often simply does not happen.
There is also another big disadvantage: your metrics are not accessible to other consumers, such as data scientists working in Jupyter notebooks. You can’t even use them for further transformations in your data warehouse.
What is a metric store?
If it quacks like a duck, it is a duck. 🦆 So what does a metric store do?
1. A metric store treats the calculations on top of data as managed assets.
A computed field with important business logic should not live hidden in some Tableau workbook. It should be discoverable, documented, shareable, and clearly defined.
2. A metric store enables all consumers to dynamically segment a metric in different dimensions in a declarative way.
How the metric is aggregated or sliced is handled by the store, so the consumer can simply request “revenue by month, territory” and get the results. Ideally, this interface is SQL, so any tool that can connect to a SQL database can also connect to a metric store.
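As a rough illustration of both points, here is a minimal Python sketch. Every name in it (the Metric class, the table and column names, the compile_query helper) is hypothetical and only meant to show the idea of one managed metric definition serving many declaratively requested slices:

```python
from dataclasses import dataclass

@dataclass
class Metric:
    """A metric as a managed asset: defined once, documented, reusable."""
    name: str
    description: str
    source_table: str
    expression: str                 # the aggregation that defines the metric
    allowed_dimensions: list[str]

revenue = Metric(
    name="revenue",
    description="Recognized revenue from the orders fact table.",
    source_table="analytics.orders",
    expression="SUM(amount)",
    allowed_dimensions=["month", "territory", "product"],
)

def compile_query(metric: Metric, dimensions: list[str]) -> str:
    """Turn a declarative request like 'revenue by month, territory' into SQL."""
    unknown = set(dimensions) - set(metric.allowed_dimensions)
    if unknown:
        raise ValueError(f"Unknown dimensions: {unknown}")
    cols = ", ".join(dimensions)
    return (
        f"SELECT {cols}, {metric.expression} AS {metric.name}\n"
        f"FROM {metric.source_table}\n"
        f"GROUP BY {cols}"
    )

# "revenue by month, territory" -- the consumer never sees the SQL mechanics.
print(compile_query(revenue, ["month", "territory"]))
```

A real metric store layers much more on top of this (joins, filters, time grains, access control), but the core idea is the same: one definition, many slices.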
The metric store lives at the intersection of transformed data and data consumers. Other companies attempt to build the metric store as a headless BI tool. We believe that is a rough path to take.
At Snowboard, we are building a solution that lives inside your Snowflake data cloud. You keep the power of Snowflake when it comes to compute and performance, but also its governance: one rights and roles management system. Maintaining one of those is hard enough.
To learn more, reach out to us! 🙂