Intro to Bigeye Collections
How do you manage data quality metrics at enterprise scale? In this post, we'll walk through it.
As your data observability operation grows in size and sophistication, manually managing individual checks stops being practical, or even feasible. Bigeye is an enterprise-ready data observability platform with proven ways to scale up the scope of monitoring and to increase the number of teammates who can contribute to the data engineering practice. In this post, we want to highlight Bigeye Collections (formerly SLAs): an efficient and powerful way to manage data quality metrics at enterprise scale.
Bigeye Collections provide three main capabilities:
1. Organize related metrics
2. Get a quick summary of collection performance and health
3. Consolidate and route notifications to the right people
Let’s dive deeper into those specific benefits:
Organizing related metrics
You could be a “full-stack data engineer” or a small team responsible for a small number of data pipelines. Whatever the case may be, when you initially deploy Bigeye, you’ll probably find that every deployed metric is relevant to you, and the default global view is sufficient.
But as your team and use cases grow, you could be responsible for monitoring hundreds, thousands, or even tens of thousands of tables and metrics. At this scale, trying to manage and make sense of everything is impractical and overwhelming.
To solve this problem, Bigeye Collections let you gather and organize related metrics. This simple act produces powerful results: it creates context and helps your team focus on the job at hand, including triaging the issues that are relevant to them.
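To make the idea of grouping concrete, here is a minimal sketch in plain Python. It does not use the Bigeye API; the `Metric` class, the `collection` field, and the example names are all hypothetical and exist only to illustrate how a flat list of metrics becomes easier to reason about once it is grouped.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """Hypothetical stand-in for a deployed data quality metric."""
    name: str        # what is measured, e.g. "freshness"
    table: str       # the table the metric watches
    collection: str  # the group this metric is assigned to (assumed field)


def group_by_collection(metrics):
    """Return a dict mapping each collection name to its list of metrics."""
    grouped = defaultdict(list)
    for metric in metrics:
        grouped[metric.collection].append(metric)
    return dict(grouped)


# A flat, global list of metrics: workable when small, overwhelming at scale.
metrics = [
    Metric("freshness", "orders", "Finance reporting"),
    Metric("null_rate", "invoices", "Finance reporting"),
    Metric("row_count", "clickstream", "Marketing analytics"),
]

collections = group_by_collection(metrics)
for name, members in sorted(collections.items()):
    print(f"{name}: {len(members)} metric(s)")
```

Each team can then look only at the group it owns instead of scanning the full global list.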
Presenting a summary
The Collection list view gives you a quick visual and descriptive summary of the status of the metrics in each Collection. From there, you can see how many metrics are alerting, which gives you an upper bound on the number of open issues to expect, and easily determine whether a Collection is healthy or needs attention.
If you are a data producer responsible for operations, you can get detailed per-metric notifications. For data consumers who just need a traffic-light-level status, you can send summary notifications instead. You can also drill down to view the issues on individual metrics and take action on them.
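The traffic-light summary described above can be sketched in a few lines of Python. This is an illustration only, not how Bigeye computes Collection health; the `MetricStatus` class, the `summarize` function, and the red/green rule are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MetricStatus:
    """Hypothetical record of whether one metric is currently alerting."""
    name: str
    alerting: bool


def summarize(collection_name, statuses):
    """Roll per-metric statuses up into one collection-level summary.

    Assumed rule: any alerting metric turns the whole collection "red".
    """
    alerting = [s.name for s in statuses if s.alerting]
    return {
        "collection": collection_name,
        "total_metrics": len(statuses),
        "alerting_metrics": alerting,  # the detail a producer drills into
        "status": "red" if alerting else "green",  # what a consumer sees
    }


statuses = [
    MetricStatus("orders.freshness", alerting=False),
    MetricStatus("invoices.null_rate", alerting=True),
    MetricStatus("invoices.row_count", alerting=False),
]

summary = summarize("Finance reporting", statuses)
print(summary["status"], summary["alerting_metrics"])
```

The same summary serves both audiences: consumers read only the `status` field, while producers use the list of alerting metrics as the starting point for triage.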
Routing notifications
You can cut down on toil by setting up notifications on your Collections. Instead of configuring a notification on each metric, you create a Collection and have all the metrics associated with it send notifications to the same Slack channel, email address, or webhook.
With this simple capability, you can control which Collections send notifications to which channels. This routing helps you manage different teams, different priorities, and different facets of a data pipeline. We’ll discuss the details in future blog posts.
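As a rough sketch of collection-level routing, the Python below keeps one routing table keyed by collection rather than one notification setting per metric. Nothing here calls Bigeye, Slack, or an email service; the `Alert` class, the `ROUTES` table, the destination strings, and the `dispatch` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert raised by one metric in one collection."""
    metric: str
    collection: str


# One routing entry per collection instead of one per metric (assumed layout).
ROUTES = {
    "Finance reporting": ["slack:#finance-data", "email:finance-oncall@example.com"],
    "Marketing analytics": ["slack:#marketing-data"],
}


def dispatch(alerts, routes):
    """Group alerting metric names by the destination that should hear about them."""
    outbox = {}
    for alert in alerts:
        # Alerts from collections with no route are simply not delivered here.
        for destination in routes.get(alert.collection, []):
            outbox.setdefault(destination, []).append(alert.metric)
    return outbox


alerts = [
    Alert("invoices.null_rate", "Finance reporting"),
    Alert("clickstream.row_count", "Marketing analytics"),
    Alert("orders.freshness", "Finance reporting"),
]

outbox = dispatch(alerts, ROUTES)
for destination, metric_names in outbox.items():
    print(destination, "<-", metric_names)
```

Adding a metric to a collection requires no routing change in this design, which is where the savings in toil come from.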
Summary
Bigeye Collections are a key feature that helps you organize and scale as your data, your data team, and your data consumers grow. Customers using Collections can handle one or two orders of magnitude more data, metrics, and teams.