Bigeye's State of Data Quality Report
We're pleased to announce the results of our 2023 State of Data Quality survey. Findings underscore the need for automation and better communication between data producers and consumers. Check it out!
The survey, compiled by Bigeye, had 100 respondents. At least 63 came from mid-to-large cloud data warehouse customers (those spending more than $500k per year) who have some form of data monitoring in place, whether third-party or built in-house.
Data engineers, software engineers, and data analysts are typically the ones responsible for data issues. Despite their efforts, issues most commonly take one to two days to spot and fix, and some drag on for weeks or even months. More than half of the respondents experienced five or more data issues in the last three months.
First line of defense against data issues
Our survey found that data engineers are the first line of defense in managing data issues, followed closely by software engineers. The role of data engineer has moved closer to that of software engineer. Like software engineers, data engineers are in charge of a product, the data product, that increasingly demands software-like levels of process, maintenance, and code review.
Desire for automation
Respondents who used third-party data monitoring solutions reported roughly 2x to 3x the ROI of in-house solutions. They also noted that, at full utilization, third-party data monitoring addressed two problems: fractured infrastructure and anomalous data. They further reported that third-party data monitoring solutions had better test libraries and a broader perspective on data problems.
Data incident frequency
The research revealed that companies experience a median of five to ten data incidents over a three-month period. These incidents range from those severe enough to hit the company's bottom line to those that (merely!) reduce engineer productivity. On average, an incident takes 48 hours to troubleshoot.
Organizations with more than five data incidents a month are essentially lurching from incident to incident, with little ability to trust data or invest in larger data infrastructure projects. Their data quality work is largely reactive rather than proactive.
Other important findings from the survey
The survey results surfaced other notable insights, including:
- Respondents told us it takes 37,500 man-hours to build an in-house data quality monitoring solution
- That equates to roughly one year of work for 20 engineers
- 70% of respondents reported at least two data incidents that diminished the productivity of their teams
- Data issues most commonly take ~1-2 days to spot and fix, but with a long tail lasting up to weeks and months
- A share of respondents reported at least two “severe” data incidents in the last six months, incidents that damaged the business or bottom line and were visible at the C-level
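
As a quick sanity check on the staffing figure in the list above, here is a minimal Python sketch that converts 37,500 man-hours into engineer-years for a team of 20. The 40-hour week and 47 working weeks per year are our own illustrative assumptions, not numbers from the survey.

```python
# Back-of-the-envelope check of the in-house build estimate.
TOTAL_MAN_HOURS = 37_500  # reported by survey respondents
ENGINEERS = 20            # team size quoted alongside the estimate

# Assumptions (NOT from the survey): a full-time engineer works
# 40 hours a week for about 47 weeks a year.
HOURS_PER_WEEK = 40
WORKING_WEEKS_PER_YEAR = 47

hours_per_engineer = TOTAL_MAN_HOURS / ENGINEERS
hours_per_engineer_year = HOURS_PER_WEEK * WORKING_WEEKS_PER_YEAR
years_per_engineer = hours_per_engineer / hours_per_engineer_year

print(f"Hours per engineer: {hours_per_engineer:,.0f}")  # 1,875
print(f"Years per engineer: {years_per_engineer:.2f}")   # 1.00
```

Under those assumptions each engineer contributes about 1,875 hours, which lines up with the "one year of work for 20 engineers" figure. A different working-year assumption would shift the result slightly.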
Check out the full report for a complete write-up of these findings.
To learn more about ramping up data quality initiatives in your organization, fill out Bigeye's demo form to talk to us and take Bigeye for a spin.