Manu Bansal is the co-founder and CEO of Lightup Data, developers of a breakthrough data quality monitoring solution.
As businesses increasingly adopt data-driven systems, processes and strategies, and as leaders seek to "Moneyball" everything from hiring to software development, the importance of data quality has never been higher. Yet with data volumes steadily rising, maintaining the quality of the underlying data that drives decisions is a growing challenge.
Fast-growing technologies such as artificial intelligence (AI), machine learning (ML) and Internet of Things (IoT) now add massive volumes of data to enterprise systems every minute of every day.
IDC forecasts that the amount of data “created, captured, copied and consumed in the world” will continue to climb at a breakneck pace. The market-research firm estimates that the amount of data created over the next three years will be more than all the data created over the past 30 years, and the world will create more than three times the data over the next five years than it did in the previous five.
The High Cost Of Low-Quality Data
Business leaders are starting to notice the impact of bad data on their bottom lines. According to a survey by research firm Gartner, “organizations believe poor data quality to be responsible for an average of $15 million per year in losses.” Gartner also found that nearly 60% of those surveyed didn’t know how much bad data costs their businesses because they don’t measure it in the first place.
A 2016 study by IBM is even more eye-popping. IBM found that poor data quality strips $3.1 trillion from the U.S. economy annually due to lower productivity, system outages and higher maintenance costs, to name only a handful of the bad outcomes that result from poor data quality.
Similarly, Forrester Research has found that the persistence of low-quality data throughout enterprise systems robs business leaders of productivity, as they must continuously vet data to ensure it remains accurate. Forrester also found that “less than 0.5% of all data is ever analyzed and used” and estimates that if the typical Fortune 1000 business were able to increase data accessibility by just 10%, it would generate more than $65 million in additional net income.
A word of caution: There are many reasons data utilization may be low. A major one is low quality. A recent study on the lack of trust around data from research firm Vanson Bourne (commissioned by SnapLogic) found that 91% of IT decision-makers believe they need to improve the quality of data in their organizations, while 77% said they lack trust in their organization’s business data.
Increasing data accessibility could well benefit Fortune 1000 businesses in theory. In practice, data access alone won’t provide the lift.
Mistake Fares, Algorithmic Bias And Flawed Credit Scores
In the airline industry, bad data causes problems so frequently that travel experts scour booking sites for "mistake fares." The bad data that triggers mistake fares comes from a variety of sources, making a recurring pattern look like a series of disconnected, one-off accidents. Human error, currency miscalculations and software glitches have all generated mistake fares, presenting airlines with a Catch-22: They can either lose revenue by honoring the mistake fares, or they can generate bad PR when they don't.
A more pervasive example of how bad data can affect business, one that potentially touches every consumer, is credit scores. If the data fed into credit-scoring algorithms is flawed, problems can pile up quickly. Financial companies now rely on big data tools that vacuum up information from retail purchases, travel habits and even social media. Much of the data is purchased from data brokers that operate with little oversight and no incentive to maintain quality data. Instead, these brokers are in a volume business.
Potential borrowers labeled as “risky” based on flawed data could find that “algorithmic bias” sticks to them. With little transparency into how those determinations are made, consumers have no easy way to fix the problem. When bad data is amplified within these systems, consumers may be pushed into a financial death spiral, with no access to credit and other financial services but also no understanding of why that is. As a result, those blacklisted may be denied mortgages, auto loans and even apartment rentals — all because of bad data.
How To Ensure Data Quality
Here are three critical steps to help ensure your organization can avoid these bad data issues:
1. Pay attention to larger patterns.
When bad data hurts your organization, it is important not to assume that this is just an isolated, one-time event. Instead, business leaders should zoom out to scrutinize the larger patterns in play.
Find out whether your organization has the tools in place to monitor and measure data quality. Otherwise, your next bad data problem could already be lurking in your systems.
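To make this concrete, here is a minimal sketch of the kind of check that data quality monitoring tools automate. The function names (`null_rate`, `check_quality`), the example records and the 5% threshold are illustrative assumptions, not part of any particular product:

```python
def null_rate(records, field):
    """Fraction of records where `field` is missing or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)

def check_quality(records, field, max_null_rate=0.05):
    """Simple threshold check: pass only if the null rate stays
    at or below max_null_rate. Returns (passed, observed_rate)."""
    rate = null_rate(records, field)
    return rate <= max_null_rate, rate

# Hypothetical pricing records with one missing value
rows = [{"price": 10.0}, {"price": None}, {"price": 12.5}, {"price": 11.0}]
passed, rate = check_quality(rows, "price")  # 25% nulls fails a 5% threshold
```

Real monitoring tools run hundreds of such checks continuously across tables and pipelines; the point is that quality has to be measured before it can be managed.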
2. Move on from legacy tools that can’t keep pace with modern problems.
In each bad-data-driven outcome described above, the affected companies most likely had standard IT and application performance management (APM) tools in place, yet the bad data slipped past them: no monitoring tool detected any degradation in infrastructure or applications, because nothing in the infrastructure had degraded.
To proactively mitigate the bad data problem, businesses require modern data management tools that provide visibility into the entire data lifecycle, from creation all the way to presentation on end-user devices — and back.
3. Treat your data stack as critical infrastructure.
As businesses become ever-more data-driven, quality data has become a mission-critical asset. Treat it as such.
That may mean adopting modern data architectures, such as ELT-based ones, or it could mean adopting technologies like data pipelines, warehouses and lakes. It especially means finding data management tools that are capable of monitoring data across all of your assets, while ideally using AI and/or ML to find problems automatically, without human intervention.
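One common building block behind automated anomaly detection on data metrics is a simple statistical outlier test: compare today's value of a metric (row count, null rate, freshness) against its recent history. This is a hedged sketch of that idea using a z-score; the function name, the three-standard-deviation threshold and the sample numbers are illustrative assumptions:

```python
import statistics

def is_anomalous(history, latest, z_threshold=3.0):
    """Flag `latest` as anomalous if it deviates from the mean of
    `history` by more than z_threshold standard deviations."""
    if len(history) < 2:
        return False  # not enough history to judge
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean  # flat history: any change is suspect
    return abs(latest - mean) / stdev > z_threshold

# Daily row counts for a hypothetical table
counts = [100, 102, 98, 101, 99]
is_anomalous(counts, 500)  # a sudden 5x jump is flagged
is_anomalous(counts, 100)  # a typical value is not
```

Production tools layer seasonality handling and learned thresholds (often via ML) on top of this basic pattern, so that problems surface without anyone hand-tuning every check.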
The bad data problem isn’t going away any time soon, but if your organization follows these three steps, you’ll be better able to spot bad data and remediate it before it grows into system outages, lost revenue or bad publicity.