The Quality Issue
Big Data is growing at a rapid pace. According to IBM, 90% of the world’s data has been created in the past 2 years. And with Big Data comes bad data. Analyst firm Gartner says the average organization loses $14.2 million annually through poor Data Quality. An Experian Data Quality report states that 99% of organizations have a data quality strategy in place. That is troubling: despite these strategies, Data Quality practices are still not finding the bad data that exists in Big Data.
Current Data Quality & Testing Trends in Big Data
According to Gartner’s Magic Quadrant on Data Quality Tools, the characteristics of these tools are profiling, parsing, cleansing, masking, matching, and monitoring of big data. None of these characteristics deals with data validation in your Big Data store. Big Data testing is completely different. Its primary goals are verifying data completeness, ensuring data is transformed correctly, ensuring data quality, and automating regression testing. But the 2 main methods, Sampling (also known as “stare and compare”) and Minus Queries, have major flaws.
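For readers unfamiliar with the term, a Minus Query subtracts the target result set from the source result set (and vice versa) and reports whatever is left over. The sketch below illustrates the idea with Python's built-in sqlite3 module and SQL EXCEPT. The table names (source_orders, target_orders), the columns, and the helper functions are hypothetical and chosen only for illustration; this is not how any particular tool implements the check.

```python
import sqlite3


def minus_query(conn, source, target, columns):
    """Compare two tables with a two-way minus (EXCEPT) query.

    Returns (missing_in_target, unexpected_in_target).
    Table and column names are interpolated into the SQL text, so this
    sketch is only suitable for trusted, hard-coded identifiers.
    """
    cols = ", ".join(columns)
    # Rows present in the source but absent from the target.
    missing_in_target = conn.execute(
        f"SELECT {cols} FROM {source} EXCEPT SELECT {cols} FROM {target}"
    ).fetchall()
    # Rows present in the target but absent from the source.
    unexpected_in_target = conn.execute(
        f"SELECT {cols} FROM {target} EXCEPT SELECT {cols} FROM {source}"
    ).fetchall()
    return sorted(missing_in_target), sorted(unexpected_in_target)


def demo_connection():
    """Build an in-memory database with hypothetical source and target tables."""
    conn = sqlite3.connect(":memory:")
    for table in ("source_orders", "target_orders"):
        conn.execute(f"CREATE TABLE {table} (id INTEGER, name TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO source_orders VALUES (?, ?, ?)",
        [(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0)],
    )
    # The target has one altered amount (id 2) and one missing row (id 3).
    conn.executemany(
        "INSERT INTO target_orders VALUES (?, ?, ?)",
        [(1, "a", 10.0), (2, "b", 25.0)],
    )
    return conn


conn = demo_connection()
missing, unexpected = minus_query(
    conn, "source_orders", "target_orders", ["id", "name", "amount"]
)
print("In source but not in target:", missing)
print("In target but not in source:", unexpected)
```

Note that the comparison has to run in both directions, and that a changed row shows up once on each side (id 2 here) rather than as a single labelled difference.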
DvT - Data Validation Tool
Ensure data quality with Data Validation Tool. DvT is the collaborative data testing solution that finds bad data in Big Data and provides a holistic view of your data's health. It ensures that the data you extract from sources remains intact in the target by analyzing and quickly pinpointing any differences in your Big Data at every touchpoint.