Data validation means checking the accuracy and quality of source data before using, importing or otherwise processing it. Different types of validation can be performed depending on the constraints or objectives of the destination. Data validation is a form of data cleansing.
Why perform data validation?
When moving and merging data, it’s important to make sure that data from different sources and repositories conforms to business rules and does not become corrupted by inconsistencies in type or context. The goal is to produce data that is consistent, accurate and complete, so as to prevent data loss and errors during a move.
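As a rough illustration of what such checks can look like, the Python sketch below tests a single record for completeness, type consistency and format. The field names (customer_id, email, signup_date) and the rules themselves are hypothetical, chosen only for the example; real business rules depend on the destination.

```python
from datetime import datetime

# Hypothetical required fields, assumed only for this illustration.
REQUIRED_FIELDS = ("customer_id", "email", "signup_date")


def validate_record(record):
    """Return a list of problems found in one record (empty list means valid)."""
    problems = []

    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    # Type consistency: the identifier is assumed to be an integer.
    customer_id = record.get("customer_id")
    if customer_id not in (None, "") and not isinstance(customer_id, int):
        problems.append("customer_id is not an integer")

    # Format: dates are assumed to be ISO-style YYYY-MM-DD strings.
    signup_date = record.get("signup_date")
    if signup_date not in (None, ""):
        try:
            datetime.strptime(signup_date, "%Y-%m-%d")
        except (TypeError, ValueError):
            problems.append("signup_date is not YYYY-MM-DD")

    return problems
```

Returning a list of problems rather than raising on the first failure lets every issue in a record be reported at once, which is usually more useful when cleaning a whole data set.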
When is data validation performed?
In data warehousing, data validation is often performed prior to the ETL (Extract, Transform, Load) process. A data validation test is run so that analysts can gain insight into the scope or nature of data conflicts. Data validation is a general term, however, and can be performed on any type of data, including data within a single application (such as Microsoft Excel) or simple data being merged within a single data store.
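A pre-load validation test of this kind might be sketched as follows. This is a minimal example under stated assumptions, not a standard ETL tool: the function name summarize_conflicts, the shared key "id" and the two conflict categories are all hypothetical. It compares records from two sources that share a key and counts the conflicts per field, giving a quick picture of their scope.

```python
from collections import Counter


def summarize_conflicts(source_a, source_b, key="id"):
    """Count field-level conflicts between two lists of dict records.

    Records are matched on `key` (a hypothetical shared identifier).
    Records that appear in only one source are ignored here.
    """
    by_key = {row[key]: row for row in source_a}
    summary = Counter()

    for row in source_b:
        other = by_key.get(row[key])
        if other is None:
            continue  # no counterpart in source_a, so nothing to compare

        # Compare only the fields that both records have in common.
        for field in row.keys() & other.keys():
            a_value, b_value = other[field], row[field]
            if type(a_value) is not type(b_value):
                summary[f"{field}: type mismatch"] += 1
            elif a_value != b_value:
                summary[f"{field}: value mismatch"] += 1

    return dict(summary)
```

For instance, if one source stores a postal code as the string "02139" and the other as the number 2139, the summary reports one type mismatch for that field, a conflict that is better found before the load than after it.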