How to test Reference Data using Reconciliation Rule?
In this use case, we will demonstrate how to test the reconciliation of reference data generated by an ETL (Extract, Transform, Load) process in the database with the Master reference data in the reference database. By following these steps, you can ensure the accuracy and consistency of your ETL-generated data.
Steps
Below steps have been followed in the video.
Create Rule
To begin, create a reconciliation rule in iceDQ. Give the rule an appropriate name. In this example, we will verify the reconciliation of phone number types between the source and target databases.
Set up Source
Set up the source connection to the database. Select the source schema and the table "Phone Number Type". Preview the data to identify the phone number types present in the source database.
Set up Target
Set up the target connection to the reference database. Select the table "Phone Number Type" in the reference database. Since the column names are the same in both databases, we can directly connect them.
Add Checks
Add checks to compare the relevant columns between the source and target databases. In this example, we compare the name and modified date columns. The checks verify that the names in the source data match the names in the reference data, and the modified dates align between the two datasets.
Publish & Run
Publish the rule and execute it. After execution, review the results. If there are any mismatches, investigate further by clicking on the instance ID. In this case, we find that while the name comparison is successful, there are three mismatches in the modified date, indicating differences between the source and reference data.
Video: How to test Reference Data using Reconciliation Rule?
Conclusion
By following this use case with iceDQ, you can effectively verify the reconciliation of ETL-generated data with Master reference data. This helps ensure the accuracy and consistency of your data, enabling reliable data-driven insights and decision-making. Regular data reconciliation is crucial to maintain data integrity and identify any discrepancies in the ETL process.