How to Test Referential Data Integrity?
In this use case, we will explore how to perform referential integrity testing using iceDQ. Our goal is to ensure that all foreign keys in the child table exist in the parent table, avoiding any inconsistencies or orphaned records. By following these steps, you can easily verify the referential integrity of your database.
Create Rule
To begin, create a reconciliation rule in iceDQ. Give the rule an appropriate name. In this example, we will test the referential integrity between the "ProductInventory" and "Product" tables.
Set up Connections
Set up the source connection to the database. Select the schema "Production" and the table "ProductInventory". This is the child table containing the foreign keys we want to validate. Set up the target connection using the same schema and the parent table "Product". Preview the data to familiarize yourself with the records.
Add Join Condition
Define the join condition for the comparison. Since the common column is "ProductID", set it as the join condition to link the two tables.
Add Checks
Configure the checks for the reconciliation rule. In this scenario, select the A minus B check to verify if there are any product IDs in the source (ProductInventory) that are not present in the target (Product).
Publish & Run
Publish the rule and execute it. After execution, review the results. In this case, we observe a warning indicating a discrepancy between the source and target counts. The source count shows 432 distinct product IDs, while the target count displays 500 products in the Product table.
Video: How to Test Referential Data Integrity?
Conclusion
By utilizing iceDQ, we successfully performed referential integrity testing between the "ProductInventory" and "Product" tables. By identifying four product IDs present in the child table but missing from the parent table, we confirmed the validity of referential integrity. Regular referential integrity testing helps ensure the integrity and consistency of your database, preventing orphaned records and maintaining data reliability.