Skip to main content

How to Validate Tables after Data Migration between SQL Server and Snowflake?


In this use case, we will demonstrate how to compare data from two different database sources: SQL Server and Snowflake. We will achieve this using a reconciliation rule in iceDQ. Let's dive in and create a reconciliation rule for this purpose.

Steps

Here are the steps outlined in the video:

Create Reconciliation Rule

  1. Start by creating a new reconciliation rule. Choose the folder where you want to create the rule and give it an appropriate name. Select a reconciliation template and provide a description and purpose for the rule. In this example, we are comparing customer data between SQL Server and Snowflake to ensure an exact match.

Connect to Source Data Set

  1. Configure a source data connection with the connection type as "Database." Connect to the "Adventure Works" database, select the "DBO" schema, and choose the "dim customer" table. Preview the data to see customer-level information, including first name, last name, gender, birth date, and more.

Connect to Target Data Set

  1. Configure a target data connection with the same connection type as the source. Connect to the "Snowflake DB," select the "Public" schema, and choose the "dim customer" table. Preview the data, noting that it contains similar customer-level information as the source. However, observe a discrepancy in the "date first purchased" column, where the data type differs between the source ("Date") and target ("Varchar"). Modify the query to convert the "date first purchased" column to "YY/MM/DD" format.

Define Diff Join Condition

  1. Create a Diff Join condition since the source and target column names match. Connect them using the column names. Set the primary key and add all other columns as checks.

Address Data Type Discrepancies

  1. Observe data type discrepancies in the "phone number" and "yearly income" columns between source and target. The "phone number" in the source is "bigint," while in the target, it is "varchar." Similarly, the "yearly income" is "int" in the source and "varchar" in the target. Update the check and use a Groovy function, "toString," to change the data type for the "phone number" and "yearly income" columns.

Publish & Execute

  1. Publish the rule and run it.

Review Results

  1. After execution, review the results. In the example provided, the status shows as "warning," with 20 failures in the table. The source count is 98, the target count is 98, and 20 failures are identified. Investigate further by clicking on the instance ID.

  2. Inspect the mismatches in the gender column, where "F" appears in the source and "M" in the target. Additionally, review failures in the phone number column, indicating discrepancies in some phone numbers.

Video: How to Validate Tables after Data Migration between SQL Server and Snowflake?

Conclusion

By performing this reconciliation using iceDQ, you have successfully validated the data between the source (SQL Server) and target (Snowflake) databases. Identifying and addressing data discrepancies enhances data quality and ensures the accuracy of your data integration processes.