How to Validate a Flat File?
In this use case, we walk through validating data in a CSV file using iceDQ. The objective is to validate the e-mail column, ensuring that it contains no null values and that every value matches a predefined format. Following these steps lets you validate your data and pinpoint any inconsistencies.
Steps
The following steps are demonstrated in the video.
Create Rule
Start by creating a new validation rule in the iceDQ platform. Select the source connection for the CSV file and specify the column and row delimiters. Set the header row number to match the row that holds the column names.
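To make the role of these settings concrete, here is a minimal Python sketch of how a column delimiter, a row delimiter, and a header row number determine the way a flat file is parsed. It is not iceDQ's API or implementation; the function name `read_flat_file`, its parameters, and the sample data are all hypothetical and exist only for illustration.

```python
import csv

def read_flat_file(text, column_delimiter=",", row_delimiter="\n", header_row=1):
    """Parse delimited text into (header, rows).

    header_row is 1-based and counted over non-empty lines only
    (an assumption made for this sketch).
    """
    # Split the raw text into rows using the row delimiter, skipping blank lines
    lines = [line for line in text.split(row_delimiter) if line.strip()]
    # Split every row into fields using the column delimiter
    records = list(csv.reader(lines, delimiter=column_delimiter))
    header = records[header_row - 1]
    # Everything after the header row is treated as data
    rows = [dict(zip(header, record)) for record in records[header_row:]]
    return header, rows

# Hypothetical pipe-delimited sample file
sample = "customer_id|name|email\n1|Ann|ann@example.com\n2|Bob|bob-at-example.com\n"
header, rows = read_flat_file(sample, column_delimiter="|")
print(header)
print(rows[0])
```

If the delimiters or the header row number are wrong, the column names will not line up with the data, which is why these settings are confirmed before any checks are added.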
Verify File Schema
Next, review the schema and data of the CSV file to identify the e-mail column for validation.
Add Checks
Add two checks to validate the e-mail column. First, verify that each value matches the expected e-mail format by selecting the appropriate pattern from the provided list. Second, apply a custom format check to ensure that the column contains no null values.
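The logic behind the two checks can be sketched in plain Python. This is an illustration under stated assumptions, not the pattern iceDQ ships: the regular expression `EMAIL_PATTERN` is a simplified, hypothetical e-mail format, and the function names are invented for this example. Treating empty or whitespace-only strings as null is also an assumption.

```python
import re

# Simplified, hypothetical e-mail format: local part, "@", domain, dot, TLD
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def check_not_null(value):
    """Pass when the value is present and not just whitespace."""
    return value is not None and value.strip() != ""

def check_email_pattern(value):
    """Pass when the value matches the e-mail format.

    Null or empty values pass here on purpose, so that a missing
    e-mail is reported once, by the null check, rather than twice.
    """
    if not check_not_null(value):
        return True
    return EMAIL_PATTERN.fullmatch(value.strip()) is not None

print(check_email_pattern("ann@example.com"))
print(check_email_pattern("bob-at-example.com"))
print(check_not_null(""))
```

Keeping the two checks separate means a result can say which rule a record broke, rather than only that it failed.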
Add Columns
To make failed validations easy to identify, add a source column that carries customer information. This helps track which customer records did not pass the validations.
Publish & Run
Review the added checks and publish the rule. Execute the rule to validate the data in the CSV file.
Review Result
After execution, review the results. Identify the exact records and the specific checks that failed. In this case, the failures are caused by values that do not match the expected e-mail format; there are no null value issues.
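As a rough illustration of what such a result contains, the sketch below runs a null check and a pattern check over a few records and reports each failure together with a customer identifier. It is a standalone Python example, not iceDQ output; the column names `customer_id` and `email`, the simplified regular expression, and the sample rows are all hypothetical.

```python
import re

# Simplified, hypothetical e-mail format used only for this sketch
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def is_present(value):
    return value is not None and value.strip() != ""

# Check name -> function returning True when the value passes
CHECKS = {
    "email_not_null": is_present,
    # Missing values are left to the null check so they are reported once
    "email_pattern": lambda v: not is_present(v)
    or EMAIL_PATTERN.fullmatch(v.strip()) is not None,
}

def find_failures(rows, column="email", id_column="customer_id"):
    """Return one entry per (record, failed check) pair."""
    failures = []
    for row in rows:
        value = row.get(column)
        for check_name, check in CHECKS.items():
            if not check(value):
                failures.append(
                    {"customer_id": row.get(id_column), "check": check_name, "value": value}
                )
    return failures

# Hypothetical sample records
rows = [
    {"customer_id": "1", "email": "ann@example.com"},
    {"customer_id": "2", "email": "bob-at-example.com"},
    {"customer_id": "3", "email": "carol@example"},
]
failures = find_failures(rows)
for failure in failures:
    print(failure)
```

With this sample data, every reported failure comes from the pattern check and none from the null check, mirroring the outcome described above. Carrying the customer identifier in each entry is what makes a failure traceable to a specific record.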
Video: How to Validate a Flat File?
Conclusion
By following this use case, you can successfully validate the e-mail column in a CSV file using iceDQ. Identifying failed validations allows you to address data inconsistencies and ensure data quality. Applying similar approaches to other columns or data sources enhances data accuracy and reliability in your projects.