Read Delimited File

Users can read data from delimited files using our Flat File Native Connector. To access the specific customer data mentioned in this article, download the following flat files customer_12182023.txt , customer_12202023.txt.

note

Currently supported extensions are .txt, .csv, .dat. In future release .tar, .tar.gz will be supported.

Understanding Settings (Source/ Target)

Flat File Native connector has specific configurations. Please read the following properties carefully to understand the behaviour of the connector.

Property	Description	Note
File Type	Specify the file format: delimited or fixed width.
Single File	Enable, if reading data from one file.	If disabled, the driver reads and unifies all flat files in the folder, requiring identical structure.
Parameterised Path	Allows users to parameterize file path and name using parameter key	Rule must have a default parameter file.
Column Delimiter	String used to separate the columns.	Default is comma (,). Only applicable for Delimited files.
Row Delimiter	String used to separate the rows.	Default is newline character (\n)
Text Qualifier	Character used to enclose the column names and it's data.	Ensures accurate field interpretation by handling default delimiters in delimited files.
Skip Row From Top	Number of rows to SKIP before reading the data.	Default value is zero (0)
Header Row	Row number where the header row exists.	Default value is zero (0)
Skip Row From Bottom	Number of rows to SKIP from reading from bottom.	Default value is zero (0)
Trim Spaces	If enabled, TRIMS the leading and trailing spaces.
Read Empty Value as NULL	If enabled, treats an empty value as NULL
Read White Space as NULL	If enabled, treats a white space as NULL
Treat as NULL value	Specify the fixed string that should be treated as NULL	For example, specify "NA", if you want the string in the file should be treated as NULL.

Filter Condition

To filter data within the flat file before validation or comparison, provide a Groovy expression that evaluates to either TRUE or FALSE using the file's columns. Records for which the expression evaluates to TRUE will be included, while those with FALSE will be retained.

For instance, to read only records with "USA" in the country column, use the expression S.[country] == 'USA'.

Generate Schema/Refresh Schema

When setting up the file for the first time always click on Generate Schema for the connector to read the data from file based on the specific configurations. You must also click Refresh Schema whenever they add or change a property.

Advance Schema

Users can change the default column names and datatypes identified by the connector in the Advance tab

Update the column name, datatype and format attributes in the JSON
Click Preview Data to read the data with applied changes

IMPORTANT

Do not click Refresh Schema, as it will revert manual schema modifications.

Below is a sample JSON structure for defining column sequence, name, datatype, and format within a "columns" array.

{
  "columns": [
    {
      "index": 1,
      "name": "customerid",
      "datatype": "numeric"
    },
    {
      "index": 2,
      "name": "acctnumber",
      "datatype": "Text"
    },
    {
      "index": 3,
      "name": "birthdate",
      "datatype": "date",
      "format": "MM/dd/yy"
    }
  ]
}

How To: Read from a Delimited File

Following video shows how to read a delimited flat file using the native file connector (no sql).

How To: Parameterize Filename, Read from Multiple Files

Following video shows parameterize file name or read from multiple files in a folder using the native file connector (no sql).

Considerations

Following datatypes are supported: text, numeric, date, time, timestamp and boolean.
Structure of all the files in a folder should be same when reading multiple files at once.
You can only use one parameterization method at a time: either specify a key or use a wildcard.

Understanding Settings (Source/ Target)​

Filter Condition​

Generate Schema/Refresh Schema​

Advance Schema​

How To: Read from a Delimited File​

How To: Parameterize Filename, Read from Multiple Files​

Considerations​