Skip to main content

Read Delimited File


Users can read data from delimited files using our Flat File Native Connector. To access the specific customer data mentioned in this article, download the following flat files customer_12182023.txt , customer_12202023.txt.

note

Currently supported extensions are .txt, .csv, .dat. In future release .tar, .tar.gz will be supported.


Understanding Settings (Source/ Target)

Flat File Native connector has specific configurations. Please read the following properties carefully to understand the behaviour of the connector.

PropertyDescriptionNote
File TypeSpecify the file format: delimited or fixed width.
Single FileEnable, if reading data from one file.If disabled, the driver reads and unifies all flat files in the folder, requiring identical structure.
Parameterised PathAllows users to parameterize file path and name using parameter keyRule must have a default parameter file.
Column DelimiterString used to separate the columns.Default is comma (,). Only applicable for Delimited files.
Row DelimiterString used to separate the rows.Default is newline character (\n)
Text QualifierCharacter used to enclose the column names and it's data.Ensures accurate field interpretation by handling default delimiters in delimited files.
Skip Row From TopNumber of rows to SKIP before reading the data.Default value is zero (0)
Header RowRow number where the header row exists.Default value is zero (0)
Skip Row From BottomNumber of rows to SKIP from reading from bottom.Default value is zero (0)
Trim SpacesIf enabled, TRIMS the leading and trailing spaces.
Read Empty Value as NULLIf enabled, treats an empty value as NULL
Read White Space as NULLIf enabled, treats a white space as NULL
Treat as NULL valueSpecify the fixed string that should be treated as NULLFor example, specify "NA", if you want the string in the file should be treated as NULL.

Filter Condition

To filter data within the flat file before validation or comparison, provide a Groovy expression that evaluates to either TRUE or FALSE using the file's columns. Records for which the expression evaluates to TRUE will be included, while those with FALSE will be retained.

For instance, to read only records with "USA" in the country column, use the expression S.[country] == 'USA'.

Generate Schema/Refresh Schema

When setting up the file for the first time always click on Generate Schema for the connector to read the data from file based on the specific configurations. You must also click Refresh Schema whenever they add or change a property.

Advance Schema

Users can change the default column names and datatypes identified by the connector in the Advance tab

  1. Update the column name, datatype and format attributes in the JSON
  2. Click Preview Data to read the data with applied changes
IMPORTANT

Do not click Refresh Schema, as it will revert manual schema modifications.

Below is a sample JSON structure for defining column sequence, name, datatype, and format within a "columns" array.

{
"columns": [
{
"index": 1,
"name": "customerid",
"datatype": "numeric"
},
{
"index": 2,
"name": "acctnumber",
"datatype": "Text"
},
{
"index": 3,
"name": "birthdate",
"datatype": "date",
"format": "MM/dd/yy"
}
]
}

How To: Read from a Delimited File

Following video shows how to read a delimited flat file using the native file connector (no sql).


How To: Parameterize Filename, Read from Multiple Files

Following video shows parameterize file name or read from multiple files in a folder using the native file connector (no sql).


Considerations

  • Following datatypes are supported: text, numeric, date, time, timestamp and boolean.
  • Structure of all the files in a folder should be same when reading multiple files at once.
  • You can only use one parameterization method at a time: either specify a key or use a wildcard.