Excel
Excel is Microsoft's widely used spreadsheet application for organizing, calculating, and visualizing data. The Excel connector imports data from, and exports data to, Excel workbooks, so spreadsheet data can take part in data workflows across business applications.
.xls files are not directly supported. Users must convert them to .xlsx.
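Because .xls workbooks need to be converted before the connector can read them, it can help to list them up front. The following is a minimal sketch using only the Python standard library. It assumes the folder is reachable as a local or mounted path, and the function name `find_legacy_xls` is illustrative, not part of the connector.

```python
from pathlib import Path


def find_legacy_xls(folder):
    """Return the sorted names of files in `folder` whose extension is .xls.

    The comparison is case-insensitive, so REPORT.XLS is also reported.
    Files ending in .xlsx are not included.
    """
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".xls"
    )
```

Any file name this returns would need to be saved as .xlsx (for example, from Excel itself) before it is placed in the connection's folder.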
Prerequisites
Before connecting, ensure the following prerequisites are met:
- Verify that the storage or folder location is accessible from the cluster.
- Gather valid user credentials and ensure the user has appropriate permissions to read the files.
- Ensure at least one file is present in the folder.
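For a local or network-mounted folder, the checks above can be approximated with a short script run on a cluster node. This is a sketch under stated assumptions: it only covers paths visible to the operating system (not S3, Azure, GCS, FTP, or SFTP locations), and the function name `check_folder_prerequisites` is hypothetical.

```python
import os
from pathlib import Path


def check_folder_prerequisites(folder):
    """Return a list of problems found; an empty list means the checks passed."""
    path = Path(folder)
    if not path.is_dir():
        return [f"{folder} is not an accessible directory"]
    # Listing a directory needs both read and execute (search) permission.
    if not os.access(path, os.R_OK | os.X_OK):
        return [f"{folder} is not readable by the current user"]

    problems = []
    files = [p for p in path.iterdir() if p.is_file()]
    if not files:
        problems.append(f"{folder} contains no files")
    for p in files:
        if not os.access(p, os.R_OK):
            problems.append(f"{p.name} is not readable by the current user")
    return problems
```

The script should be run as the same operating-system user the cluster uses, since permissions are evaluated for the current user.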
Connecting to Amazon S3
AWS EC2 Role Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Excel Custom JDBC |
| Container * | The name of the S3 bucket that contains the Excel files. | prod-bucket |
| Folder Path * | The directory path within the bucket that points to the folder containing the Excel files. | /data/excel-files/ |
| Region * | AWS region where the S3 bucket is located. | us-east-1 |
| Test File * | A sample Excel file used to test or validate the connection. | dev_sample_data.xlsx |
| Type * | EC2 role authentication supports only System Connections. Refer to Connections for more details. | System |
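Before entering the values, the required properties in the table above can be sanity-checked with a small script. This is only an illustrative sketch: the dictionary layout and the function name `validate_s3_ec2_role` are assumptions, and the connector itself is configured through its own connection form, not through this code.

```python
# Required property names copied from the table above.
REQUIRED_S3_EC2_ROLE = (
    "Connection Name",
    "Driver",
    "Container",
    "Folder Path",
    "Region",
    "Test File",
    "Type",
)


def validate_s3_ec2_role(properties):
    """Return a list of problems; an empty list means the properties look complete."""
    problems = []
    for name in REQUIRED_S3_EC2_ROLE:
        value = properties.get(name)
        if value is None or not str(value).strip():
            problems.append(f"missing required property: {name}")

    # .xls files are not directly supported, so the test file should be .xlsx.
    test_file = str(properties.get("Test File") or "")
    if test_file and not test_file.lower().endswith(".xlsx"):
        problems.append("Test File should be an .xlsx workbook")

    # EC2 role authentication supports only System connections.
    conn_type = properties.get("Type")
    if conn_type and conn_type != "System":
        problems.append("EC2 role authentication supports only System connections")
    return problems


# Example values taken from the table above.
example = {
    "Connection Name": "prod_sales_data",
    "Driver": "Excel Custom JDBC",
    "Container": "prod-bucket",
    "Folder Path": "/data/excel-files/",
    "Region": "us-east-1",
    "Test File": "dev_sample_data.xlsx",
    "Type": "System",
}
print(validate_s3_ec2_role(example))  # prints []
```

The same pattern could be adapted to the other connection types by swapping in the required property names from their tables.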
Connecting to Azure Blob Storage
Azure Access Key Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Excel Custom JDBC |
| Container * | Name of the storage container in Azure Blob Storage where the Excel files are stored. | excel-data-container |
| Folder Path * | The directory path within the container that points to the folder containing the Excel files. | /data/excel-files/ |
| Test File * | A sample Excel file used to test or validate the connection. | dev_sample_data.xlsx |
| Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System |
| Account Name * | The storage account name used for authentication. | prod_storageaccount |
| Access Key * | The key used to authenticate the connection to the storage account. | Wsrgaezbh2576ies4r6u796 |
Azure Service Principal Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Excel Custom JDBC |
| Container * | The name of the storage container in Azure Blob Storage where the Excel files are stored. | prod-data-container |
| Folder Path * | The directory path within the container that points to the folder containing the Excel files. | /data/excel-files/ |
| Test File * | A sample Excel file used to test or validate the connection. | customer_sample_data.xlsx |
| Type * | Service principal authentication supports only System Connections. Refer to Connections for more details. | System |
| Account Name * | The Azure Storage account name used for authentication. | prod_storageaccount |
| Tenant ID * | The Microsoft Entra ID (formerly Azure AD) tenant ID used for authentication. | asrgaw-453a-54y45y-sd56udf-d345 |
| Client ID * | The Application (client) ID registered in Microsoft Entra ID. | asergrdh-3467-fgnjfdg-4668-9544 |
| Client Secret * | The secret associated with the registered client app in Microsoft Entra ID. | mhfguk-zdtnjhgk-56678-qmdur |
Connecting to Google Cloud Storage
Service Account Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Excel Custom JDBC |
| Project ID * | The unique identifier of the Google Cloud project where the storage bucket resides. | prod-acme-project |
| Container * | The name of the Cloud Storage bucket (container) where the Excel files are stored. | excel-data-container |
| Folder Path * | The directory path within the container that points to the folder containing the Excel files. | prod_data/excel-files/ |
| Test File * | A sample Excel file used to test or validate the connection. | prod_sample_data.xlsx |
| Type * | Service account authentication supports only System Connections. Refer to Connections for more details. | System |
Connecting to FTP
Anonymous Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Excel Custom JDBC |
| Host * | IP address or hostname of the FTP server. | 183.169.64.30 |
| Port * | Port on which the server listens. | 21 |
| Folder Path * | The directory path within the FTP server that points to the folder containing the Excel files. | /data/excel-files/ |
| Test File * | A sample Excel file used to test or validate the connection. | sample_data.xlsx |
| Type * | Anonymous authentication supports only System Connections. Refer to Connections for more details. | System |
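To confirm ahead of time that the FTP server accepts anonymous logins and that the test file is present, a check along the following lines can be run from a machine with network access to the server. It is a minimal sketch using Python's standard `ftplib`; the function names are hypothetical, and it does not reflect how the connector itself talks to the server.

```python
import posixpath
from ftplib import FTP


def list_xlsx_files(ftp, folder_path):
    """Return the sorted .xlsx file names in `folder_path` on an open FTP session."""
    ftp.cwd(folder_path)
    # Some servers return full paths from NLST, so keep only the base name.
    names = [posixpath.basename(n) for n in ftp.nlst()]
    return sorted(n for n in names if n.lower().endswith(".xlsx"))


def check_anonymous_ftp(host, port, folder_path, test_file, timeout=60):
    """Log in anonymously and report whether `test_file` exists in `folder_path`."""
    with FTP() as ftp:
        ftp.connect(host, port, timeout=timeout)
        ftp.login()  # with no arguments, ftplib logs in as the anonymous user
        return test_file in list_xlsx_files(ftp, folder_path)
```

With the example values from the table, the call would be `check_anonymous_ftp("183.169.64.30", 21, "/data/excel-files/", "sample_data.xlsx")`. Note that some servers respond with an error instead of an empty listing when the folder has no files.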
Username and Password Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Excel Custom JDBC |
| Host * | IP address or hostname of the FTP server. | 192.169.64.30 |
| Port * | Port on which the server listens. | 21 |
| Folder Path * | The directory path within the FTP server that points to the folder containing the Excel files. | /data/excel-files/ |
| Test File * | A sample Excel file used to test or validate the connection. | prod_sample_data.xlsx |
| Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System |
| Username * | FTP login username with necessary privileges. | sales_dev_user |
| Password * | Password associated with the specified username. | S@lesDW2025! |
Connecting to SFTP
Username and Password Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Excel Custom JDBC |
| Host * | IP address or hostname of the SFTP server. | 192.169.64.30 |
| Port * | Port on which the server listens. | 22 |
| Folder Path * | The directory path within the SFTP server that points to the folder containing the Excel files. | /data/excel-files/ |
| Test File * | A sample Excel file used to test or validate the connection. | prod_sample.xlsx |
| Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System |
| Username * | SFTP login username with necessary privileges. | mkt_analytics_dev_user |
| Password * | Password associated with the specified username. | M@rketing2025#Dev |
Connecting to Local Storage
Users can read files from local storage, which may refer to a server directory for uploaded files or a network-mounted directory accessible to the cluster. Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Excel Custom JDBC |
| Folder Path * | The directory path that points to the folder containing the Excel files. | /data/excel-files/ |
| Test File * | A sample Excel file used to test or validate the connection. | prod_sample_data.xlsx |
The connector does not support authentication for accessing files from local storage.
Custom Properties
The following optional connection properties can be configured based on user requirements.
| Property | Default Value | Possible Values | Description |
|---|---|---|---|
| TypeDetectionScheme | None | None, RowScan, ColumnFormat, ColumnStyle | Determines how the provider detects the data types of columns. |
| Charset | UTF-8 | UTF-8 | Specifies the session character set for encoding and decoding character data transferred to and from the Microsoft Excel file. |
| Orientation | Vertical | Vertical, Horizontal | Indicates whether the data in Excel is laid out horizontally or vertically. |
| Recalculate | true | true, false | Indicates whether to recalculate all formulas when data is read. |
| RowScanDepth | 100 | Integer value | The maximum number of rows to scan to look for the columns available in a table. |
| ShowEmptyRows | false | true, false | Indicates whether empty rows are included in the results. |
| Timeout | 60 | Integer value | Specifies the maximum time, in seconds, that the provider waits for a server response before throwing a timeout error. |
| Skip Row From Top | 0 | Integer value | Skips the specified number of rows, starting from the top. |
| Treat as Null Values | String | String value | A comma-separated list of values that are replaced with nulls when found in the file. |
| Include Headers | true | true, false | Indicates whether the first row should be used as a column header. |
| Read Empty Value as NULL | true | true, false | Indicates whether empty values are read as null (true) or as empty values (false). |
| MaxRows | -1 | Numeric value | Specifies the maximum rows returned for queries without aggregation or GROUP BY. |
| BatchSize | 0 | Numeric value | Specifies the maximum number of rows included in each batch operation. Set to 0 to submit the entire batch as a single request. |
| ClientCulture | en-US | en-US | Specifies the data format accepted by the client application. |
| Readonly | false | true, false | Toggles read-only access to Microsoft Excel from the provider. |
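To make the effect of a few of these properties concrete, the sketch below models Skip Row From Top, Include Headers, Treat as Null Values, and Read Empty Value as NULL on an in-memory list of rows. It illustrates the semantics described in the table, not the connector's implementation. The function name, the order in which the options are applied, the generated `colN` header names, and the empty default for the null list are all assumptions.

```python
def apply_read_options(rows, skip_rows_from_top=0, include_headers=True,
                       treat_as_null_values="", read_empty_as_null=True):
    """Turn raw sheet rows (lists of cell strings) into (headers, data_rows)."""
    # Treat as Null Values: a comma-separated list of literal cell values.
    null_tokens = {t.strip() for t in treat_as_null_values.split(",") if t.strip()}

    # Skip Row From Top: drop the first N rows before anything else (assumed order).
    remaining = rows[skip_rows_from_top:]

    # Include Headers: use the first remaining row as column names.
    if include_headers and remaining:
        headers, body = list(remaining[0]), remaining[1:]
    else:
        width = max((len(r) for r in remaining), default=0)
        headers = [f"col{i + 1}" for i in range(width)]  # hypothetical naming
        body = remaining

    def convert(cell):
        if cell in null_tokens:
            return None
        # Read Empty Value as NULL: empty cells become None, else stay "".
        if cell == "" and read_empty_as_null:
            return None
        return cell

    return headers, [[convert(c) for c in row] for row in body]


raw = [
    ["Exported 2025-01-01", ""],   # banner row to skip
    ["region", "sales"],           # header row
    ["east", "100"],
    ["west", "N/A"],
    ["north", ""],
]
headers, data = apply_read_options(
    raw, skip_rows_from_top=1, treat_as_null_values="N/A,NULL"
)
print(headers)  # ['region', 'sales']
print(data)     # [['east', '100'], ['west', None], ['north', None]]
```

With `read_empty_as_null=False`, the last row in this example would keep its empty string instead of becoming `None`.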
Supported Datatypes
The following data types are supported:
- INTEGER
- BIGINT
- SHORT
- FLOAT
- VARCHAR
- CHAR
- BOOLEAN / BIT
- DATE
- DATETIME
- TIME
- TIMESTAMP
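As a rough illustration of how scanning a limited number of rows could map cell text onto some of the types listed above, consider the sketch below. It is not the connector's detection logic: the function name, the subset of types covered, the precedence between them, and the ISO date format are all assumptions chosen to keep the example short.

```python
from datetime import date


def infer_column_type(values, row_scan_depth=100):
    """Guess a type name from the first `row_scan_depth` cells of a column.

    Cells are expected to be strings; empty strings and None are ignored.
    """
    sample = [v for v in values[:row_scan_depth] if v not in ("", None)]
    if not sample:
        return "VARCHAR"

    def all_parse(parse):
        # True only if every sampled cell is accepted by `parse`.
        for v in sample:
            try:
                parse(v)
            except ValueError:
                return False
        return True

    if all(v.lower() in ("true", "false") for v in sample):
        return "BOOLEAN"
    if all_parse(int):
        return "INTEGER"
    if all_parse(float):
        return "FLOAT"
    if all_parse(date.fromisoformat):
        return "DATE"
    return "VARCHAR"
```

A small scan depth is faster but can misclassify a column whose later rows hold a different kind of value. For example, `infer_column_type(["1", "2", "x"], row_scan_depth=2)` returns `"INTEGER"` because the third cell is never inspected.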
Unsupported Datatypes
The following data types are not supported:
- BINARY
- ARRAY
- STRUCT
- XML
- JSON