Databricks
The Databricks Data Intelligence Platform is a unified data analytics and AI platform built on Apache Spark that enables organizations to process, analyze, and visualize large-scale data. It provides collaborative notebooks, integrated machine learning tools, and connectivity to cloud data lakes and warehouses, which makes it well suited to big data and AI-driven workflows.
Prerequisites
The following prerequisites must be met before a user can create and test a connection.
- The Databricks server must be accessible from the iceDQ server.
- Valid credentials to access Databricks.
- Databricks JDBC driver version 2.7 or later.
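The first prerequisite can be checked before a connection is created. The following is a minimal sketch, assuming it is run on the iceDQ server and that a plain TCP connection to the workspace host on port 443 is a sufficient reachability test. The function name `is_reachable` is hypothetical and is not part of iceDQ or Databricks.

```python
import socket


def is_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example call (hostname taken from the example values in this document):
# is_reachable("adb-5465845314.7.azuredatabricks.net", 443)
```

A `True` result only shows that the network path is open. It does not validate credentials or the driver version.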
Cloud Providers
Connections are supported for Databricks workspaces hosted on the following cloud platforms.
- Microsoft Azure
- Amazon Web Services
Authentication Mechanisms
The following authentication mechanisms are supported.
- Username and Password
- External OAuth
Connection Properties
Azure Username and Password Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | Prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Databricks Native JDBC |
| Cloud Provider * | The public cloud that hosts the Databricks workspace. | Microsoft Azure |
| Databricks Environment * | The Databricks compute environment. | SQL Warehouse or Cluster |
| Custom JDBC URL | Standardized string that defines the connection details. Use the format supported by the driver: jdbc:databricks://[host]:[port];httpPath=[httpPath] | jdbc:databricks://adb-5465845314.7.azuredatabricks.net:443;httpPath=/sql/1.0/warehouses/saleswh |
| Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System |
| Host * | The Databricks workspace hostname. | adb-5465845314.7.azuredatabricks.net |
| Port * | Port number for the Databricks JDBC endpoint. | 443 |
| HTTP Path * | HTTP path of the Databricks compute resource (SQL warehouse or cluster). | /sql/1.0/warehouses/saleswh |
| Catalog | The default catalog to connect to. | SalesAnalyticsDB |
| Personal Access Token * | The PAT used for authentication with Azure Databricks. | dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX |
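The Custom JDBC URL in the table above is assembled from the Host, Port, and HTTP Path values. The following is a minimal sketch of that assembly, using only the format documented above. The function name `build_jdbc_url` and its input checks are assumptions made for illustration and are not part of iceDQ.

```python
def build_jdbc_url(host: str, http_path: str, port: int = 443) -> str:
    """Build a URL in the form jdbc:databricks://[host]:[port];httpPath=[httpPath]."""
    host = host.strip()
    http_path = http_path.strip()
    # Assumption: Host is a bare hostname, as in the example values above.
    if not host or "://" in host or "/" in host:
        raise ValueError("host must be a bare hostname without scheme or path")
    if not http_path:
        raise ValueError("http_path must not be empty")
    return f"jdbc:databricks://{host}:{port};httpPath={http_path}"


url = build_jdbc_url(
    "adb-5465845314.7.azuredatabricks.net",
    "/sql/1.0/warehouses/saleswh",
)
print(url)
```

The Personal Access Token is deliberately left out of the URL in this sketch, because the table lists it as a separate property.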
Azure External OAuth Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | Prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Databricks Native JDBC |
| Cloud Provider * | The public cloud that hosts the Databricks workspace. | Microsoft Azure |
| Databricks Environment * | The Databricks compute environment. | SQL Warehouse or Cluster |
| Custom JDBC URL | Standardized string that defines the connection details. Use the format supported by the driver: jdbc:databricks://[host]:[port];httpPath=[httpPath] | jdbc:databricks://adb-5465845314.7.azuredatabricks.net:443;httpPath=/sql/1.0/warehouses/saleswh |
| Type * | OAuth authentication only supports User Connections. Refer to Connections for more details. | User |
| Callback URL * | The redirect/callback URL registered with the OAuth application. | https://app.icedq.net/api/v1/oauth/callback |
| Host * | The Databricks workspace hostname. | adb-5465845314.7.azuredatabricks.net |
| Port * | Port number for the Databricks JDBC endpoint. | 443 |
| HTTP Path * | HTTP path of the Databricks compute resource (SQL warehouse or cluster). | /sql/1.0/warehouses/saleswh |
| Catalog | The default catalog to connect to. | SalesAnalyticsDB |
| Client ID * | The client ID of the external OAuth application. | 0a18762-789b-4cde-9abc-12345ef67890 |
| Secret ID * | The client secret of the external OAuth application. | u8w3D~xYtP-5k9LMn.7abc123xyz456 |
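The required properties and the User-only restriction in the table above can be checked before a connection is submitted. The following is a minimal sketch, assuming the properties are held in a dictionary keyed by the names shown in the table. The function `validate_oauth_connection` is hypothetical and does not represent an iceDQ API.

```python
from typing import Dict, List

# Required properties, taken from the rows marked with an asterisk above.
REQUIRED_OAUTH_PROPERTIES = (
    "Connection Name", "Driver", "Cloud Provider", "Databricks Environment",
    "Type", "Callback URL", "Host", "Port", "HTTP Path", "Client ID", "Secret ID",
)


def validate_oauth_connection(props: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for name in REQUIRED_OAUTH_PROPERTIES:
        if not str(props.get(name, "")).strip():
            problems.append(f"missing required property: {name}")
    # The table states that OAuth authentication only supports User Connections.
    if props.get("Type") and props["Type"] != "User":
        problems.append("OAuth authentication only supports User Connections")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every missing property in a single pass.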
AWS Username and Password Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | Prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Databricks Native JDBC |
| Cloud Provider * | The public cloud that hosts the Databricks workspace. | Amazon Web Services |
| Databricks Environment * | The Databricks compute environment. | SQL Warehouse or Cluster |
| Custom JDBC URL | Standardized string that defines the connection details. Use the format supported by the driver: jdbc:databricks://[host]:[port];httpPath=[httpPath] | jdbc:databricks://dbc-5465845314.aws.databricks.com:443;httpPath=/sql/1.0/warehouses/saleswh |
| Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System |
| Host * | The Databricks workspace hostname. | dbc-5465845314.aws.databricks.com |
| Port * | Port number for the Databricks JDBC endpoint. | 443 |
| HTTP Path * | HTTP path of the Databricks compute resource (SQL warehouse or cluster). | /sql/1.0/warehouses/saleswh |
| Catalog | The default catalog to connect to. | SalesAnalyticsDB |
| Personal Access Token * | The PAT used for authentication with AWS Databricks. | dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX |
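When a Custom JDBC URL is supplied, it carries the same Host, Port, and HTTP Path values as the individual properties. The following is a minimal sketch that splits a URL in the documented format back into those parts, which can help confirm that the URL and the separate fields agree. The function `parse_jdbc_url` is hypothetical. It assumes that `httpPath` is the first property after the port, and it tolerates whitespace after the semicolon.

```python
import re
from typing import Dict, Union

# Matches jdbc:databricks://[host]:[port];httpPath=[httpPath]
# Any further ;key=value properties after httpPath are ignored.
JDBC_URL_PATTERN = re.compile(
    r"^jdbc:databricks://(?P<host>[^:;/\s]+):(?P<port>\d+);\s*"
    r"httpPath=(?P<http_path>[^;\s]+)"
)


def parse_jdbc_url(url: str) -> Dict[str, Union[str, int]]:
    """Extract host, port, and HTTP path from a Databricks JDBC URL."""
    match = JDBC_URL_PATTERN.match(url.strip())
    if match is None:
        raise ValueError(f"not a recognized Databricks JDBC URL: {url!r}")
    return {
        "host": match.group("host"),
        "port": int(match.group("port")),
        "http_path": match.group("http_path"),
    }


parts = parse_jdbc_url(
    "jdbc:databricks://dbc-5465845314.aws.databricks.com:443;"
    "httpPath=/sql/1.0/warehouses/saleswh"
)
print(parts)
```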
AWS External OAuth Authentication
Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.
| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | Prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Databricks Native JDBC |
| Cloud Provider * | The public cloud that hosts the Databricks workspace. | Amazon Web Services |
| Databricks Environment * | The Databricks compute environment. | SQL Warehouse or Cluster |
| Custom JDBC URL | Standardized string that defines the connection details. Use the format supported by the driver: jdbc:databricks://[host]:[port];httpPath=[httpPath] | jdbc:databricks://dbc-1234567890.aws.databricks.com:443;httpPath=/sql/1.0/warehouses/saleswh |
| Type * | OAuth authentication only supports User Connections. Refer to Connections for more details. | User |
| Callback URL * | The redirect/callback URL registered with the OAuth application. | https://app.icedq.net/api/v1/oauth/callback |
| Host * | The Databricks workspace hostname. | dbc-1234567890.aws.databricks.com |
| Port * | Port number for the Databricks JDBC endpoint. | 443 |
| HTTP Path * | HTTP path of the Databricks compute resource (SQL warehouse or cluster). | /sql/1.0/warehouses/saleswh |
| Catalog | The default catalog to connect to. | SalesAnalyticsDB |
| Client ID * | The client ID of the external OAuth application. | 0a18762-789b-4cde-9abc-12345ef67890 |
| Secret ID * | The client secret of the external OAuth application. | u8w3D~xYtP-5k9LMn.7abc123xyz456 |
Custom Properties
The following optional connection properties can be configured based on user requirements.
| Property | Default Value | Possible Values | Description |
|---|---|---|---|
| Auth_AccessToken | N/A | String | The OAuth 2.0 access token used for connecting to a server. Required if AuthMech=11. |
| Auth_Flow | 0 | 0 = Token Passthrough, 1 = Client Credentials, 2 = Browser-Based | Specifies the OAuth 2.0 authentication flow type when AuthMech=11. |
| DecimalColumnScale | 10 | Integer | Maximum number of digits to the right of the decimal point for numeric types. |
| LogLevel | 0 | Integer | Enables/disables logging and specifies detail level in log files. |
| LogPath | Current working directory | String (path) | Path to the folder where log files are saved when logging is enabled. |
| ProxyAuth | N/A | 0 = No authentication, 1 = Authentication required | Specifies if the proxy server requires authentication. |
| ProxyHost | N/A | String | IP address or hostname of the proxy server. Required if connecting via proxy. |
| ProxyPort | N/A | Integer | Listening port of the proxy server. Required if connecting via proxy. |
| ProxyPWD | N/A | String | Password for accessing the proxy server. Required if proxy requires authentication. |
| ProxyUID | N/A | String | Username for accessing the proxy server. Required if proxy requires authentication. |
| RowsFetchedPerBlock | 10000 | Positive 32-bit Integer | Maximum number of rows a query returns at a time. Performance gain is marginal beyond 10000. |
| SSL | 1 | 0 = Disabled, 1 = Enabled, 2 = Enabled (two-way auth) | Specifies whether the connector communicates via SSL-enabled sockets. |
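The dependencies between the proxy properties in the table above can be checked before the properties are applied. The following is a minimal sketch, assuming the custom properties are appended to the JDBC URL as `;name=value` pairs. How iceDQ passes these properties to the driver is not stated here, so treat the URL form as an assumption. The function `with_custom_properties` is hypothetical.

```python
from typing import Mapping


def with_custom_properties(base_url: str, properties: Mapping[str, object]) -> str:
    """Append custom properties to a JDBC URL after checking proxy settings."""
    # ProxyHost and ProxyPort are both required when connecting via a proxy.
    if ("ProxyHost" in properties) != ("ProxyPort" in properties):
        raise ValueError("ProxyHost and ProxyPort must be set together")
    # ProxyUID and ProxyPWD are required when the proxy requires authentication.
    if str(properties.get("ProxyAuth", "0")) == "1":
        missing = [k for k in ("ProxyUID", "ProxyPWD") if k not in properties]
        if missing:
            raise ValueError("ProxyAuth=1 also requires: " + ", ".join(missing))
    suffix = "".join(f";{name}={value}" for name, value in properties.items())
    return base_url.rstrip(";") + suffix


example = with_custom_properties(
    "jdbc:databricks://adb-5465845314.7.azuredatabricks.net:443;httpPath=/sql/1.0/warehouses/saleswh",
    {"SSL": 1, "RowsFetchedPerBlock": 10000},
)
print(example)
```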
Supported Datatypes
The following datatypes are supported:
- ARRAY
- BIGINT
- BOOLEAN
- DATE
- DECIMAL
- DOUBLE
- FLOAT
- INT
- INTERVAL
- MAP
- SMALLINT
- STRUCT
- TIMESTAMP
- TIMESTAMP_NTZ
- TINYINT
- VARCHAR
- VOID
Note: Complex datatypes such as ARRAY, MAP, and STRUCT are read as the TEXT datatype.
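Because complex values arrive as text, two values that are logically equal can differ in key order or spacing. The following is a minimal sketch of one way to normalize such text before comparing it. It assumes the text is JSON-formatted, which this document does not state, and the function `normalize_complex_text` is hypothetical.

```python
import json
from typing import Optional


def normalize_complex_text(text: Optional[str]) -> Optional[str]:
    """Re-serialize JSON text with sorted keys and no extra whitespace."""
    if text is None:
        return None  # Preserve NULL values as-is.
    value = json.loads(text)  # Assumption: the TEXT value is valid JSON.
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


left = normalize_complex_text('{"zip": 411001, "city": "Pune"}')
right = normalize_complex_text('{"city":"Pune","zip":411001}')
print(left == right)
```

If the text is not valid JSON, `json.loads` raises an error, so a caller may prefer to fall back to a plain string comparison in that case.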
Unsupported Datatypes
The following datatypes are not supported:
- BINARY
- OBJECT
- TIMESTAMP_NANOS
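The two lists above can be used to screen a table's columns before a connection is relied on. The following is a minimal sketch that classifies a column type name against those lists. The function `classify_datatype` and its return labels are hypothetical, and type names that appear in neither list are reported as "not listed" rather than guessed.

```python
import re

# Names copied from the supported and unsupported lists in this document.
READ_AS_TEXT = {"ARRAY", "MAP", "STRUCT"}
SUPPORTED = {
    "BIGINT", "BOOLEAN", "DATE", "DECIMAL", "DOUBLE", "FLOAT", "INT",
    "INTERVAL", "SMALLINT", "TIMESTAMP", "TIMESTAMP_NTZ", "TINYINT",
    "VARCHAR", "VOID",
}
UNSUPPORTED = {"BINARY", "OBJECT", "TIMESTAMP_NANOS"}


def classify_datatype(type_name: str) -> str:
    """Classify a type such as 'DECIMAL(10,2)' or 'ARRAY<INT>' by its base name."""
    # Keep only the base name, dropping parameters like (10,2) or <INT>.
    base = re.split(r"[<(\s]", type_name.strip().upper(), maxsplit=1)[0]
    if base in UNSUPPORTED:
        return "unsupported"
    if base in READ_AS_TEXT:
        return "read as text"
    if base in SUPPORTED:
        return "supported"
    return "not listed"


for column_type in ("decimal(10,2)", "ARRAY<INT>", "BINARY"):
    print(column_type, "->", classify_datatype(column_type))
```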