Databricks


The Databricks Data Intelligence Platform is a unified data analytics and AI platform built on Apache Spark that enables organizations to process, analyze, and visualize large-scale data. It provides collaborative notebooks, integrated machine learning tools, and seamless connectivity to cloud data lakes and warehouses, making it well suited to big data and AI-driven workflows.


Prerequisites

The following prerequisites must be met before a user can create and test a connection.

  • The Databricks server must be accessible from the iceDQ server.
  • Valid credentials to access Databricks.
  • Databricks JDBC driver version 2.7 or above.

Cloud Providers

Databricks can be connected to the following cloud platforms.

  • Microsoft Azure
  • Amazon Web Services

Authentication Mechanisms

The following authentication mechanisms are supported.

  • Username and Password
  • External OAuth

Connection Properties

Azure Username and Password Authentication

Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.

| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | Prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Databricks Native JDBC |
| Cloud Provider * | Public cloud of Databricks. | Microsoft Azure |
| Databricks Environment * | The Databricks compute environment. | SQL Warehouse or Cluster |
| Custom JDBC URL | Standardized string used to define the connection details. Use the format supported by the driver: `jdbc:databricks://[host]:[port];httpPath=[httpPath]` | jdbc:databricks://adb-5465845314.7.azuredatabricks.net:443;httpPath=/sql/1.0/warehouses/saleswh |
| Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System |
| Host * | The Databricks workspace hostname. | adb-5465845314.7.azuredatabricks.net |
| Port * | Port number for the Databricks JDBC endpoint. | 443 |
| HTTP Path * | Path for the Databricks REST API endpoint (warehouse or cluster). | /sql/1.0/warehouses/saleswh |
| Catalog | The default catalog to connect to. | SalesAnalyticsDB |
| Personal Access Token * | The PAT used for authentication with Azure Databricks. | dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX |
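
When no Custom JDBC URL is supplied, it can be derived from the Host, Port, and HTTP Path values using the format shown above. A minimal Python sketch of that assembly (the function name and example values are illustrative, not part of the product):

```python
def build_databricks_jdbc_url(host: str, port: int, http_path: str) -> str:
    """Assemble a Databricks JDBC URL in the documented format:
    jdbc:databricks://[host]:[port];httpPath=[httpPath]"""
    return f"jdbc:databricks://{host}:{port};httpPath={http_path}"

# Example using the Azure workspace values from the table above.
url = build_databricks_jdbc_url(
    "adb-5465845314.7.azuredatabricks.net", 443, "/sql/1.0/warehouses/saleswh"
)
print(url)
```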

Azure External OAuth Authentication

Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.

| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | Prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Databricks Native JDBC |
| Cloud Provider * | Public cloud of Databricks. | Microsoft Azure |
| Databricks Environment * | The Databricks compute environment. | SQL Warehouse or Cluster |
| Custom JDBC URL | Standardized string used to define the connection details. Use the format supported by the driver: `jdbc:databricks://[host]:[port];httpPath=[httpPath]` | jdbc:databricks://adb-5465845314.7.azuredatabricks.net:443;httpPath=/sql/1.0/warehouses/saleswh |
| Type * | OAuth authentication only supports User Connections. Refer to Connections for more details. | User |
| Callback URL * | The redirect/callback URL registered with the OAuth application. | https://app.icedq.net/api/v1/oauth/callback |
| Host * | The Databricks workspace hostname. | adb-5465845314.7.azuredatabricks.net |
| Port * | Port number for the Databricks JDBC endpoint. | 443 |
| HTTP Path * | Path for the Databricks REST API endpoint (warehouse or cluster). | /sql/1.0/warehouses/saleswh |
| Catalog | The default catalog to connect to. | SalesAnalyticsDB |
| Client ID * | The client ID of the external OAuth application. | 0a18762-789b-4cde-9abc-12345ef67890 |
| Secret ID * | The client secret of the external OAuth application. | u8w3D~xYtP-5k9LMn.7abc123xyz456 |

AWS Username and Password Authentication

Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.

| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | Prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Databricks Native JDBC |
| Cloud Provider * | Public cloud of Databricks. | Amazon Web Services |
| Databricks Environment * | The Databricks compute environment. | SQL Warehouse or Cluster |
| Custom JDBC URL | Standardized string used to define the connection details. Use the format supported by the driver: `jdbc:databricks://[host]:[port];httpPath=[httpPath]` | jdbc:databricks://dbc-5465845314.aws.databricks.com:443;httpPath=/sql/1.0/warehouses/saleswh |
| Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System |
| Host * | The Databricks workspace hostname. | dbc-5465845314.aws.databricks.com |
| Port * | Port number for the Databricks JDBC endpoint. | 443 |
| HTTP Path * | Path for the Databricks REST API endpoint (warehouse or cluster). | /sql/1.0/warehouses/saleswh |
| Catalog | The default catalog to connect to. | SalesAnalyticsDB |
| Personal Access Token * | The PAT used for authentication with AWS Databricks. | dapiXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX |

AWS External OAuth Authentication

Use one or more properties from the table below to create a valid connection. Properties marked with an asterisk (*) are required.

| Name | Description | Example Values |
|---|---|---|
| Connection Name * | A unique name that identifies the connection. | Prod_sales_data |
| Driver * | Driver used to establish the connection. By default, one driver is available. | Databricks Native JDBC |
| Cloud Provider * | Public cloud of Databricks. | Amazon Web Services |
| Databricks Environment * | The Databricks compute environment. | SQL Warehouse or Cluster |
| Custom JDBC URL | Standardized string used to define the connection details. Use the format supported by the driver: `jdbc:databricks://[host]:[port];httpPath=[httpPath]` | jdbc:databricks://dbc-1234567890.aws.databricks.com:443;httpPath=/sql/1.0/warehouses/saleswh |
| Type * | OAuth authentication only supports User Connections. Refer to Connections for more details. | User |
| Callback URL * | The redirect/callback URL registered with the OAuth application. | https://app.icedq.net/api/v1/oauth/callback |
| Host * | The Databricks workspace hostname. | dbc-1234567890.aws.databricks.com |
| Port * | Port number for the Databricks JDBC endpoint. | 443 |
| HTTP Path * | Path for the Databricks REST API endpoint (warehouse or cluster). | /sql/1.0/warehouses/saleswh |
| Catalog | The default catalog to connect to. | SalesAnalyticsDB |
| Client ID * | The client ID of the external OAuth application. | 0a18762-789b-4cde-9abc-12345ef67890 |
| Secret ID * | The client secret of the external OAuth application. | u8w3D~xYtP-5k9LMn.7abc123xyz456 |

Custom Properties

The following optional connection properties can be configured based on user requirements.

| Property | Default Value | Possible Values | Description |
|---|---|---|---|
| Auth_AccessToken | N/A | String | The OAuth 2.0 access token used for connecting to a server. Required if AuthMech=11. |
| Auth_Flow | 0 | 0 = Token Passthrough, 3 = Username & Password, 11 = OAuth 2 | Specifies the OAuth authentication flow type when AuthMech=11. |
| DecimalColumnScale | 10 | Integer | Maximum number of digits to the right of the decimal point for numeric types. |
| LogLevel | 0 | Integer | Enables/disables logging and specifies the detail level in log files. |
| LogPath | Current working directory | String (path) | Path to the folder where log files are saved when logging is enabled. |
| ProxyAuth | N/A | 0 = No authentication, 1 = Authentication required | Specifies whether the proxy server requires authentication. |
| ProxyHost | N/A | String | IP address or hostname of the proxy server. Required if connecting via proxy. |
| ProxyPort | N/A | Integer | Listening port of the proxy server. Required if connecting via proxy. |
| ProxyPWD | N/A | String | Password for accessing the proxy server. Required if the proxy requires authentication. |
| ProxyUID | N/A | String | Username for accessing the proxy server. Required if the proxy requires authentication. |
| RowsFetchedPerBlock | 10000 | Positive 32-bit Integer | Maximum number of rows a query returns at a time. Performance gain is marginal beyond 10000. |
| SSL | 1 | 0 = Disabled, 1 = Enabled, 2 = Enabled (two-way auth) | Specifies whether the connector communicates via SSL-enabled sockets. |
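
The Databricks JDBC driver generally accepts such properties as additional semicolon-separated key=value pairs appended to the JDBC URL. A sketch of that assembly, assuming this convention (the helper name is illustrative; verify property spellings against the driver documentation):

```python
def jdbc_url_with_properties(base_url: str, props: dict) -> str:
    """Append optional connector properties (e.g. SSL, LogLevel) to a
    Databricks JDBC URL as semicolon-separated key=value pairs."""
    if not props:
        return base_url
    extras = ";".join(f"{key}={value}" for key, value in props.items())
    return f"{base_url};{extras}"

base = "jdbc:databricks://adb-5465845314.7.azuredatabricks.net:443;httpPath=/sql/1.0/warehouses/saleswh"
url = jdbc_url_with_properties(base, {"SSL": "1", "RowsFetchedPerBlock": "10000"})
print(url)
```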

Supported Datatypes

The following datatypes are supported:

  • ARRAY
  • BIGINT
  • BOOLEAN
  • DATE
  • DECIMAL
  • DOUBLE
  • FLOAT
  • INT
  • INTERVAL
  • MAP
  • SMALLINT
  • STRUCT
  • TIMESTAMP
  • TIMESTAMP_NTZ
  • TINYINT
  • VARCHAR
  • VOID

Note: Complex datatypes such as ARRAY, MAP, and STRUCT are read as the TEXT datatype.
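
Because complex columns come back as TEXT, it can help to serialize them deterministically on the Databricks side, for example with the Spark SQL `to_json()` function, before comparing values. A sketch that builds such a query (the table and column names are hypothetical):

```python
def select_complex_as_json(table, simple_cols, complex_cols):
    """Build a Databricks SQL query that serializes complex columns
    (ARRAY/MAP/STRUCT) to JSON strings via to_json(), so their text
    representation is stable when read back as TEXT."""
    parts = list(simple_cols) + [f"to_json({c}) AS {c}" for c in complex_cols]
    return f"SELECT {', '.join(parts)} FROM {table}"

# Hypothetical example: 'items' is an ARRAY column in sales.orders.
sql = select_complex_as_json("sales.orders", ["order_id"], ["items"])
print(sql)
```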


Unsupported Datatypes

The following datatypes are not supported:

  • BINARY
  • OBJECT
  • TIMESTAMP_NANOS