Impala
Apache Impala is a massively parallel processing (MPP) SQL query engine for Hadoop that enables low-latency, interactive analysis directly on data stored in HDFS and Apache Hive. It provides high performance for BI and analytics by avoiding MapReduce and using native execution.
Prerequisites
The following prerequisites must be met for a user to create and test a successful connection.
- The Impala server must be accessible from the iceDQ server.
- Valid credentials to access the database.
- Impala JDBC driver supporting JDBC 4.2 or above.
Authentication Mechanisms
The following authentication mechanisms are supported.
- Username & Password
- Kerberos
- Kerberos Ticket Cache
Connection Properties
Username and Password Authentication
Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.
Name | Description | Example Values |
---|---|---|
Connection Name * | Name that uniquely identifies the connection. | Impala_Prod_conn |
Driver * | Driver used to establish the connection. One driver is available by default. | Cloudera Impala Native JDBC |
Custom JDBC URL | Standardized string used to define the connection details. Use the format supported by the driver: jdbc:impala://[host]:[port]/[database] | jdbc:impala://192.168.61.90:21050/dev_db |
Use SSL | Enables Secure Sockets Layer (SSL) encrypted communication between iceDQ and Impala. Refer to the SSL Configuration section below for setup instructions. | |
Host * | IP address or hostname of the Impala server. | impala.cloudera.acme.com or 192.168.26.45 |
Port * | Port on which the Impala Server listens. Default is 21050 for Impala. | 21050 |
Database * | Name of the Impala database. | dev_db |
Type * | The connection type – either System Connection or User Connection. Refer to Connections for more details. | System or User |
Username * | Impala login username with necessary privileges. | john_doe |
Password * | Password associated with the specified username. | impala_password |
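As an illustration of how the Host, Port, and Database properties map onto the Custom JDBC URL format above, here is a minimal sketch. The function name `build_impala_jdbc_url` is hypothetical and is not part of iceDQ or the driver; it only assembles the string.

```python
def build_impala_jdbc_url(host: str, port: int = 21050, database: str = "default") -> str:
    """Assemble a URL in the jdbc:impala://[host]:[port]/[database] format.

    Hypothetical helper for illustration only; 21050 is the default
    Impala port listed in the table above.
    """
    if not host:
        raise ValueError("host is required")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:impala://{host}:{port}/{database}"


# Reproduces the example value from the table.
print(build_impala_jdbc_url("192.168.61.90", 21050, "dev_db"))
# prints jdbc:impala://192.168.61.90:21050/dev_db
```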
Kerberos Authentication
Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.
Name | Description | Example Values |
---|---|---|
Connection Name * | Name that uniquely identifies the connection. | Impala_Dev_Conn |
Driver * | Driver used to establish the connection. By default, one driver is available. | Cloudera Impala Native JDBC |
Custom JDBC URL | Full JDBC URL for the connection. Optional if Host, Port, and Database are provided separately. Example format: jdbc:impala://[host]:[port]/[database];KrbHostFQDN=[host];sslTrustStore=${ssl_trust_store_path};SSLTrustStorePwd=${trust_store_password} | jdbc:impala://impala.acme.com:21050/dev_db; |
Kerberos Config * | Path to the Kerberos configuration file (typically krb5.conf). | User needs to upload the config file. |
Use SSL | Enables Secure Sockets Layer (SSL) encrypted communication between iceDQ and Impala. Refer to the SSL Configuration section below for setup instructions. | |
Service Principal | The unique Kerberos identity of the Impala service (usually in the format service/host@REALM). Used to request service tickets. | impala/impala.[email protected] |
Host * | IP address or hostname of the Impala server. | impala.acme.com |
Port * | Port on which the Impala daemon listens. Default is 21050. | 21050 |
Database * | Name of the Impala database to connect to. | dev_db |
Type * | The connection type – either System Connection or User Connection. Refer to Connections for more details. | System or User |
User Principal * | Kerberos principal name of the user (e.g., user@REALM). | [email protected] |
User Keytab * | Path to the keytab file that contains the encrypted credentials of the user principal. Used for authentication. | User needs to upload the keytab file. |
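The service/host@REALM format and the KrbHostFQDN URL property described above can be sketched as follows. The helper names and the principal `impala/impala.example.com@EXAMPLE.COM` are hypothetical placeholders chosen for illustration; only the `KrbHostFQDN` property name comes from the example format in the table.

```python
from typing import NamedTuple


class ServicePrincipal(NamedTuple):
    service: str
    host: str
    realm: str


def parse_service_principal(principal: str) -> ServicePrincipal:
    """Split a principal written as service/host@REALM into its parts."""
    name, at, realm = principal.rpartition("@")
    if not at or not realm:
        raise ValueError(f"missing @REALM in {principal!r}")
    service, slash, host = name.partition("/")
    if not slash or not service or not host:
        raise ValueError(f"expected service/host before @ in {principal!r}")
    return ServicePrincipal(service, host, realm)


def kerberos_jdbc_url(host: str, port: int, database: str, krb_host_fqdn: str) -> str:
    """Append the KrbHostFQDN property to the base jdbc:impala URL."""
    return f"jdbc:impala://{host}:{port}/{database};KrbHostFQDN={krb_host_fqdn}"


# Hypothetical principal, not taken from a real deployment.
sp = parse_service_principal("impala/impala.example.com@EXAMPLE.COM")
print(kerberos_jdbc_url(sp.host, 21050, "dev_db", sp.host))
```

Parsing the principal first makes it easy to confirm that the host part matches the fully qualified name passed as KrbHostFQDN, a common source of ticket request failures.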
Kerberos Ticket Cache Authentication
Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.
Name | Description | Example Values |
---|---|---|
Connection Name * | Name that uniquely identifies the connection. | Impala_Demo_Conn |
Driver * | Driver used to establish the connection. By default, one driver is available. | Cloudera Impala Native JDBC |
Custom JDBC URL | Full JDBC URL for the connection. Optional if Host, Port, and Database are provided separately. Example format: jdbc:impala://[host]:[port]/[database];KrbHostFQDN=[host];sslTrustStore=${ssl_trust_store_path};SSLTrustStorePwd=${trust_store_password} | jdbc:impala://impala.acme.com:21050/default; |
Kerberos Config * | Path to the Kerberos configuration file (typically krb5.conf). | User needs to upload the config file. |
Use SSL | Enables Secure Sockets Layer (SSL) encrypted communication between iceDQ and Impala. Refer to the SSL Configuration section below for setup instructions. | |
Service Principal | The unique Kerberos identity of the Impala service (usually in the format service/host@REALM). Used to request service tickets. | impala/impala.[email protected] |
Host * | IP address or hostname of the Impala server. | impala.acme.com |
Port * | Port on which the Impala daemon listens. Default is 21050. | 21050 |
Database * | Name of the Impala database to connect to. | default |
Type * | The connection type – either System Connection or User Connection. Refer to Connections for more details. | System or User |
User Principal * | Kerberos principal name of the user (e.g., user@REALM). | [email protected] |
Ticket Cache File Name * | Path to the Kerberos ticket cache file. | User needs to upload the ticket cache file. |
SSL Configuration
The Secure Sockets Layer (SSL) option enables encrypted communication between iceDQ and the Impala server. iceDQ supports SSL through a Java Truststore. Use the following properties to configure SSL. Properties marked with an asterisk (*) are required.
Name | Description | Example Values |
---|---|---|
Java Truststore | A valid .jks file (Java Keystore) containing trusted SSL certificates. | User needs to upload the Truststore file. |
Java Truststore Password | Password used to access the Java Truststore. Optional if the truststore does not require one. | my_password |
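For connections that use a Custom JDBC URL, the truststore settings correspond to the `sslTrustStore` and `SSLTrustStorePwd` properties shown in the example URL format on this page. The sketch below appends them to an existing URL. The function name and the truststore path are hypothetical, and a given driver version may need additional SSL properties that are not covered here.

```python
from typing import Optional


def append_truststore_properties(
    jdbc_url: str,
    truststore_path: str,
    truststore_password: Optional[str] = None,
) -> str:
    """Append truststore properties to a jdbc:impala URL as ;key=value pairs."""
    # Drop a trailing ';' so the joined result has no empty segment.
    parts = [jdbc_url.rstrip(";"), f"sslTrustStore={truststore_path}"]
    # The password is optional when the truststore does not require one.
    if truststore_password:
        parts.append(f"SSLTrustStorePwd={truststore_password}")
    return ";".join(parts)


# Hypothetical truststore path; the password is the example value from the table.
print(
    append_truststore_properties(
        "jdbc:impala://impala.acme.com:21050/dev_db;",
        "/opt/certs/truststore.jks",
        "my_password",
    )
)
```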
JAAS Properties
In the context of Kerberos authentication, JAAS properties are the configuration settings that the Java Authentication and Authorization Service (JAAS) uses to authenticate a user or application. For a comprehensive list of supported properties, refer to the JAAS Reference section of the Cloudera documentation.
Custom Properties
Custom properties are optional connection parameters of the Impala driver that allow customization of settings such as timeouts and proxy configurations. A list of supported properties is available in the driver documentation. The availability and behavior of custom connection properties may vary depending on the Impala JDBC driver version in use.
Supported Datatypes
The following datatypes are supported:
- ARRAY
- BIGINT
- BINARY
- BOOLEAN
- CHAR
- DATE
- DECIMAL
- DOUBLE
- FLOAT
- INT
- MAP
- SMALLINT
- STRUCT
- TIMESTAMP
- TINYINT
- VARCHAR
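One way to use this list is to check a table's declared column types before building rules against it. The sketch below is a hypothetical helper, not an iceDQ feature. It compares only the leading type keyword, so parameters such as `(10,2)` and nested element types such as `<INT>` are ignored.

```python
import re

# Datatypes taken from the list above.
SUPPORTED_DATATYPES = frozenset({
    "ARRAY", "BIGINT", "BINARY", "BOOLEAN", "CHAR", "DATE", "DECIMAL",
    "DOUBLE", "FLOAT", "INT", "MAP", "SMALLINT", "STRUCT", "TIMESTAMP",
    "TINYINT", "VARCHAR",
})


def base_type(declared: str) -> str:
    """Return the leading type keyword, e.g. 'decimal(10,2)' -> 'DECIMAL'."""
    match = re.match(r"\s*([A-Za-z_]+)", declared)
    if not match:
        raise ValueError(f"cannot read a type name from {declared!r}")
    return match.group(1).upper()


def unsupported_columns(columns: dict) -> list:
    """List column names whose base type is not in SUPPORTED_DATATYPES."""
    return [
        name
        for name, declared in columns.items()
        if base_type(declared) not in SUPPORTED_DATATYPES
    ]


# Hypothetical table definition; GEOMETRY stands in for any unlisted type.
columns = {
    "id": "BIGINT",
    "amount": "decimal(10,2)",
    "tags": "ARRAY<INT>",
    "shape": "GEOMETRY",
}
print(unsupported_columns(columns))
# prints ['shape']
```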