Hive
Apache Hive is a data warehouse system built on top of Hadoop that provides a SQL-like query language (HiveQL) to analyze and manage large datasets stored in HDFS or compatible storage systems. It simplifies big data processing by allowing users to write queries instead of complex MapReduce programs.
Prerequisites
The following prerequisites must be met to create and test a successful connection.
- The Hive server must be accessible from iceDQ server.
- Valid credentials to access the database.
- Hive Cloudera JDBC version 4 or above.
Authentication Mechanisms
The following authentication mechanisms are supported.
- Username & Password
- Kerberos
- Kerberos Ticket Cache
Connection Properties
Username and Password Authentication
Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.
Name | Description | Example Values |
---|---|---|
Connection Name * | A unique name that identifies the connection. | Hive_UAT_conn |
Driver * | Driver used to establish the connection. One driver is available by default. | Cloudera Hive Native JDBC |
Custom JDBC URL | Standardized string used to define the connection details. Use this format supported by the driver: jdbc:hive2://[host]:[port]/[database] | jdbc:hive2://hive2.acme.com:10000/acme_db |
Use SSL | Enables encrypted communication between iceDQ and Hive. Refer to the SSL section below for setup instructions. | true/false |
Host * | IP address or hostname of the Hive server. | hive2.acme.com |
Port * | Port on which the Hive server listens. Default is 10000. | 10000 |
Database * | Name of the Hive database. | acme_db |
Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System or User |
Username * | Hive login username with necessary privileges. | john_doe |
Password * | Password associated with the specified username. | Admin@123 |
Kerberos Authentication
Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.
Name | Description | Example Values |
---|---|---|
Connection Name * | A unique name that identifies the connection. | Hive_UAT_conn |
Driver * | Driver used to establish the connection. By default, one driver is available. | Cloudera Hive Native JDBC |
Custom JDBC URL | Standardized string used to define the connection details. Use this format supported by the driver: jdbc:hive2://[host]:[port]/[database] | jdbc:hive2://hive2.acme.com:10000/acme_db |
Kerberos Config * | Path to the Kerberos configuration file (typically krb5.conf ). | User needs to upload the config file. |
Use SSL | Enables encrypted communication between iceDQ and Hive. Refer to the SSL section below for setup instructions. | true/false |
Service Principal | The unique Kerberos identity of the Hive service (format: service/host@REALM ). Used to request service tickets. | [email protected] |
Host * | IP address or hostname of the Hive server. | hive2.acme.com |
Port * | Port on which the HiveServer2 service listens. Default is 10000. | 10000 |
Database * | Name of the Hive database to connect to. | acme_db |
Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System or User |
User Principal * | Kerberos principal name of the user (e.g., user@REALM). | [email protected] |
User Keytab * | Path to the keytab file containing encrypted credentials for the user principal. | User needs to upload the Keytab file. |
Kerberos Ticket Cache Authentication
Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.
Name | Description | Example Values |
---|---|---|
Connection Name * | A unique name that identifies the connection. | Hive_UAT_conn |
Driver * | Driver used to establish the connection. By default, one driver is available. | Cloudera Hive Native JDBC |
Custom JDBC URL | Standardized string used to define the connection details. Use this format supported by the driver: jdbc:hive2://[host]:[port]/[database] | jdbc:hive2://hive2.acme.com:10000/acme_db |
Kerberos Config * | Path to the Kerberos configuration file (typically krb5.conf ). | User needs to upload the config file. |
Use SSL | Enables encrypted communication between iceDQ and Hive. Refer to the SSL section below for setup instructions. | true/false |
Service Principal | The unique Kerberos identity of the Hive service (format: service/host@REALM ). Used to request service tickets. | [email protected] |
Host * | IP address or hostname of the Hive server. | hive2.acme.com |
Port * | Port on which the HiveServer2 service listens. Default is 10000. | 10000 |
Database * | Name of the Hive database to connect to. | acme_db |
Type * | The connection type, either System Connection or User Connection. Refer to Connections for more details. | System or User |
User Principal * | Kerberos principal name of the user (e.g., user@REALM). | [email protected] |
Ticket Cache File Name * | Path to the Kerberos ticket cache. | User needs to upload the Ticket cache file. |
SSL Configuration
The Secure Sockets Layer (SSL) option enables encrypted communication between iceDQ and the Hive server. iceDQ supports SSL through a Java Truststore. Use the following properties to configure SSL. Properties marked with an asterisk (*) are required.
Name | Description | Example Values |
---|---|---|
Java Truststore | A valid .jks file (Java Keystore) containing trusted SSL certificates. | User needs to upload the Truststore file. |
Java Truststore Password | Password used to access the Java Truststore. Optional if the truststore does not require one. | my_password |
JAAS Properties
With respect to Kerberos authentication, JAAS properties refer to configuration settings used by the Java Authentication and Authorization Service (JAAS) to authenticate a user or application using Kerberos. For a comprehensive list of supported properties, refer to the JAAS Reference section of the Cloudera documentation.
Custom Properties
Custom properties are optional connection parameters in the Hive driver that allow customization of settings such as timeouts and proxy configurations. A list of supported properties is available here. The availability and behavior of custom connection properties may vary depending on the Hive JDBC driver version in use.
Unsupported Datatypes
The following datatype is not supported:
- BINARY
Note: Complex datatypes such as ARRAY, MAP etc. are read as TEXT datatypes.