Skip to main content

Hive


Apache Hive is a data warehouse system built on top of Hadoop that provides a SQL-like query language (HiveQL) to analyze and manage large datasets stored in HDFS or compatible storage systems. It simplifies big data processing by allowing users to write queries instead of complex MapReduce programs.


Prerequisites

The following prerequisites must be met to create and test a successful connection.

  • The Hive server must be accessible from iceDQ server.
  • Valid credentials to access the database.
  • Hive Cloudera JDBC version 4 or above.

Authentication Mechanisms

The following authentication mechanisms are supported.

  • Username & Password
  • Kerberos
  • Kerberos Ticket Cache

Connection Properties

Username and Password Authentication

Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.

NameDescriptionExample Values
Connection Name *       A unique name that identifies the connection.Hive_UAT_conn
Driver *Driver used to establish the connection. One driver is available by default.Cloudera Hive Native JDBC
Custom JDBC URLStandardized string used to define the connection details. Use this format supported by the driver: jdbc:hive2://[host]:[port]/[database]jdbc:hive2://hive2.acme.com:10000/acme_db
Use SSLEnables encrypted communication between iceDQ and Hive. Refer to the SSL section below for setup instructions.true/false
Host *IP address or hostname of the Hive server.hive2.acme.com
Port *Port on which the Hive server listens. Default is 10000.10000
Database *Name of the Hive database.acme_db
Type *The connection type, either System Connection or User Connection. Refer to Connections for more details.System or User
Username *Hive login username with necessary privileges.john_doe
Password *Password associated with the specified username.Admin@123

Kerberos Authentication

Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.

NameDescriptionExample Values
Connection Name *       A unique name that identifies the connection.Hive_UAT_conn
Driver *Driver used to establish the connection. By default, one driver is available.Cloudera Hive Native JDBC
Custom JDBC URLStandardized string used to define the connection details. Use this format supported by the driver: jdbc:hive2://[host]:[port]/[database]jdbc:hive2://hive2.acme.com:10000/acme_db
Kerberos Config *Path to the Kerberos configuration file (typically krb5.conf).User needs to upload the config file.
Use SSLEnables encrypted communication between iceDQ and Hive. Refer to the SSL section below for setup instructions.true/false
Service PrincipalThe unique Kerberos identity of the Hive service (format: service/host@REALM). Used to request service tickets.[email protected]
Host *IP address or hostname of the Hive server.hive2.acme.com
Port *Port on which the HiveServer2 service listens. Default is 10000.10000
Database *Name of the Hive database to connect to.acme_db
Type *The connection type, either System Connection or User Connection. Refer to Connections for more details.System or User
User Principal *Kerberos principal name of the user (e.g., user@REALM).[email protected]
User Keytab *Path to the keytab file containing encrypted credentials for the user principal.User needs to upload the Keytab file.

Kerberos Ticket Cache Authentication

Use the following properties to create a valid connection. Properties marked with an asterisk (*) are required.

NameDescriptionExample Values
Connection Name *       A unique name that identifies the connection.Hive_UAT_conn
Driver *Driver used to establish the connection. By default, one driver is available.Cloudera Hive Native JDBC
Custom JDBC URLStandardized string used to define the connection details. Use this format supported by the driver: jdbc:hive2://[host]:[port]/[database]jdbc:hive2://hive2.acme.com:10000/acme_db
Kerberos Config *Path to the Kerberos configuration file (typically krb5.conf).User needs to upload the config file.
Use SSLEnables encrypted communication between iceDQ and Hive. Refer to the SSL section below for setup instructions.true/false
Service PrincipalThe unique Kerberos identity of the Hive service (format: service/host@REALM). Used to request service tickets.[email protected]
Host *IP address or hostname of the Hive server.hive2.acme.com
Port *Port on which the HiveServer2 service listens. Default is 10000.10000
Database *Name of the Hive database to connect to.acme_db
Type *The connection type, either System Connection or User Connection. Refer to Connections for more details.System or User
User Principal *Kerberos principal name of the user (e.g., user@REALM).[email protected]
Ticket Cache File Name *Path to the Kerberos ticket cache.User needs to upload the Ticket cache file.

SSL Configuration

The Secure Sockets Layer (SSL) option enables encrypted communication between iceDQ and the Hive server. iceDQ supports SSL through a Java Truststore. Use the following properties to configure SSL. Properties marked with an asterisk (*) are required.

NameDescriptionExample Values
Java TruststoreA valid .jks file (Java Keystore) containing trusted SSL certificates.User needs to upload the Truststore file.
Java Truststore PasswordPassword used to access the Java Truststore. Optional if the truststore does not require one.my_password

JAAS Properties

With respect to Kerberos authentication, JAAS properties refer to configuration settings used by the Java Authentication and Authorization Service (JAAS) to authenticate a user or application using Kerberos. For a comprehensive list of supported properties, refer to the JAAS Reference section of the Cloudera documentation.


Custom Properties

Custom properties are optional connection parameters in the Hive driver that allow customization of settings such as timeouts and proxy configurations. A list of supported properties is available here. The availability and behavior of custom connection properties may vary depending on the Hive JDBC driver version in use.


Unsupported Datatypes

The following datatype is not supported:

  • BINARY

Note: Complex datatypes such as ARRAY, MAP etc. are read as TEXT datatypes.