1. Download the JDBC drivers
The first step is to download the latest Cloudera JDBC driver for Impala from the Cloudera website. Extract the downloaded zip file and you will find two more zip files inside, containing the JDBC JAR files ImpalaJDBC4.jar and ImpalaJDBC41.jar.
For this post, I am using version 2.6.4 of the driver with ImpalaJDBC41.jar.
2. TLS Certificate Truststore
If your cluster has TLS enabled, you will need the root and intermediate CA certificates in Java keystore format to use as your truststore.
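If the CA certificates are only available as PEM files, one common way to get them into a truststore is the JDK's keytool. The sketch below only builds the `keytool -importcert` command lines, one per certificate, so they can be reviewed before being run. The aliases, file names and password are placeholders I made up for illustration, and the function name is hypothetical.

```python
from typing import Dict, List


def keytool_import_commands(
    cert_files: Dict[str, str], truststore: str, storepass: str
) -> List[List[str]]:
    """Build one `keytool -importcert` command per CA certificate.

    cert_files maps a keystore alias to the path of a PEM certificate.
    Nothing is executed here; the commands are returned as argument lists.
    """
    commands = []
    for alias, pem_path in cert_files.items():
        commands.append([
            "keytool", "-importcert",
            "-alias", alias,
            "-file", pem_path,
            "-keystore", truststore,
            "-storetype", "JKS",      # newer JDKs default to PKCS12
            "-storepass", storepass,
            "-noprompt",              # do not ask to confirm trust
        ])
    return commands


if __name__ == "__main__":
    # Placeholder aliases, paths and password (assumptions, not real values).
    certs = {"root-ca": "root-ca.pem", "intermediate-ca": "intermediate-ca.pem"}
    for cmd in keytool_import_commands(certs, "truststore.jks", "changeit"):
        print(" ".join(cmd))
```

Each printed line can be pasted into a terminal, or the argument lists can be passed to `subprocess.run` once you are happy with them.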
3. Create a New Connection in DBeaver
Start DBeaver on your laptop and create a new connection:
Navigate to Hadoop and select Cloudera Impala:
Enter the following information, as shown in the screenshot:
- Host - hostname of the Impala daemon (coordinator), or the load balancer if there is one. The hostname has to be fully qualified (FQDN) for Kerberos to work.
- Database/Schema - the name of the Impala database to connect to.
We also need to configure the JDBC connection string to include the Kerberos and TLS properties. Click on "Connection Properties" and add the properties as shown in the screenshot below:
- AuthMech - the authentication mechanism to use. Value of 1 indicates Kerberos.
- KrbHostFQDN - the fully qualified hostname (FQDN) of the host you are connecting to. Use the FQDN of the load balancer if you are connecting through one.
- KrbRealm - Kerberos realm of the Cloudera cluster.
- KrbServiceName - the Kerberos service name. In this case it is impala.
- SSL - set to 1 if the Cloudera cluster has SSL/TLS enabled.
- SSLTrustStore - the path to the truststore in Java keystore format that contains the root CA certificate and any intermediate CA certificates.
- SSLTrustStorePwd - the truststore password.
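DBeaver assembles the connection string for you, but it can help to see what the properties above amount to when written out as a single JDBC URL, for example if you want to reuse them in another tool. This is a minimal sketch: the hostname, realm, truststore path and password are placeholders, the function name is hypothetical, and port 21050 is assumed as the usual Impala JDBC port.

```python
from typing import Dict


def build_impala_jdbc_url(
    host: str, database: str, properties: Dict[str, str], port: int = 21050
) -> str:
    """Join host, database and driver properties into one JDBC URL.

    The Cloudera driver accepts properties as `;key=value` pairs appended
    to `jdbc:impala://host:port/database`.
    """
    parts = [f"jdbc:impala://{host}:{port}/{database}"]
    parts += [f"{key}={value}" for key, value in properties.items()]
    return ";".join(parts)


if __name__ == "__main__":
    # All values below are placeholders (assumptions), not real settings.
    props = {
        "AuthMech": "1",                          # 1 = Kerberos
        "KrbHostFQDN": "impala-lb.example.com",
        "KrbRealm": "EXAMPLE.COM",
        "KrbServiceName": "impala",
        "SSL": "1",
        "SSLTrustStore": "/path/to/truststore.jks",
        "SSLTrustStorePwd": "changeit",
    }
    print(build_impala_jdbc_url("impala-lb.example.com", "default", props))
```

Keeping the properties in a dictionary makes it easy to add or drop individual settings, such as the SSL entries for a cluster without TLS.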
Once the properties are set, click "Test Connection" and DBeaver should report a successful connection. Remember that you need to run kinit first to get a valid Kerberos ticket.
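If the test fails, a missing or expired Kerberos ticket is worth ruling out first. The sketch below checks for a valid ticket by calling `klist -s`, which exits with status 0 when the credential cache holds unexpired credentials. It assumes an MIT Kerberos style klist on the PATH; the function name is hypothetical.

```python
import shutil
import subprocess


def has_valid_kerberos_ticket() -> bool:
    """Return True if `klist -s` reports a valid, unexpired ticket.

    Returns False when klist is not installed or no valid ticket exists.
    """
    if shutil.which("klist") is None:
        return False
    result = subprocess.run(
        ["klist", "-s"],              # -s: silent, report via exit status only
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    if has_valid_kerberos_ticket():
        print("Kerberos ticket found")
    else:
        print("No valid Kerberos ticket, run kinit first")
```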
4. Troubleshooting
If for any reason you are having issues connecting with the JDBC driver, you can add the following properties to enable logging in the driver:
- LogLevel - set the value to 6.
- LogPath - the path to the directory where the driver writes its logs.
A new log file is created for each connection attempt, with the filename Impala_connection_XX.log, where XX is an incrementing number.
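Since every attempt produces a new file, the log directory fills up quickly while you experiment. A small helper like the one below can pick out the most recently written connection log. It is a sketch that relies only on the filename pattern described above; the function name is hypothetical and the directory is whatever you set as LogPath.

```python
import glob
import os
from typing import Optional


def latest_connection_log(log_dir: str) -> Optional[str]:
    """Return the most recently modified Impala_connection_*.log in log_dir.

    Returns None if the directory contains no matching log file.
    """
    pattern = os.path.join(log_dir, "Impala_connection_*.log")
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    # Use the modification time rather than the number in the filename.
    return max(candidates, key=os.path.getmtime)


if __name__ == "__main__":
    # "/tmp/impala-jdbc-logs" is a placeholder for your LogPath value.
    print(latest_connection_log("/tmp/impala-jdbc-logs"))
```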