
HiveServer2 with Beeline

Currently, the DgDecrypter 5.2 API is supported for use with Hortonworks 2.1, Pivotal 3.0, and CDH 5.4 & 5.5. While deployment is similar across these platforms, each environment has slight differences that must be accounted for.

For clarity, the configuration instructions for each distribution platform are covered separately. Section 2.1 covers Hortonworks configurations, Section 2.2 covers configurations for Cloudera, and Section 2.3 covers configurations for Pivotal.

Before discussing platform-specific configurations, we first cover the available UDFs, pre-hooks, and general assumptions this document makes concerning Hive decryption.

If you are using Hive CLI or HUE, please contact PKWARE support.

Hive UDFs

PKWARE Hive user-defined functions (UDFs) perform decryption in MapReduce or in HiveDriver (local mode).

com.dataguise.decrypterlib.hive.udf.DgDecrypter
  Custom UDF called in a Hive query.

com.dataguise.decrypterlib.hive.udf.DgDecrypterLocal
  Custom UDF called in a Hive query that runs in local mode (no hooks needed).

com.dataguise.decrypterlib.hive.udf.DgDecrypter6
  Custom UDF requiring no high-level steps or pre-hooks.
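
Once registered, each UDF is called like any other Hive function. The sketch below is illustrative only: the class names are as listed above, but the function names, the Temp table, and the CCNDigits column are example placeholders taken from the sample queries later in this document.

CODE
-- Register the UDFs for this session (registration steps are covered in the platform sections below)
Create Temporary Function DgDecrypter AS 'com.dataguise.decrypterlib.hive.udf.DgDecrypter';
Create Temporary Function DgDecrypterLocal AS 'com.dataguise.decrypterlib.hive.udf.DgDecrypterLocal';
-- DgDecrypter runs in MapReduce and needs the jar and pre-hook setup described below
Select DgDecrypter(CCNDigits) From Temp;
-- DgDecrypterLocal runs in HiveDriver local mode and needs no hooks
Select DgDecrypterLocal(CCNDigits) From Temp;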

Hive Pre-Hooks

PKWARE Hive pre-hooks run in HiveDriver before a Hive query is executed or a function is created.

com.dataguise.decrypterlib.hive.udf.DgHivePreHook
  Reads dgSecure.properties and generates the parameters for the Hive UDFs.
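
For reference, the pre-hook is enabled per Beeline session with the same commands shown in each platform section below; in those steps the DgDecrypter.jar is added to the session first (the jar path here is a placeholder):

CODE
Add jar <JarPath/DgDecrypter.jar>;
Set hive.exec.pre.hooks = com.dataguise.decrypterlib.hive.udf.DgHivePreHook;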

DgDecrypter6 Configurations

In PK Protect, open the HDFSIDPConfig.properties file at {Installation Path}/DgSecure/IDPs/HDFSIDP/expandedArchive/WEB-INF/classes/com/dataguise/hadoop/util/HDFSIDPConfig.properties:

  1. Enable decrypter use by the IDP by setting the property OTF.decryption.configuration to "Y". To use this decrypter, the property must be set to "Y" before the initial encryption takes place.
    OTF.decryption.configuration=Y

  2. Specify a folder for the decryption metadata: OTF.decryption.metadata.path=/tmp/dataguise$

  3. Run an encryption task and load the encrypted file into Hive.

  4. Log in to HiveServer2 and create a permanent function:
    Create Function dg_permanent AS 'com.dataguise.decrypterlib.hive.udf.DgDecrypter6' using jar 'hdfs:///tmp/dataguise$/DgDecrypter.jar';

  5. Run queries.
    Select dg_permanent(Column) from Table;
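
Taken together, steps 4 and 5 form a short Beeline session like the sketch below; dg_permanent is the example function name from step 4, and the table and column names are placeholders for your data:

CODE
-- Step 4: register the permanent function (unlike a temporary function, it persists across sessions)
Create Function dg_permanent AS 'com.dataguise.decrypterlib.hive.udf.DgDecrypter6'
  using jar 'hdfs:///tmp/dataguise$/DgDecrypter.jar';
-- Step 5: query the encrypted table; DgDecrypter6 needs no Add jar or pre-hook
Select dg_permanent(Column) from Table;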

Limitations:

  1. This new UDF applies to the PKWARE default keystore only. For SafeNet, KMIP, or RSA keystores, please use the other UDFs.

  2. Hive must be configured on the same cluster on which the PK Protect encryption task is running.

Assumptions:

  1. User can successfully connect to HiveServer2 from Beeline (see the example connection after this list).

  2. User can successfully execute Hive queries from Beeline.

  3. User has added the JAR, set the HivePreHook, and created the function in Ambari or Cloudera Manager.
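
A quick way to verify assumptions 1 and 2 is to open a Beeline session directly against HiveServer2 and run a simple statement; the host, port, and database below are placeholders for your environment:

CODE
!connect jdbc:hive2://<hiveserver2-host>:10000/default
show tables;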

Hortonworks 2.1

  1. Configuration

    1. If the enable.bs.provider property is set to "true", copy the Bouncy Castle jars (bcpg-jdk15on-150.jar, bcprov-ext-jdk15on-150.jar) to Ambari/java/jre/lib/ext on the HiveServer2 machine. If the property is set to "false" (i.e., you are using the Java keystore instead of Bouncy Castle), no action is needed.

    2. Log in to Ambari and do the following:

      1. Disable hive.security.authorization.enabled.

      2. Under Custom hive-site.xml, click on the Add Property button and add the hive.aux.jars.path property with the value of your DgDecrypter.jar's location.

      3. Under Custom hive-site.xml, click on the Add Property button and add the property

        hive.security.authorization.sqlstd.confwhitelist.append with the value

        dg\..*|hive\.vectorized\.execution\.enabled|hive\.exec\.pre\.hooks

    3. Save your configuration changes at the top of this page in Ambari.

    4. Restart all Hive services from Ambari.

    5. Using the Beeline client, connect to HiveServer2 and execute the following:

      1. Show functions; (Make sure “dgdecrypter” function is listed)

      2. Add jar <JarPath/DgDecrypter.jar>; (If this command fails, try running the select query straightaway)

      3. Set hive.exec.pre.hooks = com.dataguise.decrypterlib.hive.udf.DgHivePreHook;

      4. Create Function DgDecrypter AS 'com.dataguise.decrypterlib.hive.udf.DgDecrypter';

      5. Use Default;

      6. Select DgDecrypter(<ColumnName>) from <TableName>;

      7. Sample queries

        CODE
        Create Table Decrypted_Table_1 
        Row Format Delimited
        Fields Terminated By ','
        Stored As Textfile Location '/Results-Decryption'
        As Select DgDecrypter(CCNDigits),DgDecrypter(CCNSpace),DgDecrypter(CCNDash),
        DgDecrypter(SSNDigits),DgDecrypter(SSNSpace),DgDecrypter(SSNDash),DgDecrypter(TelephoneDigits),
        DgDecrypter(TelephoneSpace),DgDecrypter(TelephoneDash),DgDecrypter(TelephoneStd),
        DgDecrypter(TelephoneStdWithoutSpace),DgDecrypter(Email),DgDecrypter(Names),
        DgDecrypter(NPI),DgDecrypter(URL),DgDecrypter(AddressLine1),DgDecrypter(City),
        DgDecrypter(State),DgDecrypter(Zip),DgDecrypter(IPAddress),DgDecrypter(CM11),
        DgDecrypter(CM13),DgDecrypter(DataTime),DgDecrypter(SwipeData),DgDecrypter(InternationalTelephoneNumbers),
        DgDecrypter(BankNumber)  From Temp;
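
If the query above returns cipher text instead of clear text, the pre-hook or whitelist settings may not have taken effect. Hive prints the current value of a property when set is issued with no value, so the settings can be checked from the same Beeline session:

CODE
set hive.exec.pre.hooks;
set hive.security.authorization.sqlstd.confwhitelist.append;
set hive.aux.jars.path;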

CDH 5.4

  1. Configuration

    1. Copy the Bouncy Castle jars to the Cloudera Manager java/jre/lib/ext path used by Hadoop/HiveServer2.

    2. Restart all Hive-related services from Cloudera Manager.

    3. From Beeline, run the following commands sequentially:

      1. add jar /digvijay/beeline_otf/DgDecrypter.jar;

      2. set hive.security.authorization.sqlstd.confwhitelist.append=dg\..*|hive\.vectorized\.execution\.enabled|hive\.exec\.pre\.hooks;

      3. Set hive.exec.pre.hooks=com.dataguise.decrypterlib.hive.udf.DgHivePreHook;

        Note: Steps 2 and 3 above can also be configured from Cloudera Manager via hive-env.sh and hive-site.xml.

        In Cloudera Manager hive-env.sh:

        Add HIVE_AUX_JARS_PATH=/path_to_jar/

    4. In Cloudera Manager hive-site.xml, click Add and add the same two properties (hive.security.authorization.sqlstd.confwhitelist.append and hive.exec.pre.hooks).

  2. Sample Query
    Create Temporary Function DgDecrypter AS 'com.dataguise.decrypterlib.hive.udf.DgDecrypter';

    Note: this command creates the params.properties file under the /dataguise$/tmp location. If that file is not created, the pre-hook did not execute successfully; check hiveserver2.log for errors.


    SELECT CCNOdigits,DgDecrypter(CCNOdigits) from beeline_dig12;
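
To confirm that DgDecrypter.jar is actually on the session path before creating the function, Hive's list jars command shows the jars added to the current session (the path from the add jar step above is an example):

CODE
add jar /digvijay/beeline_otf/DgDecrypter.jar;
list jars;
-- DgDecrypter.jar should appear in the output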

Pivotal 3.0

  1. Configuration

    1. Copy the Bouncy Castle jars to the Ambari java/jre/lib/ext path used by Hadoop/HiveServer2. Typically, this path is found at /usr/phd/3.x/hive/lib (for Pivotal 3.0) on the HiveServer2 box.

      *Note: Make sure to remove the previous Bouncy Castle and DgDecrypter.jar files from all cluster nodes.

    2. Restart all Hive-related services from Ambari.

    3. Run these commands sequentially:

      1. add jar /digvijay/beeline_otf/DgDecrypter.jar;

      2. set hive.security.authorization.sqlstd.confwhitelist.append=dg\..*|hive\.vectorized\.execution\.enabled|hive\.exec\.pre\.hooks;

      3. Set hive.exec.pre.hooks=com.dataguise.decrypterlib.hive.udf.DgHivePreHook;

  2. Sample Query
    Create Temporary Function DgDecrypter AS 'com.dataguise.decrypterlib.hive.udf.DgDecrypter';

    Note: this command creates the params.properties file under the /dataguise$/tmp location. If that file is not created, the pre-hook did not execute successfully; check hiveserver2.log for errors.

    SELECT CCNOdigits,DgDecrypter(CCNOdigits) from beeline_dig12;
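
Note that Create Temporary Function only lasts for the current Beeline session and must be re-run after reconnecting. If a persistent function is preferred, Hive also supports registering it permanently from a jar in HDFS, as shown in the DgDecrypter6 section (the HDFS path below is an example and must point at your deployed DgDecrypter.jar):

CODE
Create Function DgDecrypter AS 'com.dataguise.decrypterlib.hive.udf.DgDecrypter'
  using jar 'hdfs:///tmp/dataguise$/DgDecrypter.jar';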
