Hive IDP

The Hive IDP has two configurable files. The first is the IDP’s configuration file; the second is the IDP’s web service properties file. After making any changes to either file, restart the Jetty web service.

/etc/init.d/HiveIDP start/stop
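
The restart can also be scripted. The following is a minimal Python sketch, assuming the init script shown above accepts `stop` and `start` as separate arguments and that the caller has permission to run it. The function names `restart_commands` and `restart` are hypothetical helpers, not part of the product.

```python
import subprocess

# Init script path taken from the text above.
INIT_SCRIPT = "/etc/init.d/HiveIDP"


def restart_commands(script=INIT_SCRIPT):
    """Return the two commands for a restart: stop first, then start."""
    return [[script, "stop"], [script, "start"]]


def restart(run=subprocess.run, script=INIT_SCRIPT):
    """Run stop and then start; raise if either command fails.

    `run` is injectable so the sequence can be checked without
    touching the real service.
    """
    for command in restart_commands(script):
        run(command, check=True)
```

Calling `restart()` after editing either file runs both commands in order and stops at the first failure.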

File Path: {Installation Path}/DgSecure/IDPs/HiveIDP/

Important Hive parameters can be set in the configuration file. The following properties are defined in the file and may need to be configured:

  1. Listening port: Designates the port on which PK Protect should listen for Hive. Default port is 9980.

  2. Job status: Specifies which users have access to Hadoop job history.

  3. Keytab file location: This property is only used with Kerberos. It points to the location of the keytab file on your local system.

  4. Kerberos binaries location: If kinit or klist are not available on system PATH, use this property to specify the path to Kerberos bin folder that includes the kinit and klist executables.

  5. Hive IDP directory: Absolute path of the directory in HDFS to be used by the Hive IDP as a data store. The user running the job should have write access to the designated path.

  6. UDF jar path: This property lists the absolute path of the directory containing the UDF jars on the Hive server machine. These jars must be accessible by Hive.

  7. Jars required on classpath: Comma separated list of jars required on classpath.

  8. JDBC URL: The URL for Hive’s JDBC connection.

  9. Complex type discovery: Enables complex type discovery in Hive. This flag can be set to “true” or “false”. When set to “true”, complex type discovery is enabled for Hive tasks.

  10. Map reduce debug logs: Enables debug logging for map reduce. This flag can be set to “true” or “false”. When set to “true”, map reduce debug logging is enabled for Hive tasks.

  11. Key management
    dg.keystore.pass: keystore password
    dg.keystore.alias: alternate name for keystore
    dg.keystore.abspath: The path of the keystore file, as defined in the code.
    dg.keystore.type: The type of the keystore file, as defined in the code.

  12. Sentry/Ranger enabled: To run a PK Protect Hive task on a cluster protected by Sentry or Ranger, this property must be set to “true”. The property is set to “false” by default.

  13. Atlas parameters:
    atlas.url: The Atlas URL.
    Name of Atlas cluster: If unknown, the name can be found with the following command:
    curl --user <ambariUsername>:<ambariPassword> ...Address>:<Port>/<cluster path>

    dg.access.control: Controls whether Atlas processes Hive discovery results. When set to “NONE”, Atlas does not process the results. When set to “Atlas”, Atlas processes the results. The default value is “NONE”.

  14. DGUSER: This property is used to set up Hive impersonation. To enable it, set its value to True and restart the agent. By default, the value of this property is False.

    If this property is enabled, set dg.ugi to HDFS and, in a Kerberos environment, set dg.ugi.keytab to the HDFS principal. Also, set the absolute path of the HDFS keytab file in the Hive agent’s files.

  15. Configuration directory paths: dg.hdp.conf.path: Sets the path for the directory where yarn-site.xml, mapred-site.xml, hdfs-site.xml and core-site.xml are located.
    dg.hive.conf.path: Sets the path for the directory where hive-site.xml is located.

  16. Print sensitive data: This group of properties controls how sensitive data is written to the results files. The results files are written to the following path:
    {Installation Path}<hive_IDPfolder>/detailedResults/<taskName>/<taskInstanceId>/<tableName>/<columnName>

The number of results files for a column is directly proportional to the number of mappers run for the task. For instance, if 10 mappers run for a task, then for each column, 10 timestamp files will be created.
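
The path layout and the file count described above can be sketched in Python. This is only an illustration, assuming one results file per mapper per column as in the 10-mapper example. The helper names and the sample folder, task, table, and column values are hypothetical.

```python
from pathlib import PurePosixPath


def results_dir(idp_folder, task_name, task_instance_id, table_name, column_name):
    """Build <idp_folder>/detailedResults/<task>/<instance>/<table>/<column>."""
    return (
        PurePosixPath(idp_folder)
        / "detailedResults"
        / task_name
        / str(task_instance_id)
        / table_name
        / column_name
    )


def expected_results_files(mappers, columns=1):
    """Assumed: each mapper writes one results file for each column."""
    if mappers < 0 or columns < 0:
        raise ValueError("mappers and columns must be non-negative")
    return mappers * columns


# Hypothetical values, for illustration only.
print(results_dir("/data/HiveIDP", "scanTask", 7, "customers", "ssn"))
print(expected_results_files(10))
```

With 10 mappers, `expected_results_files(10)` gives 10 files for a single column, which matches the example in the text.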

  1. showSensitiveData: Controls whether sensitive data is displayed in the results file. Can be set to false or true.
    Example: showSensitiveData = false

  2. percentSensitiveData: When showSensitiveData is set to “true”, this property controls the percentage of sensitive data to display.

    For primitive column types, setting this property to 10 means that 10% of the rows in any column with sensitive data are displayed (every 10th row). For complex column types, only rows with sensitive data appear in the printout.

    Example: percentSensitiveData = 10

  3. sensitiveChunkSize: Determines the size of the chunks in the results file. By default, this property is set to 50.
    Example: sensitiveChunkSize = 50
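
Before restarting the service, the three properties above can be checked with a short script. The following Python sketch assumes the file uses simple `key = value` lines, as in the examples, and that percentSensitiveData should fall between 0 and 100. Both are assumptions, and `parse_properties` and `sensitive_data_settings` are hypothetical helpers. Only sensitiveChunkSize has a documented default (50), so the other two are returned as None when absent.

```python
def parse_properties(text):
    """Parse 'key = value' lines; skip blank lines and '#' or '!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def sensitive_data_settings(props):
    """Validate the three sensitive data properties and return typed values."""
    show = props.get("showSensitiveData")
    if show is not None:
        if show.lower() not in ("true", "false"):
            raise ValueError("showSensitiveData must be true or false")
        show = show.lower() == "true"

    percent = props.get("percentSensitiveData")
    if percent is not None:
        percent = int(percent)
        if not 0 <= percent <= 100:  # assumed valid range
            raise ValueError("percentSensitiveData must be between 0 and 100")

    # Default of 50 is stated in the text above.
    chunk = int(props.get("sensitiveChunkSize", "50"))
    if chunk <= 0:
        raise ValueError("sensitiveChunkSize must be positive")

    return {
        "showSensitiveData": show,
        "percentSensitiveData": percent,
        "sensitiveChunkSize": chunk,
    }
```

For example, `sensitive_data_settings(parse_properties("showSensitiveData = true\npercentSensitiveData = 10"))` returns the typed values with the chunk size falling back to 50.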

Use the web service properties file to change the port number.
File path: /opt/Dataguise/DgSecure/IDPs/HiveIDP/
