Configure PK Protect for Azure Data Lake Storage

For Azure Data Lake Storage (ADLS) scanning, PK Protect requires the Cloud IDP to be installed and associated with the HDInsight HDFS IDP.

Change the ControllerUrl from its default value, “localhost”, to the IP address of the controller. One file requires configuration changes. It is at the following location:

File path: /{installation path}/Dataguise/DgSecure/IDPs/CloudIDP/

Filename: azure-credfile.properties

CODE
--------------------
# Azure Subscription ID (e.g b5*****e-5**6-4**e-b**1-678******957)
subscription=
# The Application ID (e.g b6****fc-8**a-4**c-9**e-aeb*******d2)
client=
# Azure Tenant ID
tenant=
managementURI=https\://management.core.windows.net/
baseURL=https\://management.azure.com/
authURL=https\://login.windows.net/
graphURL=https\://graph.windows.net/

##########For using certificate for authentication##########
# The path of the certificate file.
certificate=
# The certificate password.
# Uncomment if certificate needs a password. This is optional when using a .pem certificate.
#certificatePassword=
############################################################


##########For using key for authentication##########
# The secret key associated with the application. Not required when using a certificate.
# key=
####################################################
--------------------

Take the subscription, client, and tenant values from the Microsoft Azure Portal.

The certificate and the certificate password or key are the other two details that need to be edited. These properties are cluster- or application-specific; an administrator or your IT team will provide them.

If the certificate needs a password, uncomment “certificatePassword=” and add the password. This is optional when using a .pem certificate.
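Before restarting anything, it can help to confirm that the file is filled in. The following is a minimal sketch, not part of PK Protect: the helper names `load_properties` and `check_credfile` are hypothetical, and the rule "either certificate or key must be set" is an assumption drawn from the comments in the file above.

```python
REQUIRED = ("subscription", "client", "tenant")


def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            # Properties files escape ':' as '\:'; undo that for readability.
            props[key.strip()] = value.strip().replace("\\:", ":")
    return props


def check_credfile(props):
    """Return a list of problems; an empty list means the basics are present."""
    problems = [f"{name} is empty" for name in REQUIRED if not props.get(name)]
    # Assumption: one of the two authentication options must be filled in.
    if not props.get("certificate") and not props.get("key"):
        problems.append("set either certificate or key")
    return problems
```

To use it, read the contents of azure-credfile.properties into a string, pass the string to `load_properties`, and pass the result to `check_credfile`.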

Once these changes are complete, change the permissions of “azure-credfile.properties” to 400.
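Mode 400 means the file owner can read the file and nobody else has any access. As a rough sketch, the same change can be applied and verified from Python on a Linux host. The function name `restrict_to_owner_read` is hypothetical, and the path you pass in is your own installation path.

```python
import os
import stat


def restrict_to_owner_read(path):
    """Set mode 400 (owner read-only) and return the resulting permission bits."""
    os.chmod(path, stat.S_IRUSR)  # stat.S_IRUSR is 0o400
    return stat.S_IMODE(os.stat(path).st_mode)
```

A return value equal to `0o400` indicates the permissions were applied as intended.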

Configure HDInsight IDP:

  1. Verify that the correct HDP version is set in “jetty-embedded.properties” at “/{installation path}/Dataguise/DgSecure/IDPs/HDFSIDP/” for the property “JavaOptions-Dhdp.version=”.

    *Note: The HDP version can be found by running “hadoop fs -ls /hdp/apps”.

  2. Verify that the user and the other properties in the “HDFSIDPConfig.properties” file at “/{installation path}/Dataguise/DgSecure/IDPs/HDFSIDP/expandedArchive/WEB-INF/classes/com/Dataguise/Hadoop/util/” are correct.

  3. After all changes are made, restart the HDFS IDP using /etc/init.d/DgSecureHDFSIDP. Once the IDP has restarted, log in to the Admin page and go to the IDP Management > IDPs > Add IDP screen. There, add the Azure Cloud IDP and the Azure Data IDP.

  4. Then go to the Manage Clusters page and Add Cluster.

  5. Once the cluster has been added, go to the PK Protect page, create the task as desired, and execute it.
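For the HDP version check in step 1, the directory names under /hdp/apps can be pulled out of the listing programmatically. This is only a sketch. It assumes the usual `hadoop fs -ls` layout, in which each entry line ends with the full path, and the function name `hdp_versions` is hypothetical.

```python
def hdp_versions(ls_output):
    """Extract the last path component of each entry under /hdp/apps."""
    versions = []
    for line in ls_output.splitlines():
        fields = line.split()
        # Entry lines end with the full path; header lines such as
        # "Found N items" do not, so they are skipped.
        if fields and fields[-1].startswith("/hdp/apps/"):
            versions.append(fields[-1].rsplit("/", 1)[-1])
    return versions
```

Pass the captured command output to the function as a string, then compare the returned names with the value set for the hdp.version property.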