Consistent Masking

This section covers consistent masking. In HDFS, consistent masking uses either Couchbase or HBase. Although it shares some details with the previous IDP-specific sections, this section provides more detail on installing both Couchbase and HBase.

Set Up Couchbase on Linux

If you want duplicate instances of the same data in the same column masked identically, you must use Hadoop masking domains to apply consistent masking. If you are not using consistent masking, you do not need to set up Couchbase (or its alternative, HBase).

To install Couchbase

  1. Log in to the server on which you want to install Couchbase using an administrator account with root (Superuser) privileges.

  2. Enter the following command:
    rpm --install couchbase-server <version>.rpm

    Where: <version> is the version number of the downloaded package. Once the rpm command is executed, the Couchbase server starts automatically. For more information, see the Couchbase documentation.

To configure PK Protect to connect to Couchbase:

  1. Log in to the server on which you installed the DSM Administrator using an administrator account with root (Superuser) privileges.

  2. In a separate window, navigate to the DgSecure folder. The default root path is:
    /opt/Dataguise/DgSecure

  3. Navigate to the HDFSIDPConfig file, located in the util folder found at this path:
    <rootpath>/tomcat9/webapps/HDFSIDP_Build/WEB-INF/classes/com/dataguise/hadoop/util/

  4. Edit the HDFSIDPConfig.properties file and modify the following properties:

    1. For a single-node Couchbase server, define the cmservers property as:
      cmservers=<ip_address1/hostname1>:<8091>

    2. For a multi-node Couchbase server, define the cmservers property as:
      cmservers=<ip_address/hostname>:<port>, <ip_address2/hostname2>:<port>

      Where: <ip_address/hostname> is the IP address (or hostname) of a Couchbase host machine.

  5. Save your changes and exit.
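The cmservers format above can be sketched as a small shell helper. This is only an illustration: build_cmservers is a hypothetical function (not part of PK Protect), the example.com hostnames are placeholders, and the ", " separator simply mirrors the multi-node example above.

```shell
# Hypothetical helper: build a cmservers line from host:port arguments.
# The ", " separator follows the multi-node example in this section.
build_cmservers() {
  value=""
  for node in "$@"; do
    case "$node" in
      *:*) ;;  # a port was supplied
      *) echo "missing port in '$node'" >&2; return 1 ;;
    esac
    # Append the node, adding the separator only after the first entry.
    value="${value:+$value, }$node"
  done
  printf 'cmservers=%s\n' "$value"
}

# Placeholder hostnames; 8091 is the port shown in the single-node example.
build_cmservers cb1.example.com:8091 cb2.example.com:8091
```

The printed line can then be pasted into the properties file in place of the existing cmservers entry.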

Set Up HBase on Linux

*Note: If you want duplicate instances of the same data in the same column masked identically, you must use Hadoop masking domains to apply consistent masking. If you are not using consistent masking, you do not need to set up HBase (or its alternative, Couchbase).

To install HBase

  1. Log in to the server on which you want to install HBase using an administrator account with root (Superuser) privileges.

  2. Enter the following command:
    rpm --install HBase-server <version>.rpm
    Where: <version> is the version number of the downloaded package.

Once the rpm command is executed, the HBase server starts automatically. For more information, see the HBase documentation.

To configure PK Protect to connect to HBase

  1. On the PK Protect side:

    1. Log in to the server on which you installed the DSM Administrator using an administrator account with root (Superuser) privileges.
      In a separate window, navigate to the DgSecure folder. The default root path is:
      /opt/Dataguise/DgSecure

    2. Navigate to the HDFSIDPConfig file, located in the util folder at this path:
      <rootpath>/tomcat9/webapps/HDFSIDP_Build/WEB-INF/classes/com/dataguise/hadoop/util/

    3. Edit the HDFSIDPConfig.properties file and modify the following properties:

      1. For an HBase server, define the following properties:
        msengine=hbase
        hbase.zk.quorum=<IP address of ZK quorum>
        hbase.zk.property.clientPort=<Port address of ZK>

  2. On the HBase side:

    1. SSH into one of the cluster machines.

    2. Run the command hbase shell. This opens the HBase prompt.

    3. At the HBase prompt, run list. This lists all the tables in HBase.

      1. If dg_cm does not exist, create the dg_cm table with the column family msk using the command:
        create 'dg_cm','msk'

      2. If dg_cm already exists, you might want to remove all of its data (this depends on the type of test you want to run) using the command:
        truncate 'dg_cm'

    4. Add the coprocessor by altering the table:

      1. Run:
        disable 'dg_cm'

      2. Run:
        alter 'dg_cm', METHOD => 'table_att', 'Coprocessor' => 'hdfs:///<Abs path to coprocessor jar>|com.dataguise.coprocessor.ConsistentMaskingCoprocessor|1001|family=msk,col=val,hide_addresses=y'

      3. Run describe 'dg_cm' to ensure that the table was created and altered correctly.

      4. Run enable 'dg_cm'

*Note: Use the appropriate filesystem scheme for the source of the coprocessor jar and the distribution. For example, use maprfs:/// for MapR or s3:/// for S3.
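The coprocessor steps above can be collected into a reviewable script. The following is a minimal sketch: dg_cm_coprocessor_commands is a hypothetical helper (not shipped with the product) that only prints the HBase shell commands, and the jar URI in the example is a placeholder that must be replaced with the real absolute path and scheme for your distribution.

```shell
# Hypothetical helper: print the HBase shell commands that attach the
# consistent masking coprocessor to the dg_cm table.
# $1 is the full URI of the coprocessor jar, including the filesystem scheme.
dg_cm_coprocessor_commands() {
  if [ -z "$1" ]; then
    echo "usage: dg_cm_coprocessor_commands <jar URI>" >&2
    return 1
  fi
  jar_uri="$1"
  cat <<EOF
disable 'dg_cm'
alter 'dg_cm', METHOD => 'table_att', 'Coprocessor' => '${jar_uri}|com.dataguise.coprocessor.ConsistentMaskingCoprocessor|1001|family=msk,col=val,hide_addresses=y'
describe 'dg_cm'
enable 'dg_cm'
EOF
}

# Placeholder URI, for illustration only; review the output before running it.
dg_cm_coprocessor_commands "hdfs:///path/to/coprocessor.jar"
```

After checking the output, the same commands can be piped into a non-interactive HBase shell on a cluster machine, for example by appending | hbase shell -n to the call. Printing the commands first keeps the step auditable and makes it easy to swap the scheme (maprfs:///, s3:///) without retyping the alter statement.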
