Appendix A: Verifying Hadoop Results

Verify that Hadoop tasks are finding targeted sensitive data by comparing the Task Results file or Detailed Results file with the file on which the task was run.

Task Results files can be found at: hadoop fs -cat /dataguise$/results/(taskname)/(task_instance_ID)

Detailed Results files can be found at: hadoop fs -cat /dataguise$/results/(taskname)/(task_instance_ID)/summary_results_structured/
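The two locations above can be assembled with a small shell helper. This is a minimal sketch, not part of the product: the function names `results_dir`, `show_task_results`, and `show_detailed_results` are hypothetical, and `(taskname)` and `(task_instance_ID)` are treated as placeholders to be replaced with real values. The path prefix is single-quoted so that the `$` in `/dataguise$` is always passed to Hadoop literally.

```shell
#!/bin/sh
# Hypothetical helpers for locating PK Protect task results in HDFS.
# $1 = task name, $2 = task instance ID (both supplied by the caller).

# Print the results path for a task instance.
# The single quotes keep the "$" in "/dataguise$" literal.
results_dir() {
    printf '%s/%s/%s\n' '/dataguise$/results' "$1" "$2"
}

# Print the Task Results for a task instance, using the command given above.
show_task_results() {
    hadoop fs -cat "$(results_dir "$1" "$2")"
}

# Print the Detailed Results for a task instance, using the command given above.
show_detailed_results() {
    hadoop fs -cat "$(results_dir "$1" "$2")/summary_results_structured/"
}
```

For example, `show_task_results mytask 12345` would run `hadoop fs -cat` against `/dataguise$/results/mytask/12345`, where `mytask` and `12345` are made-up values. If a path turns out to be a directory rather than a single file, `hadoop fs -ls` on the same path lists its contents.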

The specific contents of the Task Results file differ slightly according to the file type and whether the file is treated as structured or unstructured.

Task Results files show the type and location of each discovered sensitive element. However, the actual value of a discovered sensitive data element can be suppressed by setting the results.suppressRealValues property in the HDFSIDPConfig.properties file. For instance, if PK Protect discovers the email address abc@xyz.com when this property is turned on, abc@xyz.com will not appear in the results file. For more information about HDFS IDP in a Linux environment, refer to HDFS IDP.
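Before comparing results with the source file, it can help to check how results.suppressRealValues is currently set. The sketch below is an illustration under assumptions: `get_property` is a hypothetical helper, the location of HDFSIDPConfig.properties is not stated here and must be supplied by the caller, and the exact value that turns the property on (for example `true`) is assumed rather than confirmed.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key from a Java-style
# .properties file. $1 = key, $2 = path to the properties file.
# Commented-out lines are ignored; if the key appears more than once,
# the last definition is printed. Prints nothing if the key is absent.
get_property() {
    awk -F= -v k="$1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)           # trim the key
            if (key == k) {
                val = substr($0, index($0, "=") + 1)   # text after the first "="
                gsub(/^[ \t]+|[ \t]+$/, "", val)       # trim the value
                v = val
            }
        }
        END { if (v != "") print v }
    ' "$2"
}
```

A call such as `get_property results.suppressRealValues /path/to/HDFSIDPConfig.properties` prints the configured value, where `/path/to` stands in for the real installation directory.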
