To create a task for a MongoDB database, go to NoSQL > MongoDB > Tasks. This displays the Task Definitions screen. To create a new task, click the Add New Task Definition tab. The following screenshot depicts the user interface for creating a task:
Enter a unique task name in the Task Name field. This field supports numeric and character values.
Enter a Task Description of up to 254 characters. This field supports numeric and character values.
Select the attribute name from the Task Attribute drop-down. This option lets you add tags to the created task.
Enter a numeric value for Duplicate Damping Factor. It specifies the maximum hit count for any given sensitive type.
*Note: The Task Attribute field is not applicable to NoSQL tasks.
PK Protect supports data sampling, which limits the scan area and reduces the time taken for detection. By default, two options are available for sampling data from the documents:
Random 1000 rows/documents - Samples approximately 1000 rows from the data at random.
Read random 5% of data - Samples 5 percent of the data at random.
Select the Sampling Configuration from the drop-down.
By default, the Random 1000 rows/documents option is selected. To define a sampling configuration, go to the NoSQL > MongoDB > Tasks > Sampling Configuration tab and click the +Add button. To create a sampling configuration from the Add New Task Definition screen, click the +Add button next to the Sampling Configuration drop-down.
The Sampling Configuration screen is depicted below:
Enter the name and description of the sampling configuration in the Name and Description text boxes.
Select the Set Sampling Config as Default option to make this sampling configuration the default for all your tasks.
By: Specifies how data is picked from the database for sampling. There are two options:
Rows/Documents: Select Rows/Documents from the drop-down to sample data based on the number of rows/documents.
Percent: Select Percent from the drop-down to sample a percentage of the data.
Value: Enter a numeric value. It specifies the total number of records to process when By is set to Rows/Documents, and the percentage of data to sample when By is set to Percent.
Type: Select the sampling configuration type from the Type option.
Random: Select Random to scan random entries in the database. The number of entries scanned is determined by the value entered in the Value field.
Complete: Select Complete to use the complete data for sampling.
After setting up the required configuration, click Add to add the user-defined sampling configuration to the list. Click the Save button to save the changes.
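The way the By, Value, and Type fields combine can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not PK Protect's implementation: the function names `sample_size` and `sample_documents` are hypothetical, and the rounding and capping rules are guesses about behavior the product describes as approximate.

```python
import random


def sample_size(total, by, value, sampling_type="Random"):
    """Return how many records a configuration would select.

    Hypothetical helper: `by` is "Rows/Documents" or "Percent",
    `sampling_type` is "Random" or "Complete".
    """
    if sampling_type == "Complete":
        # Complete uses all of the data, regardless of By and Value.
        return total
    if by == "Rows/Documents":
        # Value is a record count; it cannot exceed what is available.
        return min(int(value), total)
    if by == "Percent":
        # Value is a percentage of the available records (assumed rounding).
        return min(total, round(total * value / 100))
    raise ValueError(f"unknown By option: {by!r}")


def sample_documents(documents, by, value, sampling_type="Random", seed=None):
    """Pick documents according to a sampling configuration (illustrative)."""
    documents = list(documents)
    if sampling_type == "Complete":
        return documents
    count = sample_size(len(documents), by, value, sampling_type)
    # random.sample draws `count` distinct items without replacement.
    return random.Random(seed).sample(documents, count)
```

Under these assumptions, the two built-in options correspond to `sample_size(total, "Rows/Documents", 1000)` and `sample_size(total, "Percent", 5)`.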
The Select Connections panel lists all the available connections. Any number of connections can be selected for a task. You can also create a new connection by clicking the +Add New Connection button. To learn how to create and manage connections, refer to the Connection Manager section.
To select a connection, select the checkbox next to the connection name. The list of connections can be grouped by the value chosen in the Select Group drop-down. There are five options by which connections can be grouped.
Connection IDP: Categorizes the available connections based on the types of IDPs available, i.e., Detection and Masking.
Connection Type: Categorizes the available connections based on the type of server connected to, e.g., Oracle, Teradata, SQL Server.
Host Name: Categorizes the list of available connections based on Host Names.
Location: Categorizes the available connections based on the location of the target source system server, i.e., On-Premises and Cloud.
User Name: Categorizes the list of available connections based on the Usernames.
The Select Group Value drop-down displays the values that correspond to the selection made in the Select Group drop-down. The panel is populated according to the selection made in the Select Group drop-down.
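As a rough illustration of the grouping described above, the following Python sketch buckets connection records by the attribute chosen in Select Group. The record fields, the sample connections, and the `group_connections` helper are all hypothetical and are not part of PK Protect.

```python
from collections import defaultdict

# Hypothetical mapping from the Select Group options to record fields.
GROUP_FIELDS = {
    "Connection IDP": "idp",
    "Connection Type": "type",
    "Host Name": "host",
    "Location": "location",
    "User Name": "user",
}

# Illustrative connection records; names and values are made up.
connections = [
    {"name": "mongo-prod", "idp": "Detection", "type": "MongoDB",
     "host": "db1.example.com", "location": "Cloud", "user": "scanner"},
    {"name": "mongo-dev", "idp": "Masking", "type": "MongoDB",
     "host": "db2.example.com", "location": "On-Premises", "user": "scanner"},
    {"name": "oracle-hr", "idp": "Detection", "type": "Oracle",
     "host": "db3.example.com", "location": "On-Premises", "user": "hr_admin"},
]


def group_connections(records, select_group):
    """Return {group value: [connection names]} for the chosen group."""
    field = GROUP_FIELDS[select_group]
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["name"])
    return dict(groups)
```

In this sketch, the keys of the returned dictionary play the role of the entries in the Select Group Value drop-down, and each list holds the connections shown for that value.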
Select the checkbox next to the connection you want to use. The Test and Database Object Filters options are enabled only after a connection is selected.
The Test button lets you test the connection before executing the task. A pop-up appears when the test completes successfully.
Click the Database Object Filter to filter tables and/or columns. Once filters are defined, only the databases/tables/columns that match the filters are scanned.
Select the connection from the Connection List or search for the DB/schema name in the Filter by Schema/DB name text box. This panel displays the list of all available connections. To refresh this section and update the information, click the Refresh button next to the text box.
Apply the filter in the right section of the panel by specifying the Operator, Collection Operator, and Collection Filter name. Click the + (add) sign next to the collection filter drop-down to add the filter to the Selected Filters list.
The Selected Filters list displays all the filters that have been added. To edit a filter, click the Pen icon in the Edit column. To delete a filter, select the checkbox next to the Edit column and click the Trash button.
There are eight operators that you can use to select collections by name.
Equals: Checks whether the given collection name exists in the selected connection and returns the matching records if it does.
Not Equal to: Returns all records except the given collection name.
Contains: Returns only those collections whose names contain the given value.
Does not contain: Like Not Equal to, this is a negating operator. It returns all collections whose names do not contain the given value.
Starts with: Returns all collections whose names start with the given value.
Does not start with: Like Does not contain and Not Equal to, this is a negating operator. It returns all collections whose names do not start with the given value.
Ends with: Returns all collections whose names end with the given value.
Does not end with: Like Does not contain and Not Equal to, this is a negating operator. It returns all collections whose names do not end with the given value.
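The eight operators above can be summarized as simple string predicates. The Python sketch below is an illustration only: the `OPERATORS` table, the `filter_collections` helper, and the sample collection names are hypothetical, and it assumes case-sensitive matching, which may differ from how PK Protect evaluates filters.

```python
# Each operator maps to a predicate over (collection name, filter value).
OPERATORS = {
    "Equals": lambda name, value: name == value,
    "Not Equal to": lambda name, value: name != value,
    "Contains": lambda name, value: value in name,
    "Does not contain": lambda name, value: value not in name,
    "Starts with": lambda name, value: name.startswith(value),
    "Does not start with": lambda name, value: not name.startswith(value),
    "Ends with": lambda name, value: name.endswith(value),
    "Does not end with": lambda name, value: not name.endswith(value),
}


def filter_collections(names, operator, value):
    """Return the collection names that satisfy the chosen operator."""
    predicate = OPERATORS[operator]
    return [name for name in names if predicate(name, value)]


# Made-up collection names used to try the operators.
collection_names = ["customers", "customer_orders", "orders", "audit_log"]
```

For example, `filter_collections(collection_names, "Starts with", "customer")` keeps `customers` and `customer_orders`, while the negated form `Does not start with` keeps the other two names.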
To test a filter, select the checkbox for the filter next to the Edit column, and then click the Test button in the Filtered Data section. The Filtered Data section displays the result of the applied filter.
Click the Save button to apply the changes; otherwise, click Cancel.
The Select Policy panel displays all the Pre-Defined and Customized Policies. You can select any number of policies while creating or editing a task. Sensitive types associated with the selected policy can be viewed in the Select Sensitive Data Types panel. Selecting a policy is not mandatory; you can instead proceed to select individual sensitive types. You can also add a new policy by clicking the +Add Policy button. To know more, refer to the Policy section.
The Select Sensitive Data Types panel displays the list of all Pre-Defined and Custom Sensitive Types. The sensitive types associated with the selected policy are selected automatically in the Pre-Defined and Custom Sensitive panel and cannot be removed once selected; however, any number of additional sensitive types can be added to the panel. You can also add a new sensitive type by clicking the +Add New Sensitive Data Type button. To know more, refer to the Sensitive Types section.
Click the Save button to save the task. To execute the task immediately, click the Save And Execute button.
To edit an existing task, select the required task from the Task panel on the Task Definitions screen, and then click the Pen icon in the Action column.