Skip to main content

Detection - RDBMS

To access the RDBMS Task definition screen, click RDBMS > Detection > Tasks. This displays the Task Definitions screen.

Select RDBMS in the Select Module drop-down.

To create a new task, click the Add New Task Definition tab in the Task Definitions screen. The following screenshot shows the user interface for creating a task:

  1. Enter a unique name for the task in the Task Name field. This field supports numeric and character values.

  2. Enter a description for the task in the Task Description field. This accepts maximum 254 characters. It also supports numeric and character values.

  3. Select the attribute name from the Task Attribute drop-down. This option allows to add tags to the created task.

     

  4. Select the task type from the Select Task Type drop-down. It displays two options i.e., Detection and Metadata Discovery.

     

    1. Detection task type scans the database for the sensitive data selected under the Policy screen. This option is supported by all types of data sources.

    2. Metadata Discovery scans your database to provide information about the type of data available on the database. This option is supported by Oracle, Teradata, and SQL Server databases.

      *Note: Metadata Discovery task type is not supported for RDS and Azure modules.

  5. PK Protect is equipped with data sampling to limit the area of scan which helps in reducing the time taken for detection. By default, there are two options to scan the database are:

    1. Top 1000 Rows – It will sample approximate 1000 rows from the top of the database.

    2. Read top 5% of data – It will sample 5 percent of the data from the top of the database.

      By default, Top 1000 Rows option is selected. To create a sample, go to RDBMS > Tasks > Sampling Configuration tab. Click + Add Configuration button. You can also create a sample by clicking Add button next to the Sampling Configuration drop-down on Add New Task Definition screen.

       

      To create a sample, perform the below steps:

       

      1. Enter the name and description of the sampling configuration in the Name and Description field. This field accepts numerals, characters, and special characters.

      2. Check the option Set Sampling Config as Default to set the Sampling Configuration as the default configuration for all the tasks.

      3. Check the option Show Advance Sampling Details to set the advanced settings for sampling. Below are the options for advanced settings: 

        1. Table row count range: Enter numeric value. This value states the starting range of the table from which the records will be sampled.

        2. To: Enter the numeric value. The value in this field states the ending range of the table till which the records will be sampled.

        3. Type: Select the sampling configuration type from the Type option. There are four options for sampling configuration:

          1. Top: If you select the option Top, the sample data for the scan will be selected from the entries at the top of the table, based on the specified range.

          2. Bottom: If you select the option Bottom, the sample data for the scan will be selected from the entries at the bottom of the table. This does not mean the entries will be selected bottom up, instead depending on the range the last entries in the table will be taken to create a sample data for detection.

          3. Random: This option in the Type field scans random entries in the database. It will scan the number of entries based on the value entered in the Value field.

          4. Complete: If all the entries for the selected tables of the database must be scanned for sensitive types, select this option.

          5. By: To Specify how to pick data for sampling from the table, there are two ways:

            1. By Rows: Select this option to sample data based on the number of rows.

            2. By Percent: Select this option to sample data based on a percentage of the data. 

          6. Value: Enter the numeric value. It will specify the total number of records to be processed if sampling By Rows is selected and denotes the percentage if sampling By Percent is selected.

        4. After setting up the required configuration, click Add button to add the user-defined sampling configuration to the list. Click Save button to make the changes effective, else click Cancel.

  6. The + Show Advanced Options display the advance setting of the task. To view the setting, click this button.

     

    On clicking Show Advanced Options, below options are displayed:

     

    1. The Thread Count specifies the number of threads to launch while executing the detection task.

    2. Select the connection limit from the Select Connection Limit drop-down. This enables the user to define whether to use single JDBC connection to scan all tables in a database or each table should be using its individual connection. This will display two options: 

      1. Single Connection: If the user selects this option, then during the execution of DBMS detection task only one single connection to the database will be made.

      2. Connection Per Thread: If the user selects this option, then there will be as many database connections as there will be task threads during the task execution i.e., one database connection will be made per table scan thread.

    3. Check the Search Views option to detect the viewed tables within the database as well as the tables linked with them.

    4. Check Exit on First Hit option to stop the scanning, when the first sensitive record in a database is detected during the detection process.

    5. Check Include Table Size checkbox to view the size of each scanned table in the Data Scanned tab under Task Instances tab of the Result screen. The information is displayed in Total Data and Sampled Data column.

    6. Check Sync Results with Privacy checkbox to push detection results on the Privacy IDP.

    7. The Incremental option is used to execute detection only on the new entries of the database i.e., after the last scan was executed. This option significantly decreases the time taken to scan the database. This option is available for the Detection task type only. To setup Incremental detection, perform the following steps: 

      1. Check the Incremental checkbox. 

      2. Select the type of incremental scanning that must be considered, i.e., addition of New Partitions, Columns/Tables to the database, or By Modified Date.

         

        1. New Partitions: Check this checkbox, if you want to scan the new partitions added in the table after the last scan was executed.

        2. New Columns/Tables: Check this checkbox, if you want to scan the new columns/tables added in the database after the last scan was executed.

        3. By Modified Date: Check this checkbox, if you want to scan only those files in the database that have its modified date later than the time when the last scan was executed.

          *Note: Following Database are supported with these incremental options:

          1. New Partitions: DB2

          2.  New Columns/Tables: DB2, Teradata, Oracle, SQL Server

          3. By Modified Date: SAS

    8. Check Run distinct query to get unique count checkbox to execute the DISTINCT queries get unique count from the system tables.

    9. The Row-level Dependent Type Detection option enables you to perform dependent type detection at row-level. For instance, you have a table as follows:

       

      Let’s consider a custom sensitive type CCNO_IF_EMAIL_ADDRESS_EXISTS. In this case, the 2nd row has a value that meets the CCNO type requirement, but the same row doesn’t have the value that meets the Email Address type requirement in the 2nd column. Thus, the 1st column named CCNO will not be declared as CCNO_IF_EMAIL_ADDRESS_EXISTS type.

    10. Check the Detect Topics option to find and report the topic of each table which has been detected during the process. This option is available for the Metadata Discovery task type. On checking the checkbox, enter the Precision and Max Columns value. 

       

      1. Precision: This option lets you specify the precision of the scanning process. You can enter multiple values in this field separated by comma.

      2. Max Columns: This option allows you to specify the number of columns which will be scanned.

  7. The Select Connections panel lists down all the available connections. Any number of connections can be selected for a scanning. To create a new connection for any data source, click the + Add New Connection button. To know more on how to create and manage connections, refer Connection Manager. Perform the below steps for selecting the connections:

     

    1. Click the Select Group drop-down and select the option to sort the available connections. It has five options: 

       

      1. Connection IDP: Categorizes the available connections based on the types of IDPs available, i.e., Detection and Masking.

      2. Connection Type: Categorizes the available connections based on the type of server connected to, i.e., Oracle, Teradata, SQL server etc.

      3. Host Name: Categorizes the list of available connections based on Host Names.

      4. Location: Categorizes the available connections based on the location of the target source system server, i.e., On-Premises and Cloud.

      5. User Name: Categorizes the list of available connections based on the Usernames.

    2. The Select Group Value drop-down displays the values based on the selection made in the Select Group drop-down. For example, if Connection IDP is selected in the Select Group field then the values displayed in the Select Group Value are Detection and Masking.

       

      Check the checkbox available with the connection name. Any number of connections can be selected for scanning. To edit a connection detail, click the Edit icon in the Actions column.

       

      The Test button lets you to test the connection before executing the task. It will show the pop-up on successful completion of testing. The Test button is enabled when you select a connection by checking the checkbox.

       

      Click the Database Object Filter button to filter tables and/or columns. Once filters are defined, then only those databases/tables/columns that match the filter are scanned. The button is enabled when you select a connection by checking the checkbox. The Database Object Filters can be added in two ways i.e., Add/Edit Filters and Upload Filter List.

       

      1. The Add/Edit Filters allows you to apply filter for the selected connection. Once filters have been defined, then only those databases/tables/columns are scanned which matched the criteria. You can define three types of filters i.e., Object Filter, Datatype Filter and Date Range Filter
                             

        1. The Object Filter let you specify filter criteria for scanning the database/tables/columns. Perform the below steps for defining an Object filter.

          1. Check the checkbox available with the connection name in Connection section. You can even search the Database or schema name in the Filter by Schema/DB Name textbox.

             

          2. Apply the Object Filter by specifying table/view operator in Select Table/View Name drop-down, table name in the Table/View Filter field, column operator in Select Column Name drop-down and column name in the Column Filter.

             

            For example, in the above screenshot, the object filter specifies that the table name must contains ‘Discovery’, and the selected table must contain SSNO column.

            There are eight types of operators based on which you can select the table and column name.

            1. Equals: This operator will check whether the given table/column name exist in the selected database/table. It will return the matched records if the condition is fulfilled.

            2. Not Equal to: This operator will return all the records except the given table/column name.

            3. Contains: This operator will return only those tables/columns which name contains the given criteria.

            4. Does not contain: The functionality of this operator is alike Not Equal to operator since it returns all the tables/columns which do not contain the given table/column name.

            5. Starts with: This operator will return all the tables/column which name starts with the given criteria.

            6. Does not start with:  The functionality of this operator is alike Does not contain and Not Equal to, since it will return all the tables/column name which does not start with the given criteria.

            7. Ends with: This operator will return all the tables/column name whose name ends with the given input.

            8. Does not end with:  The functionality of this operator is alike Does not contain and Not Equal to, since it will return all the tables/column name except the one which has been entered.

          3. Click the Add button to add the filter in the Object Filters section of the panel. To edit the details of the filter, click the Edit icon in the Actions column.

             To delete a filter, check the checkbox available with the operator and click Trash button on the top right corner. The delete button is enabled only when a filter is selected.

             

          4. The Test Results section displays the result of the filter when Test button is clicked. This lists down the result matching the filter criteria.

            To test filter, check the checkbox for the filter in the Object Filters section which you want to test. Click the Test button on the top right corner of the Test Results section.

             

          5. Click Save button to make the changes effective, else click Cancel.

        2. The Datatype Filter lets you specify the datatypes to be included or excluded while scanning based on the selection. To apply the Datatype filter, perform the below steps:

           

          1. Select the connections by checking the radio button available with the connection name in Connection section. You can even search the Database or schema name in the Filter by Schema/DB Name textbox.

             

          2. Select the Filter Type from the given option. By default, Include is selected.

             

            1. Include: Select this option to include the columns with selected datatype for scanning.

            2. Exclude: Select this option to exclude the columns with the selected datatype for scanning.

          3. The Additional Datatype field allows you to add a datatype if it’s not present in the panel where all datatypes are displayed. Enter the datatype in the field and click Add button. This will add the datatype in the panel.

             

          4. To select one or more datatypes, check the checkbox available with the datatype name which you want to include or exclude, based on the selection of Filter Type in the above panel.

            Datatypes can be searched based on the string entered in the Search textbox.

             

          5. Click Test button in the Test Results section to test whether any column in the database contains the selected datatype. This functionality will list down all the columns that contain the selected datatype.

             

          6. Click Save button to make the changes effective else, click Cancel button. 

        3. The Date Range Filters lets you specify filter criteria for scanning the database/tables/columns based on the date range specified.

          *Note: The Date Range Filter is visible only for SAS connection.

          To apply date range filters, perform the below steps:

           

          1. Select the connections by checking the radio button available with the connection name in the Connection section.

             

          2. Check the Start Date checkbox. Enter the date manually or choose it from the calendar. By default, the value in the field is set to 1/1/1970.

            Similarly, check the End Date checkbox and enter the date manually or choose it from the calendar. By default, the value in this field is set to current date.

             

          3. Click the Add button to add the defined date filters in the Date Range Filters panel. To edit the details of the filter, click the Pen icon in the Actions column.

            To delete a filter, check the checkbox available before the Connection ID column and click the Trash button on the top right corner of the panel.

             

          4. Once filters are defined, test the filter by clicking the Test button in the Test Results panel. The Test Results panel displays the results of the filters, defined in the Object and Date Range filters.

          5. Click Save button to make the changes effective else, click Cancel button.

      2. The Upload Filter List tab allows you to upload a file containing the list of all columns of a table that either needs to be included or excluded in the scanning.

        To upload a filter list, perform the below steps: 

         

        1. Select the connection from the Choose Connection drop-down. It displays only those connection which are selected in the Select Connections panel of the Add New Task Definition screen.

        2. The Download Sample button is enabled when a connection is selected in the Choose Connection drop-down. Enter the Database, Database Name/Schema Name/Directory Path, Table/View and Column Name in the sample file. The sample file must be in a CSV format. 

           

        3. Select either Inclusion or Exclusion in the Filter List Type. This functionality allows you to specify whether to include or exclude the Database, Database Name/Schema Name/Directory Path, Table/View, and Column name.

        4. Click Choose File button to search and upload the saved sample file containing the list of all Database, Table/View and Column name which will be either excluded or included while scanning.

        5. Click Upload Filter List to add the defined filters. This uploads the correct entries in the Add/Edit Filters tab whereas the incorrect entries are rejected. The incorrect entries are uploaded in the Rejected Filters List which were not in a proper format as specified in the sample file. 

           

        6. To download the list of rejected entries, click Export as CSV button. A downloaded file will contain the list of rejected entries which were not in format as per sample file.

        7. To remove the rejected entries from the Rejected Filters List section, click the Ignore Rejected Entries. This functionality will remove all the rejected entries from the panel.

        8. Click Save button to make the changes effective else, click Cancel button.

  8. The Select Policy panel displays all the Pre-Defined and Customized Policies. Users can select any number of policies while creating or editing a task. Selecting a policy is not a mandatory step, users can also proceed to select individual sensitive types. To know more, refer to section Policy. To select one or more policies, check the checkbox(s) available with the policy name(s).

     

  9. The Pre-Defined and Custom Sensitive panel displays the list of all Pre-Defined and Custom Sensitive Types associated with the selected policy in Select Policy panel. The sensitive type associated with the policy gets selected in this panel and cannot be removed once selected, however any number of sensitive types can be added to the panel.

     

  10. Click Save button to save the task. To execute the task instantly after saving, click Save and Execute button.
    To edit an existing task, select the required task from the list of tasks on the Task Definition screen and click Pen icon in the Actions column.

 There are few common controls through which you can sort the values in the column header, hide and unhide the column headers, filter the data, etc. To know more, visit RDBMS Common Controls.

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.