To create an RC/ORC file structure, enter the following:
Select the number of header rows from the Number of Header Rows drop-down. This is a mandatory field. The maximum value for this field is 1.
Select the type of structure from the Structure Type drop-down. This field offers two values: RC (Row Columnar) and ORC (Optimized Row Columnar).
Enter the pattern of the file in the File Pattern field. This is a mandatory field and is used for structure identification. For example, if the directory contains both .rc and .orc files but you specify ‘.rc’ in this field, then only the .rc files are treated as structured files; all other file types are treated as unstructured.
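The effect of the File Pattern field can be sketched in a few lines of Python. This is an illustration only: it assumes the pattern is compared against the end of the file name, which may differ from how the product actually matches patterns, and the function name classify_files and the sample file names are hypothetical.

```python
def classify_files(file_names, file_pattern):
    """Split file names into structured and unstructured groups.

    Assumption: a file is 'structured' when its name ends with the pattern.
    The product's real matching rules may differ.
    """
    structured, unstructured = [], []
    for name in file_names:
        if name.endswith(file_pattern):
            structured.append(name)
        else:
            unstructured.append(name)
    return structured, unstructured


# Hypothetical directory listing with mixed formats.
structured, unstructured = classify_files(
    ["sales.rc", "sales.orc", "notes.txt"], ".rc"
)
print(structured)    # ['sales.rc']
print(unstructured)  # ['sales.orc', 'notes.txt']
```

With ‘.rc’ as the pattern, the .orc file falls into the unstructured group even though it is a columnar file, which matches the behavior described above.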
The Assign Sensitive Data Type panel allows you to browse and select the columns that need to be defined before applying masking. There are two ways to add a column:
Manually – To enter a column name manually, perform the following steps:
Enter the number of the column. This lets you define the position of the column in the RC/ORC file. This is a mandatory field.
Select the Sensitive Data Type from the drop-down. This displays the list of all the Pre-defined and User Defined Sensitive Data Types.
Select the name of the complex structure from the Complex Structure drop-down. You can select either a Sensitive Data Type or a Complex Structure for a column.
Click the Add button. This adds the details of the column to the list panel.
To delete all the added column details from the panel, select the checkbox(es) provided with the Column Number header and click Delete. The Delete button is enabled only when at least one record is selected. You can also delete an individual record by clicking the Delete icon under the Actions column next to that column detail.
Similarly, to edit the details of a column, click the Pen icon under the Actions column.
To filter or search the structure from the given list, click the Search field. This displays the list of headers by which the list can be filtered.
To remove all the filters, click the x Clear Filters button. To remove individual filters, click the Close button next to the applied filter.
Click the Save button. This saves the structure details, which are then displayed on the Structure List screen. Click the Cancel button if you do not want to save the changes.
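As a rough mental model of what the manual steps capture, each entry in the list panel can be thought of as a record with a mandatory column number and exactly one of a Sensitive Data Type or a Complex Structure. The sketch below is not the product's data model; the class name ColumnDefinition and the values "Email" and "AddressStruct" are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnDefinition:
    """Hypothetical record for one row in the Assign Sensitive Data Type panel."""

    column_number: int                         # position of the column in the RC/ORC file (mandatory)
    sensitive_data_type: Optional[str] = None  # chosen from the Sensitive Data Type drop-down
    complex_structure: Optional[str] = None    # chosen from the Complex Structure drop-down

    def __post_init__(self):
        # The guide says you select either one or the other, so require exactly one.
        chosen = [v for v in (self.sensitive_data_type, self.complex_structure) if v]
        if len(chosen) != 1:
            raise ValueError(
                "Select either a Sensitive Data Type or a Complex Structure"
            )


# Two illustrative column definitions.
columns = [
    ColumnDefinition(1, sensitive_data_type="Email"),
    ColumnDefinition(3, complex_structure="AddressStruct"),
]
print([c.column_number for c in columns])  # [1, 3]
```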
Import File Structure – You can also define the structure by importing the data. To define a structure, perform the following steps:
Click the Browse file option.
This opens a side window that displays the list of all the objects in a cluster for the selected module. This screen is divided into two panels:
The top panel displays the information for the selected module and cluster. To change the module, click the Select Module drop-down. This lists all the modules of the application.
There are five modules: File, Hadoop, AWS, Azure, and GCS. Similarly, to change the cluster, click the Select Cluster drop-down.
In the same panel, you can also view the IDP status for each selected module.
The bottom panel, Hadoop File Browser, displays the list of all the directories and objects for the selected module and cluster. It is divided into two panels:
The left panel displays the list of all the directories associated with the selected cluster. You can also search for a directory by typing its name in the field provided with the Expand button. The Expand button is enabled only when you enter search text in the text box.
The right panel displays the list of all the files and folders stored in the selected directory. To select a file or object, click its name.
To refresh this panel with the updated information, click the Refresh button in the top-right corner of the panel. If the Is Recursive feature is enabled, you can search for any file in the parent directory and its subdirectories. If this feature is disabled, the search covers only the parent directory.
You can also configure the columns in this panel. Using this option, you can rearrange the columns as per your requirement. For more information, see Common Controls.
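The difference between a recursive and a non-recursive search can be illustrated with Python's pathlib. This is a sketch under stated assumptions: it uses a local directory as a stand-in for cluster storage, the function name search_files is hypothetical, and matching by name fragment is an assumption about how the search box behaves.

```python
from pathlib import Path


def search_files(parent_dir, name_fragment, is_recursive):
    """Return files whose names contain name_fragment.

    is_recursive=True  -> look in parent_dir and all of its subdirectories.
    is_recursive=False -> look in parent_dir only.
    """
    root = Path(parent_dir)
    candidates = root.rglob("*") if is_recursive else root.glob("*")
    return sorted(
        p for p in candidates if p.is_file() and name_fragment in p.name
    )
```

For a directory that holds a.orc directly and b.orc in a subdirectory, search_files(directory, ".orc", False) would return only a.orc, while passing True would return both files.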
After selecting the object, click the Select button. This opens a new panel in which you enter the number of rows to use for sampling and set the Show Sample Rows option.
Enter the number of rows to sample in the Rows To Sample field. Before the data is imported, the file is sampled based on the value specified in this field.
The Show Sample Rows field allows you to view the sample values stored in the columns of the selected file. If the value is set to True, the sample values are displayed for each column that is being imported. If the value is set to False, the sample values are not displayed for that structure.
Now, click the Import Columns button. This imports all the columns of the selected file or object into the Filtered Columns panel.
Select the sensitive type from the Sensitive Data Type drop-down for each column that needs to be added to the structure. The Sensitive Data Type drop-down displays the list of all Pre-defined and User Defined sensitive types.
To add filtered column details, select the checkbox for each column that you want to add.
Click the Save button. This adds the selected column details to the Assign Sensitive Data Type panel.
Click the Save button. This saves the structure details, which are then displayed on the Structure List screen. Click the Cancel button if you do not want to save the changes.
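The interaction between Rows To Sample and Show Sample Rows in the import steps above can be sketched as follows. This is an illustration rather than the product's implementation: it assumes sampling takes the first N rows, that columns are numbered from 1, and that rows are already available as Python tuples. The function name sample_columns and the sample data are hypothetical.

```python
from itertools import islice


def sample_columns(rows, rows_to_sample, show_sample_rows):
    """Build one entry per column from the first rows_to_sample rows.

    Sample values are attached only when show_sample_rows is True.
    """
    sampled = list(islice(rows, rows_to_sample))
    column_count = max((len(row) for row in sampled), default=0)
    columns = []
    for index in range(column_count):
        entry = {"column_number": index + 1}  # assumption: 1-based numbering
        if show_sample_rows:
            entry["sample_values"] = [
                row[index] for row in sampled if index < len(row)
            ]
        columns.append(entry)
    return columns


rows = [("alice", "a@x.com"), ("bob", "b@x.com"), ("carol", "c@x.com")]
print(sample_columns(rows, 2, True))
# [{'column_number': 1, 'sample_values': ['alice', 'bob']},
#  {'column_number': 2, 'sample_values': ['a@x.com', 'b@x.com']}]
print(sample_columns(rows, 2, False))
# [{'column_number': 1}, {'column_number': 2}]
```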