Text Files

To create a text file structure, enter the following:

  1. Enter the number of rows in the Number of Header Rows drop-down. This is a mandatory field. The maximum value for this field is 9.

  2. Select the structure type from the Text Structure Type drop-down. This field is mandatory. There are four values: Default, HiveStruct, HiveArray, and HiveMap.

*Note: The fields on this screen change according to the structure type selected in the Text Structure Type drop-down.

  1. Check the Use Delimiter/Qualifier checkbox if you do not want to set the Position Counter, Offset, and Length values for the columns. When this checkbox is checked, the Offset and Length fields are hidden in the Assign Sensitive Data Type panel. This option is visible when you select Default or HiveStruct in the Text Structure Type drop-down.

  2. Enter the delimiter for the column in the Column Delimiter field for the selected Text Structure Type, for example a comma or a pipe. This is a mandatory field. This field is disabled when you uncheck the Use Delimiter/Qualifier option. This option is visible when you select Default or HiveStruct in the Text Structure Type drop-down.

  3. Enter the qualifier for the column in the Qualifier field for the selected Text Structure Type. For example, comma (,), dollar ($), double quote ("), caret (^), etc. This field is optional and visible only when you select Default or HiveStruct in the Text Structure Type drop-down. It is disabled when you uncheck the Use Delimiter/Qualifier option.

    For instance, if your CSV file is comma delimited and a data element contains a comma, then the qualifier helps preserve the data by considering it a single unit and not using the comma in the data element as a separator.
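    The effect of a qualifier can be sketched with Python's standard csv module. This is only an illustration of the general idea, not the product's parser; the sample data and the parse helper are hypothetical.

```python
import csv
import io

# Hypothetical sample: the name field contains the column delimiter (a comma).
raw = 'id,name,city\n1,"Smith, John",Pune\n'


def parse(text, delimiter=",", qualifier='"'):
    """Split delimited text into rows, honoring an optional qualifier."""
    if qualifier:
        reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                            quotechar=qualifier)
    else:
        # No qualifier: every delimiter starts a new field.
        reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
    return list(reader)


with_qualifier = parse(raw)
without_qualifier = parse(raw, qualifier=None)

print(with_qualifier[1])     # name stays one field
print(without_qualifier[1])  # name is split at the embedded comma
```

    With the qualifier set, the data row has three fields and the name is kept whole. Without it, the same row splits into four fields.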

  4. Select the value in the Position Counter drop-down. This is a mandatory field when you uncheck the Use Delimiter/Qualifier option on the screen. This drop-down has a single value, Character. This field is visible when you select the Default option in the Text Structure Type drop-down.

  5. Enter the pattern of the file in the File Pattern field. This field is used for structure identification. This is a mandatory field. If the directory contains files of different formats, such as docx, csv, and txt, but you specify ‘.txt’ in this field, then only the .txt files are treated as structured files and all remaining file types are treated as unstructured.

  6. Enter the delimiter in the Array Delimiter field. This field is visible when you select HiveArray in the Text Structure Type drop-down. This is a mandatory field.

  7. Select the sensitive type from the Sensitive Data Type drop-down. This field displays the list of all pre-defined and user defined Sensitive Data Types. This field is visible when you select HiveArray or HiveMap in the Text Structure Type drop-down.

  8. Enter the value for the Key-Value Delimiter field. The key-value delimiter separates the set of key-value pairs. This field is visible when you select HiveMap in the Text Structure Type drop-down.

  9. Enter the value for the Element Delimiter field. This field is visible when you select HiveMap in the Text Structure Type drop-down.

  10. Select the sensitive data type for the key and for the value in the Sensitive Data Type for Key and Sensitive Data Type for Value drop-downs. Both drop-downs display the list of all the pre-defined and user defined Sensitive Data Types. These fields are visible only when you select HiveMap in the Text Structure Type drop-down.

  11. Check the Add Key checkbox if you want to add the Key Name and the Sensitive Data Type for Key & Value. This field is visible only when HiveMap is selected in the Text Structure Type drop-down.

    To add multiple keys, click the Add button. To remove any key, click the Remove button.
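    How the HiveArray and HiveMap delimiters divide a value can be sketched as follows. This assumes that the Element Delimiter separates one key-value pair from the next and that the Key-Value Delimiter separates a key from its value. The page describes these fields only briefly, so treat that reading as an assumption. The function names, sample values, and type labels are hypothetical.

```python
def parse_hive_array(value, array_delimiter):
    """Split an array-style value into its elements."""
    return value.split(array_delimiter)


def parse_hive_map(value, element_delimiter, key_value_delimiter):
    """Split a map-style value into a dict of key -> value."""
    pairs = {}
    for element in value.split(element_delimiter):
        if not element:
            continue  # ignore empty elements, e.g. from a trailing delimiter
        key, _, val = element.partition(key_value_delimiter)
        pairs[key] = val
    return pairs


# Hypothetical values: '|' as Array Delimiter, ',' as Element Delimiter,
# ':' as Key-Value Delimiter.
phones = parse_hive_array("555-0100|555-0101", "|")
contact = parse_hive_map("email:a@example.com,ssn:000-00-0000", ",", ":")

# Hypothetical per-key assignment, similar in spirit to the Add Key option.
key_types = {"ssn": "SSN"}
sensitive = {k: v for k, v in contact.items() if k in key_types}

print(phones)
print(contact)
print(sensitive)
```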

  12. The Assign Sensitive Data Type panel allows you to browse and select the file or object that needs to be defined before applying masking. This panel is visible only when you select Default or HiveStruct as the Text Structure Type. There are two ways to add a column:

    1.  Manually – To add a column detail manually, perform the following steps: 

      1. Enter the name of the column. This is a mandatory field. 

      2. Enter the number of the column. This lets you define the position of the column in the RC/ORC file. This is a mandatory field. 

      3. Select the Sensitive Data Type from the drop-down. This displays the list of all pre-defined and custom defined Sensitive Data Types. 

      4. Click the Add button. This will add the details of the column in the list panel.


        To delete a record from the panel, check the checkbox next to the column name and click Delete. The Delete button is enabled only when at least one record has been selected. You can also delete an individual record by clicking the Trash icon available under the Actions column.


        Similarly, to edit the details of the column, click the Pen icon under the Actions column.

         

        To filter or search the structures in the given list, click the Search field. This displays the list of headers on which a filter can be applied.


        To remove all the filters, click the x Clear Filters option. To remove individual filters, click the Close button next to the applied filter.

    2. Import the File Structure – You can also import the file structure by clicking the Browse File option.

      Follow the below steps to import a file structure:

      1. Click the Browse File button.

      2. This opens the side window which displays the list of all the objects or files in the selected cluster for the selected module. 


        This screen is divided into two panels:

        1. The top panel displays the information for the selected module and cluster. To change the module, click the Select Module drop-down. This lists all the modules of the application. There are five modules: File, Hadoop, AWS, Azure, and GCS. Similarly, to change the cluster, click the Select Cluster drop-down.

          In the same panel, you can also view the IDP status for each selected module.

        2. The bottom panel, File Browser, displays the list of all the directories and objects for the selected module and cluster. This panel is further divided into two panels:

          *Note: The browser name changes based on the module selected in the Select Module drop-down.

          1. The left panel displays the list of all the directories associated with the selected cluster. You can also search for a directory by typing its name in the field provided just before the Expand button. The Expand button is enabled only when you enter a search term in the textbox.


          2. The right panel displays the list of all the files and folders which are stored in the selected directory. To select any file or object, click the specific file name.


            To refresh this panel with the updated information, click the Refresh button on the top right corner of the panel.


            If the Is Recursive feature is enabled, you can search for a file in the parent directory and its subdirectories. If this feature is disabled, only the parent directory is searched.


            After selecting the object, click the Select button. This redirects you to a new panel in which you enter the number of rows to use for sampling and select the Show Sample Rows option.

  13. Provide a numeric value in the Header Rows field. The value in this field specifies the row number where the column headers are defined in the file. This is a mandatory field. The maximum allowed value is 9.

  14. Enter the number of rows in the Rows To Sample field. Before the import, the data is sampled based on the value specified in this field.

  15. The Show Sample Rows field allows you to view the sample values stored in the columns of the selected file. If the value is set to True, the sample values are displayed for each column that is being imported. If the value is set to False, the sample values are not displayed for that structure.


    For example, in the below screenshot the Header Rows value will be set to 5 as the column headers are defined in the fifth row.
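    How Header Rows, Rows To Sample, and Show Sample Rows relate to each other can be sketched in Python. This is a minimal illustration under stated assumptions, not the product's import logic: the sample_columns function and the file content are hypothetical, and the file is assumed to be comma delimited.

```python
import csv
import io
from itertools import islice


def sample_columns(text, header_row, rows_to_sample,
                   show_sample_rows=True, delimiter=","):
    """Return (column names, sample rows) for delimited text.

    header_row is 1-based: the row number that holds the column headers.
    Rows above it are skipped.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = islice(reader, header_row - 1, None)  # skip rows before the header
    header = next(rows)
    samples = list(islice(rows, rows_to_sample)) if show_sample_rows else []
    return header, samples


# Hypothetical file: four preamble rows, column headers on the fifth row.
text = (
    "Report\n"
    "Generated by export\n"
    "Owner: demo\n"
    "----\n"
    "id,name\n"
    "1,Ann\n"
    "2,Bob\n"
    "3,Cy\n"
)

header, samples = sample_columns(text, header_row=5, rows_to_sample=2)
print(header)
print(samples)
```

    Here the header is read from the fifth row and only the first two data rows are returned as samples. Passing show_sample_rows=False returns the column names with no sample values.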

  16. Now, click the Import Columns button. This will import all the columns of the selected file or object in the Filtered Columns panel.

  17. Select the sensitive type from the Sensitive Data Type drop-down for each column that needs to be added to the structure. The Sensitive Data Type drop-down displays the list of all the pre-defined and user defined sensitive types.

    To add filtered column details, check the checkbox for each column to be added.

  18. Click the Save button. This adds the selected column details in the Assign Sensitive Data Type panel.

  19. Click the Save button. This will save the structure details. The details are displayed on the Structure List screen. Click the Cancel button if you do not want to save the structure details.
