Using the PK Protect Google Cloud IDP, a user can provision a Dataproc cluster to access GCS data. The Google Cloud IDP is used for browsing the GCS buckets. The GCS IDP allows PK Protect to run data detection and protection tasks on the GCS repository.
Access the GCS Configuration screen by clicking the GCS Configuration option in the left side pane. The GCS Configuration screen is depicted below:
The top panel displays the cluster information along with its status. Clicking a configuration populates the bottom panel with the details of that cluster.
Perform the following steps to configure a cluster:
Click Configure New Cluster.
The Add/Edit Configuration dialog box appears.
The fields are described below:
Cluster Type: Select either Provision or Manual.
Compute Cluster Name: Enter a name for the cluster.
GCS Cloud IDP: Select the GCS Cloud IDP.
GCS IDP: Select the GCS IDP.
Note: This field appears only when you choose the Manual option as the Cluster Type. It lets you add a GCS cluster manually from the Admin.
Master Node Machine Type: Select the machine type for the Master node from the dropdown.
Worker Node Machine Type: Select the machine type for the Worker node from the dropdown.
Subnet Id: Define the subnetwork TCP/IP address. Specify this value only when the cluster must be created on a specific subnet.
Worker Node Count: Set the number of Worker nodes to be created.
Network Tags: Specify the network tags you want to associate with an instance.
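As a way to review a configuration before entering it in the dialog box, the fields above can be sketched as a small data structure. This is an illustrative model only, not part of PK Protect: the class name ClusterConfig, its attribute names, the default worker count, and the validation rules are assumptions. In particular, the rule that GCS IDP is required for the Manual type is inferred from the note above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClusterConfig:
    """Hypothetical mirror of the Add/Edit Configuration fields."""
    cluster_type: str                          # "Provision" or "Manual"
    compute_cluster_name: str
    gcs_cloud_idp: str
    gcs_idp: Optional[str] = None              # assumed to apply to Manual only
    master_machine_type: Optional[str] = None
    worker_machine_type: Optional[str] = None
    subnet_id: Optional[str] = None            # only for a specific subnet
    worker_node_count: int = 2                 # assumed default, not from the product
    network_tags: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; empty means no issue found."""
        issues = []
        if self.cluster_type not in ("Provision", "Manual"):
            issues.append("Cluster Type must be Provision or Manual")
        if not self.compute_cluster_name.strip():
            issues.append("Compute Cluster Name is empty")
        if self.cluster_type == "Manual" and not self.gcs_idp:
            issues.append("GCS IDP is expected for the Manual type")
        if self.worker_node_count < 0:
            issues.append("Worker Node Count cannot be negative")
        return issues


# Example values are placeholders; "n1-standard-4" is a Compute Engine machine type.
example = ClusterConfig(
    cluster_type="Provision",
    compute_cluster_name="demo-cluster",
    gcs_cloud_idp="my-cloud-idp",
    master_machine_type="n1-standard-4",
    worker_machine_type="n1-standard-4",
    network_tags=["pk-protect"],
)
print(example.problems())
```

An empty list from problems() means the sketch found nothing to flag. The product may apply additional or different checks.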
Perform the following steps to provision a cluster with the given configuration:
Select the desired configuration.
Click Provision Compute Cluster.
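This guide does not describe what PK Protect does internally when a cluster is provisioned. For orientation only, the configuration fields correspond roughly to options of the gcloud dataproc clusters create command. The sketch below builds such a command line without running it. The function name build_create_command and all example values are hypothetical, and the region argument is an assumption because it is not one of the fields listed in the dialog box.

```python
from typing import List, Optional


def build_create_command(
    name: str,
    region: str,                      # assumption: not a field in the dialog box
    master_type: str,
    worker_type: str,
    num_workers: int,
    subnet: Optional[str] = None,
    tags: Optional[List[str]] = None,
) -> List[str]:
    """Build (but do not execute) a gcloud command for a Dataproc cluster."""
    cmd = [
        "gcloud", "dataproc", "clusters", "create", name,
        f"--region={region}",
        f"--master-machine-type={master_type}",   # Master Node Machine Type
        f"--worker-machine-type={worker_type}",   # Worker Node Machine Type
        f"--num-workers={num_workers}",           # Worker Node Count
    ]
    if subnet:
        cmd.append(f"--subnet={subnet}")          # Subnet Id, if specified
    if tags:
        cmd.append("--tags=" + ",".join(tags))    # Network Tags
    return cmd


command = build_create_command(
    name="demo-cluster",
    region="us-central1",
    master_type="n1-standard-4",
    worker_type="n1-standard-4",
    num_workers=2,
    tags=["pk-protect"],
)
print(" ".join(command))
```

Building the argument list separately from running it makes it easy to compare the values with what was entered in the configuration before anything is created.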
Perform the following steps to edit a cluster:
Select the configuration you want to edit.
Click Edit Cluster.
Click Save after making the desired changes.
Perform the following steps to destroy a cluster:
Select the configuration you want to destroy.
Click Destroy Compute Cluster.