Google Cloud Dataflow

Dataflow Jobs Encryption Not At Desired Level

Ensures that Dataflow jobs have CMEK encryption enabled.

Risk Level: Medium

Description

This plugin ensures that Google Cloud Dataflow jobs are encrypted using Customer-Managed Encryption Keys (CMEK). CMEK gives you more control over key operations than Google-managed encryption keys. You create and manage these keys yourself using the Google Cloud Key Management Service (Cloud KMS). For a Dataflow job, the key is used to encrypt the data that the job stores at rest.

About the Service

Google Cloud Dataflow:
Dataflow is a serverless, fast, and cost-effective data-processing service for both stream and batch data. It is particularly useful for processing and enriching large amounts of data, and it assists with data collection, processing, and analysis. It includes autoscaling and offers portability. You can use the pre-built templates, create your own, or write SQL statements from BigQuery to develop pipelines. Security is supported through CMEKs, private IPs, and VPCs.

Impact

Google-managed encryption keys are the default whenever a new Dataflow job is created. However, they offer very little flexibility because Google handles key management entirely, keeping it transparent to the client. CMEKs, on the other hand, let you tailor encryption to your specific requirements, such as the key rotation period and protection level, which gives you greater control over the security of your data.

Steps to Reproduce

Using GCP Console-

  1. Log in to your GCP Console.
  2. From the top navigation bar, select the GCP project you want to investigate.
  3. From the navigation panel on the left side of the console, go to Dataflow and click on Jobs.


  4. In the list of jobs displayed, click on the job you wish to investigate and click on the toggle button on the right corner of the page.
  5. On the Job info page, check the value of the Encryption type. If it displays Google-managed key, the selected Dataflow job is not encrypted with a customer-managed key.
  6. Repeat steps 4 and 5 for all the Dataflow jobs you want to investigate in the selected project.
  7. If you have multiple projects, repeat steps 2 to 6 for each project in your GCP Console. 
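
The console check above can also be scripted. The following is a minimal sketch, not a definitive implementation. It assumes you have already fetched full job descriptions as JSON (for example through the Dataflow REST API with a full job view) and parsed them into Python dictionaries. It relies on the `environment.serviceKmsKeyName` field of the Dataflow job resource, which is set only when a CMEK is in use. The function names and the sample data are hypothetical and exist only for illustration. Summary-only job listings may omit the `environment` block, in which case every job would be flagged, so use full descriptions.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical sample data shaped like Dataflow job descriptions.
# Real descriptions contain many more fields.
SAMPLE_JOBS: List[Dict[str, Any]] = [
    {
        "id": "job-a",
        "environment": {
            "serviceKmsKeyName": (
                "projects/my-project/locations/us-central1/"
                "keyRings/my-ring/cryptoKeys/my-key"
            )
        },
    },
    {"id": "job-b", "environment": {}},
]


def uses_cmek(job: Dict[str, Any]) -> bool:
    """Return True if the job description names a Cloud KMS key."""
    environment = job.get("environment") or {}
    return bool(environment.get("serviceKmsKeyName"))


def jobs_without_cmek(jobs: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the IDs of jobs that rely on Google-managed keys."""
    return [job.get("id", "<unknown>") for job in jobs if not uses_cmek(job)]


if __name__ == "__main__":
    for job_id in jobs_without_cmek(SAMPLE_JOBS):
        print(f"Job {job_id} is not encrypted with a CMEK")
```

Running the check per project and per region mirrors steps 6 and 7 above.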

Steps for Remediation

Determine whether or not you truly require customer-managed encryption to be disabled. If not, make the necessary changes to enable it using the steps below.
Note: It is not possible to update the encryption of an existing Dataflow job. Instead, create a new job with the same configuration to replace it.

Using GCP Console-

  1. Log in to your GCP Console.
  2. From the top navigation bar, select your desired GCP project.
  3. To encrypt your Dataflow job using customer-managed keys, make sure that you first create a new key that can be used for this.
    NOTE: If you already have a CMEK that you wish to use, skip to step 10.
  4. From the navigation panel on the left side of the console, go to Security under the More products section and select Key management.
  5. To create a key, you must first create a key ring. Click on the CREATE KEY RING button on the top bar. 

    NOTE: If you already have a key ring created that you wish to use, skip to step 7.
  6. In the Create key ring page, enter your desired Key ring name and select your preferred location type. Click the CREATE button to create the new key ring.
  7. Go to the newly created key ring and select the CREATE KEY button to create a new key.
  8. In the Create key page, select Generated key as the type of key you wish to create. Next, enter your preferred key name, choose your desired protection level, and select purpose as Symmetric encrypt/decrypt.
  9. Choose your required configurations for the key rotation period and click on CREATE to create the key.
  10. From the navigation panel on the left side of the console, go to Dataflow and click on Jobs.
  11. Select the job you want to recreate from the list of jobs available and note down all its configuration settings. (If you aren’t sure which Dataflow job needs to be reconfigured, follow the Steps to Reproduce listed above to determine which to choose.)
  12. Go back to the Jobs page and click on either of the CREATE buttons. 
  13. Fill in a unique name for the new job and the rest of the configurations according to the original job.
  14. Under Encryption, select the Use a customer-managed encryption key (CMEK) option.
  15. From the drop-down box available to select a key, select your desired key. If no valid keys are found, click on Can’t see your key? Enter key resource name to enter your key resource name.


  16. In the Enter key resource name pop-up box, enter your desired key resource in the specified format and click SAVE.

    Note: To find the resource name of the key, go to the navigation panel on the left side of the console, go to Security under the All products section, and select Key management. Select your desired key ring and, from the list of keys in that key ring, click the actions button (three-dot icon) and select the Copy resource name option.
  17. Configure the rest of the settings based on the original Dataflow job and click RUN JOB to create the new job.
  18. Repeat steps 3 to 17 for all the Dataflow jobs you want to reconfigure in the selected project.
  19. If you have multiple projects, repeat steps 2 to 18 for each project in your GCP console.
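
If you launch pipelines from code instead of the console, the key can be supplied as a pipeline option. The sketch below is illustrative only and makes a few assumptions: the job is an Apache Beam Python pipeline run on Dataflow, where the `--dataflow_kms_key` option names the CMEK, and the key resource name follows the `projects/.../locations/.../keyRings/.../cryptoKeys/...` format that the console asks for in step 16. The helper names and all project, bucket, and key values are hypothetical. The code only builds and validates the argument list; passing it to your pipeline is left to your own launcher.

```python
import re
from typing import List

# Expected shape of a Cloud KMS key resource name.
_KEY_PATTERN = re.compile(
    r"^projects/[^/]+/locations/[^/]+/keyRings/[^/]+/cryptoKeys/[^/]+$"
)


def kms_key_resource_name(project: str, location: str, key_ring: str, key: str) -> str:
    """Assemble the full resource name of a Cloud KMS key."""
    return (
        f"projects/{project}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


def is_valid_key_resource_name(name: str) -> bool:
    """Check that a string matches the key resource name format."""
    return bool(_KEY_PATTERN.match(name))


def dataflow_pipeline_args(
    project: str, region: str, temp_location: str, kms_key: str
) -> List[str]:
    """Build Beam pipeline arguments for a CMEK-protected Dataflow job."""
    if not is_valid_key_resource_name(kms_key):
        raise ValueError(f"Not a valid key resource name: {kms_key!r}")
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        f"--dataflow_kms_key={kms_key}",
    ]


if __name__ == "__main__":
    # All values below are placeholders.
    key = kms_key_resource_name("my-project", "us-central1", "my-ring", "my-key")
    args = dataflow_pipeline_args(
        "my-project", "us-central1", "gs://my-bucket/tmp", key
    )
    print("\n".join(args))
```

Validating the key name before launch catches a malformed resource name early, since the encryption of a job cannot be changed once it has been created.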