
Automatic Node Repair Disabled

Ensure that automatic node repair is enabled on all node pools in Kubernetes clusters

Risk Level: Low

Description

This plugin checks that automatic node repair is enabled on every node pool in your Kubernetes clusters. With auto-repair enabled, GKE performs periodic health checks on the nodes in each node pool. If a node fails consecutive health checks, GKE starts a repair operation for that node. This helps keep the nodes in your clusters healthy and functional.

About the Service

Google Cloud Kubernetes Engine:

The Google Cloud Kubernetes Engine (GKE) is a Kubernetes-based service that includes a control plane, nodes that house pods, and Google Cloud services. It helps you modernize your applications by offering a platform for deploying, managing, and scaling containerized workloads. You can interact with GKE through the Google Cloud Console or kubectl.

Impact

If the automatic node repair option is disabled for a node pool, GKE will not start repair operations for nodes that fail their health checks, leaving unhealthy nodes in your cluster unattended. Over time this reduces the reliability of the cluster, because services that run on the unhealthy nodes will be degraded or unavailable.

Steps to Reproduce

Using GCP Console:

  1. Log in to your GCP Console.
  2. From the top navigation bar, select the GCP project you want to investigate.

  3. From the navigation panel on the left side of the console, go to Kubernetes Engine and select Clusters.
  4. Select the cluster you want to investigate from the list of clusters displayed and go to the NODES tab of the selected cluster.
  5. Under the Node pools section, select the node pool you want to verify from the list of node pools displayed in the table.
  6. In the Management section, check the status of Auto-repair. If it shows Disabled, the automatic node repair feature is turned off for this node pool of the selected cluster.
  7. Repeat steps 5 and 6 for all the node pools present in the selected cluster.
  8. Repeat steps 4 to 7 for all the clusters you want to investigate in the selected project.
  9. If you have multiple projects that you want to investigate, repeat steps 2-8 for each project in your GCP console.
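
The console check above can also be scripted. The following is a minimal sketch, not an official tool: it assumes the gcloud CLI is installed and authenticated, and that the JSON returned by `gcloud container clusters list --format=json` follows the GKE API cluster resource, where each entry in `nodePools` carries a `management.autoRepair` flag. The function names `list_clusters` and `pools_without_auto_repair` are hypothetical and chosen only for illustration.

```python
import json
import subprocess


def list_clusters(project_id):
    """Fetch all clusters in a project as parsed JSON.

    Assumes an installed, authenticated gcloud CLI.
    """
    result = subprocess.run(
        ["gcloud", "container", "clusters", "list",
         "--project", project_id, "--format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def pools_without_auto_repair(clusters):
    """Return (cluster, location, node pool) tuples where auto-repair is off."""
    findings = []
    for cluster in clusters:
        for pool in cluster.get("nodePools", []):
            management = pool.get("management", {})
            # Assumption: a missing autoRepair field means the feature is disabled.
            if not management.get("autoRepair", False):
                findings.append(
                    (cluster.get("name"), cluster.get("location"), pool.get("name"))
                )
    return findings
```

Calling `pools_without_auto_repair(list_clusters("my-project"))` with your own project ID in place of the placeholder `my-project` would list every node pool that needs attention in that project. Repeat the call for each project you want to investigate.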

Steps for Remediation

Determine whether you truly need the automatic node repair feature to be disabled. If not, enable it using the steps given below.

Using GCP Console:

  1. Log in to your GCP Console.
  2. From the top navigation bar, select the GCP project you want to reconfigure.
  3. From the navigation panel on the left side of the console, go to Kubernetes Engine and select Clusters.
  4. Select the cluster you want to reconfigure from the list of clusters displayed and go to the NODES tab of the selected cluster. (If you aren’t sure which node pool needs to be reconfigured, follow the Steps to Reproduce listed above to determine which to choose.)
  5. Under the Node pools section, select the node pool you want to reconfigure from the list of node pools displayed in the table.
  6. Click the EDIT button on the top navigation bar to reconfigure the settings.
  7. In the Management section, check the Enable auto-repair checkbox to enable automatic node repair.
  8. Click Save to save the changes to the node pool.
  9. Repeat steps 5 to 8 for all the node pools that you want to reconfigure in the selected cluster.
  10. Repeat steps 4 to 9 for all the clusters you want to reconfigure in the selected project.
  11. If you have multiple projects, repeat steps 2-10 for each project in your GCP console.
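
The same change can be made from a script instead of the console. The sketch below is illustrative only: it builds a `gcloud container node-pools update` command with the `--enable-autorepair` flag and, by default, returns the command as text without running it. It assumes an installed and authenticated gcloud CLI. The function names and the `dry_run` parameter are hypothetical, and the project, location, cluster, and node pool values are placeholders you would replace with your own.

```python
import shlex
import subprocess


def enable_auto_repair_command(project_id, location, cluster, pool):
    """Build the gcloud argument list that enables auto-repair on one node pool."""
    return [
        "gcloud", "container", "node-pools", "update", pool,
        "--cluster", cluster,
        "--location", location,   # zone or region of the cluster
        "--project", project_id,
        "--enable-autorepair",
    ]


def enable_auto_repair(project_id, location, cluster, pool, dry_run=True):
    """Return the command as a string; run it only when dry_run is False."""
    command = enable_auto_repair_command(project_id, location, cluster, pool)
    if not dry_run:
        # Raises CalledProcessError if gcloud exits with a non-zero status.
        subprocess.run(command, check=True)
    return shlex.join(command)
```

Keeping the dry run as the default lets you review each command before it changes a node pool. Pass `dry_run=False` once the output looks right, and call the function once per node pool that was found with auto-repair disabled.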