Troubleshooting Kubernetes Pods: Stuck in a "Pending" State

If your Pods are stuck in a pending state, you may need to do some quick troubleshooting. Here are the steps to follow to get your Pods back up and running.

Patrick Londa
Author
Jan 19, 2022
So you’re using Kubernetes to manage your containerized services, but you’ve run into a snag. Your project isn’t loading, and the Pods are stuck in a pending state. Fortunately, Kubernetes has helpful debugging tools that can readily streamline the troubleshooting process. Use this step-by-step guide to troubleshoot Kubernetes Pods stuck in a pending state.

What Does “Pending” Mean?

Kubernetes Pods are left in a “Pending” state when they cannot be scheduled onto a node. The “kubectl describe pods” command displays messages from the scheduler explaining why a Pod could not be scheduled.

How Does a Pod Become Stuck in a “Pending” State?

There are two common reasons a Pod fails to be scheduled to a node. First, it may request a hostPort, which limits it to nodes where that port is still available. Second, the cluster may have insufficient resources (usually CPU or memory) to satisfy the Pod’s requests.
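For example, a Pod spec that binds a hostPort looks roughly like the sketch below; the names, image, and port number are placeholders. Because only one Pod per node can use a given hostPort, a Pod like this can only be scheduled onto nodes where that port is still free.

apiVersion: v1
kind: Pod
metadata:
  name: web-pod                # placeholder name
spec:
  containers:
    - name: web
      image: nginx:1.25        # placeholder image
      ports:
        - containerPort: 80
          hostPort: 8080       # only one Pod per node can bind hostPort 8080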


Manually Troubleshooting Pods Stuck in “Pending”

Now that you understand more about “stuck Pods”, follow these steps to manually troubleshoot a Kubernetes pod stuck in a pending state.

Step 1: Diagnosing the Issue

The first step in any kind of Kubernetes troubleshooting is to run the command: 

kubectl describe pods

This command returns a basic description of each of your Pods, including their state. In the Events section of the output, you’ll also see whether the scheduler rejected the Pod because of insufficient CPU, memory, or other resources. This is one of the most likely reasons for a Pod remaining in the “pending” state.
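If you are not sure which Pods are affected, you can narrow things down first; the commands below use standard kubectl flags, and the Pod and namespace names are placeholders:

# List all Pods that have not been scheduled yet
kubectl get pods --all-namespaces --field-selector=status.phase=Pending

# Describe one of them; the Events section at the bottom explains the scheduling failure,
# for example a message like "0/3 nodes are available: 3 Insufficient cpu."
kubectl describe pod <pod-name> -n <namespace>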

Step 2: Scale Out or Scale Up

If you have reached resource limits, then you can increase capacity by scaling out or scaling up.

You scale out by adding more worker nodes to the cluster. You can do this in a variety of ways depending on which cloud infrastructure you are using. As a starting point, the Kubernetes documentation includes a basic guide on how to add nodes to an existing cluster.

To scale up, you instead increase the memory or CPU on your existing nodes.
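Before adding capacity, it can help to confirm that the nodes really are full. One quick check is to compare what each node has already committed against its allocatable resources; the second command below assumes the metrics-server addon is installed:

# Show how much CPU and memory is already requested on each node
kubectl describe nodes | grep -A 8 "Allocated resources"

# Show current CPU and memory usage per node (requires metrics-server)
kubectl top nodes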

Step 3: Reduce Your Resource Requests

If you don’t want to add capacity by scaling out or scaling up, another option is to reduce your existing resource requests. You can make this change by editing the following fields in your manifest YAML file:

  • spec.containers[].resources.requests.cpu
  • spec.containers[].resources.requests.memory
  • spec.containers[].resources.requests.hugepages-<size>
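For example, a Deployment with trimmed-down requests might look like the sketch below; the name, image, and values are placeholders rather than recommendations:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: example-app:1.0   # placeholder image
          resources:
            requests:
              cpu: "250m"          # reduced CPU request
              memory: "256Mi"      # reduced memory request

Apply the updated manifest with kubectl apply -f <manifest>.yaml, then watch scheduling progress with kubectl get pods -w.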

After you apply these changes, the Deployment will request fewer resources, making it easier for the scheduler to place its Pods.

Another option that has a similar effect is to remove unneeded Deployments and resources to free up capacity. Cleaning up your resources is a good regular practice regardless of whether you run into errors like this, since it can also reduce costs.
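For instance, you can review what is deployed and remove anything obsolete; the Deployment name below is a placeholder:

# See what is currently deployed across all namespaces
kubectl get deployments --all-namespaces

# Delete a Deployment you no longer need
kubectl delete deployment <deployment-name> -n <namespace>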

Troubleshoot Faster with Blink

You might have solved your problem quickly or ended up down a research rabbit hole. With Blink, you can manage your Kubernetes troubleshooting in one place with the common error causes listed and your next steps just a click away.

Blink Automation: Troubleshoot a Kubernetes Pod

The automation above is in the Blink library. When the automation runs, it gets the key details you need to troubleshoot a given Pod in a namespace.

It does the following steps:

  1. Gets the Pod status.
  2. Gets the Pod details.
  3. Gets the container logs.
  4. Gets events related to the Pod.
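For reference, gathering the same information manually looks roughly like this, with placeholder Pod and namespace names:

# 1. Pod status
kubectl get pod <pod-name> -n <namespace>

# 2. Pod details
kubectl describe pod <pod-name> -n <namespace>

# 3. Container logs
kubectl logs <pod-name> -n <namespace> --all-containers

# 4. Events related to the Pod
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name>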

With one automation, you skip these kubectl commands and get the information you need to correct the error.

Get started with Blink and troubleshoot Kubernetes errors faster today.
