Some pods get stuck in ContainerCreating status after being replaced by another pod #112009

Closed as not planned
@margach


What happened?

I upgraded a deployment with Helm, and Kubernetes started replacing the pods. All the pods were replaced successfully, but some of the pods that had been replaced got stuck in ContainerCreating status:

NAME               READY   STATUS
demo-labxpert-1    0/1     ContainerCreating
demo-labxpert-2    1/1     Running

demo-labxpert-2 has replaced demo-labxpert-1 in the upgrade, but demo-labxpert-1 has not been removed by the system.

If you are using Helm or a CI pipeline and have set the --wait flag, Kubernetes never reports success; the command waits until the timeout expires.

When I run kubectl describe pods/demo-labxpert-1, this is the output:

  Normal   Scheduled           4m17s  default-scheduler        Successfully assigned staging/demo-labxpert-5d8969f84c-88hdt to pool-q6gk1dn2v-7mhnz
  Warning  FailedAttachVolume  4m16s  attachdetach-controller  Multi-Attach error for volume "pvc-bbacb80b-d2d0-4d36-a472-0faa799a150c" Volume is already used by pod(s) demo-labxpert-cccc78bc9-nwlmj
  Warning  FailedMount         2m14s  kubelet                  Unable to attach or mount volumes: unmounted volumes=[demo-labxpert], unattached volumes=[demo-labxpert kube-api-access-bwpc2]: timed out waiting for the condition

What did you expect to happen?

The pod being replaced is removed from the system and it does not get stuck in ContainerCreating status.

How can we reproduce it (as minimally and precisely as possible)?

  1. Deploy a Helm chart.
  2. Run helm upgrade on the deployment with a changed image tag.
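
The two steps above could be scripted roughly as follows. This is only a sketch that builds the Helm command lines without running them; the release name, chart path, tags, and the `image.tag` value key are hypothetical and depend on the chart in use, and the `--wait` and `--timeout` flags are included because the report mentions the upgrade waiting for the timeout.

```python
import shlex


def repro_commands(release, chart, old_tag, new_tag, timeout="5m"):
    """Build the install and upgrade command lines as argument lists."""
    # Assumption: the chart exposes the image tag under the `image.tag` value.
    install = ["helm", "install", release, chart, "--set", f"image.tag={old_tag}"]
    upgrade = [
        "helm", "upgrade", release, chart,
        "--set", f"image.tag={new_tag}",
        "--wait", "--timeout", timeout,
    ]
    return [install, upgrade]


if __name__ == "__main__":
    # Hypothetical release name, chart path, and tags, for illustration only.
    for argv in repro_commands("demo-labxpert", "./chart", "1.0.0", "1.0.1"):
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run` once the placeholder values are replaced with real ones.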

Anything else we need to know?

I am using DigitalOcean Managed Kubernetes.

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.5", GitCommit:"5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e", GitTreeState:"clean", BuildDate:"2021-12-16T08:38:33Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.12", GitCommit:"b058e1760c79f46a834ba59bd7a3486ecf28237d", GitTreeState:"clean", BuildDate:"2022-07-13T14:53:39Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

DigitalOcean

Metadata

Assignees

No one assigned

    Labels

    kind/bug: Categorizes issue or PR as related to a bug.
    lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
    needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
    sig/node: Categorizes an issue or PR as relevant to SIG Node.
    triage/needs-information: Indicates an issue needs more information in order to work on it.
