Description
What happened?
A Job is created with completions: 1 and parallelism: 1. However, two pods appear a few minutes apart, both with identical ownerReferences (name, uid, etc. all pointing to the same Job).
I don't understand what I see in the kube-controller-manager logs. When the first pod is scheduled, I see:
I0305 18:28:46.167341 1 job_controller.go:566] "enqueueing job" logger="job-controller" key="the-namesapce/the-job-name-1741199325"
That same "enqueueing job" log line repeats 9 times, most within the same second, a few a bit later:
I0305 18:28:46.167341 ...
I0305 18:28:46.183597 ...
I0305 18:28:46.192648 ...
I0305 18:28:46.195377 ...
I0305 18:28:46.233094 ...
I0305 18:28:48.915103 ...
I0305 18:29:24.315840 ...
I0305 18:29:25.328100 ...
I0305 18:29:26.339424 ...
A few minutes later, while the first pod is already running, a second one appears. At that exact time the logs show similar messages:
I0305 18:31:46.236414 1 job_controller.go:566] "enqueueing job" logger="job-controller" key="the-namesapce/the-job-name-1741199325"
I0305 18:31:47.613379 1 job_controller.go:566] "enqueueing job" logger="job-controller" key="the-namesapce/the-job-name-1741199325"
E0305 18:31:50.044308 1 job_controller.go:599] syncing job: tracking status: adding uncounted pods to status: Operation cannot be fulfilled on jobs.batch "the-job-name-1741199325": the object has been modified; please apply your changes to the latest version and try again
...
I0305 18:31:51.068789 1 job_controller.go:566] "enqueueing job" logger="job-controller" .... (repeats again 7 times rapidly)
The only clue I have is this "the object has been modified" error, but it makes no sense to me. The Job object was created with a single "kubectl create -f job.yaml", nothing fancy in it. What could be going on?
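For reference, a minimal sketch of the kind of Job manifest involved (all names are placeholders matching the redacted log lines; the image and command below are stand-ins, not the real workload):

```yaml
# Hypothetical minimal Job manifest; names/image/command are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: the-job-name-1741199325   # placeholder matching the redacted logs
  namespace: the-namesapce        # placeholder matching the redacted logs
spec:
  completions: 1
  parallelism: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: busybox            # stand-in image
        command: ["sh", "-c", "echo done"]
```

Nothing beyond this basic shape: no backoffLimit tweaks, no activeDeadlineSeconds, no custom controllers touching the Job.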
What did you expect to happen?
Only one Pod should be scheduled.
How can we reproduce it (as minimally and precisely as possible)?
Unfortunately, this seems to happen randomly (once in hundreds of Jobs). I need help understanding what causes it; once I do, I can try to reproduce it.
Anything else we need to know?
Something similar seems to have been reported for a much older version in #120790, but it's difficult to tell whether it's the same issue.
Kubernetes version
$ kubectl version
Client Version: v1.30.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.4
Cloud provider
OS version
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
$ uname -a
Linux [redacted host name] 6.1.0-28-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.119-1 (2024-11-22) x86_64 GNU/Linux
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)