Add DisruptionTarget condition when preempting for critical pod #117586
Conversation
/test pull-kubernetes-verify
/sig node
/kind bug
Do you think this could go in as a bug fix, bypassing the KEP update phase, and be backported to 1.26 & 1.27? On one hand it looks like a bug (an omission of the scenario). On the other hand, because we didn't include the scenario in the KEP, we may need to update the KEP first and only fix it in 1.28.
Force-pushed from 21af673 to faf6896
Added to the KEP update for 1.28 (third beta): kubernetes/enhancements#3965
/triage accepted |
FYI: the pull-kubernetes-node-kubelet-serial-pod-disruption-conditions job fails on the PID-based eviction test for a reason not directly related to disruption conditions; the eviction tests are currently failing in general: https://testgrid.k8s.io/sig-node-containerd#node-kubelet-containerd-eviction
oh eviction tests, my old "friends".
Looks like the new test passes at least. One small nit on the wording of the condition, otherwise LGTM
Force-pushed from 631fc3d to e1e3814
/approve
^ for the test
actual change is also a lgtm
/test pull-kubernetes-e2e-gce
@mimowo: The following test failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/test pull-kubernetes-e2e-gce
I wondered whether we need to wait for the KEP update to be merged, as this PR is an implementation of it.
I would like to handle it as a bug and merge without waiting for the KEP, even if the scenario wasn't listed explicitly in previous versions of the KEP. In any case, I will prepare the KEP update as well to make sure it aligns with the implementation (or, if we decide to follow the regular KEP process, with the implementation steps).
lgtm. I agree that this is a much smaller change and can be captured as a bug rather than a feature (and potentially cherry-picked if needed), since it sounds like preemption by the kubelet to admit a critical pod was a missed case previously. Would be great to get other folks' thoughts on this. /lgtm
LGTM label has been added. Git tree hash: 2536e1973b244c65398a45839f1793ec5997be49
/assign @SergeyKanzhelev @dchen1107 |
/lgtm
I agreed to treat this as a bug of the pod failure policy. Please send us the backport PRs.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dchen1107, endocrimes, mimowo The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
…86-upstream-release-1.26 Automated cherry pick of #117586: Add DisruptionTarget condition when preempting for critical
…86-upstream-release-1.27 Automated cherry pick of #117586: Add DisruptionTarget condition when preempting for critical
What type of PR is this?
/kind feature
/kind bug
What this PR does / why we need it:
To annotate pod disruptions caused by preemption initiated by the kubelet to make room for a critical pod.
The analogous preemption scenario is already covered with the DisruptionTarget condition by kube-scheduler.
Which issue(s) this PR fixes:
Special notes for your reviewer:
The scenario of preemption by Kubelet to make room for a critical pod was overlooked during earlier phases of the development of pod failure policy, so it can be considered a bug.
The test appears stable: it was repeated over 100 iterations with no failures.
Fixing this issue is covered in the KEP update: kubernetes/enhancements#3965
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: