Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proceed with deletion of rejected pods after Kubelet restart #118497

Merged
merged 2 commits into from
Jun 23, 2023

Conversation

mimowo
Copy link
Contributor

@mimowo mimowo commented Jun 6, 2023

What type of PR is this?

/kind bug
/kind regression

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #118472

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fix 1.27 regression with deletion of non-admissible pods that are deleted during Kubelet restart

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/regression Categorizes issue or PR as related to a regression from a prior release. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 6, 2023
@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 6, 2023
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 6, 2023
@mimowo
Copy link
Contributor Author

mimowo commented Jun 6, 2023

/test pull-kubernetes-e2e-kind-ipv6

@bart0sh bart0sh added this to Triage in SIG Node PR Triage Jun 6, 2023
@bart0sh
Copy link
Contributor

bart0sh commented Jun 6, 2023

/triage accepted
/priority important-soon
/assign @bobbypage

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 6, 2023
@bart0sh bart0sh moved this from Triage to Needs Reviewer in SIG Node PR Triage Jun 6, 2023
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. area/test and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 6, 2023
@mimowo
Copy link
Contributor Author

mimowo commented Jun 22, 2023

/test pull-kubernetes-e2e-gce-serial
/test pull-kubernetes-cos-cgroupv2-containerd-node-e2e-serial
/test pull-kubernetes-node-kubelet-serial-containerd

@mimowo
Copy link
Contributor Author

mimowo commented Jun 22, 2023

@bobbypage @smarterclayton ready for another pass. I've tested the latest version with 50 repeats of the new tests locally and there were no failures.

@mimowo
Copy link
Contributor Author

mimowo commented Jun 22, 2023

/test pull-kubernetes-e2e-kind-ipv6

@mimowo
Copy link
Contributor Author

mimowo commented Jun 22, 2023

/test pull-kubernetes-node-kubelet-serial-containerd

1 similar comment
@mimowo
Copy link
Contributor Author

mimowo commented Jun 22, 2023

/test pull-kubernetes-node-kubelet-serial-containerd

@k8s-ci-robot
Copy link
Contributor

@mimowo: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-node-kubelet-serial-containerd 17013d3 link false /test pull-kubernetes-node-kubelet-serial-containerd

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@smarterclayton
Copy link
Contributor

Looks good to me. I don't have any substantitive comments on the tests, other than "you saw them fail if your fix is commented out"?

@mimowo
Copy link
Contributor Author

mimowo commented Jun 22, 2023

Looks good to me. I don't have any substantitive comments on the tests, other than "you saw them fail if your fix is commented out"?

The first test, for the main scenario, fails consistently without the fix.

The second, for the non-admissible runtime pod, passes as it was already covered by HandlePodCleanups, but it seems valuable to test this non-obvious modification of the main scenario.

@bobbypage
Copy link
Member

bobbypage commented Jun 22, 2023

Also lgtm on the changes, the change to use pod worker update with SyncPodKill is much cleaner then having to call into TerminatePod from within HandlePodCleanups.

The only thing I saw when testing out the changes here is there is a ~10s delay for the pod from the pod being rejected at kubelet to the pod being deleted on the apiserver. This is because the following flow happens:

(1) The pod is rejected by kubelet at admission and the local status manager is updated with terminal phase
(2) We wait 10s until the next syncBatch completes in status_manager
(3) status manager sees that the local phase status is different then api status, it updates phase to failed
(4) HandlePodCleanups will run, pod is filtered as part of filterTerminalPodsToDelete and the SyncPodKill will be delivered to it.

IMO, the 10s latency is not big deal... it will reconcile eventually, but if we wanted it to update faster and reduce latency, I think if we also called kl.podWorkers.UpdatePod(UpdatePodOptions with UpdateType: kubetypes.SyncPodKill in rejectPod it would update the status ~instantly, without waiting for the 10sec sync batch (i.e. as https://github.com/kubernetes/kubernetes/pull/118614/files).

So maybe long term, we should do both, i.e. call UpdatePod in HandlePodCleanups and rejectPod for these types of pods that are rejected at admission and don't exist in pod worker. But I don't think should be blocking the PR (and can be handled as a followup) as the case here is not super common to hit in the first place, and the latency is not critical IMO.

@bobbypage
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 22, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 9cb115ee3ae7a5f372aa9f07838b0b6508db666c

@mimowo
Copy link
Contributor Author

mimowo commented Jun 23, 2023

Thanks @bobbypage for the summary. The scenario is very unlikely, so I don't think 10s makes a difference, but it is a judgment call.

I have looked if there are other scenarios possible to arrive with pods which satisfy the conditions: not runtime pods, terminal, with deletion timestamp and not in pod workers, but the scenario reported is the only I could come up with. This is probably because rejectPod is the only path in HandlePodAdditions which doesn't register the pod in pod workers.

So, this is why the fix https://github.com/kubernetes/kubernetes/pull/118614/files (to call SyncPodKIll from rejectPod( should also work. As for which one is better, the margins are small. The other fix reduces latency, but is specific to rejected pods, so potentially there might be a scenario we couldn't think about, which somehow bypasses HandlePodAdditions, but I couldn't find despite attempts.

IIUC if we combine the two fixes, then the code in HandlePodCleanups is dead, so I would prefer to decide on one of the two approaches (unless we have a scenario to support the need for both). If we are fine with latency then the generic one is preferred for me, otherwise the fix on rejectPod.

Let me know what you think, I'm out next week and will be back on the 5th of July.

@bobbypage
Copy link
Member

bobbypage commented Jun 23, 2023

Thanks @mimowo for the explanation, +1, I agree this scenario is unlikely to be hit and the 10s latency is not critical, so I'm lgtm to go with this approach.

@smarterclayton
Copy link
Contributor

Since we plan on pulling admission closer to pod worker anyway, rejection probably becomes a pod worker responsibility and that should fix the issue (but we still need resync for restarts anyway).

/lgtm too
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mimowo, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 23, 2023
@k8s-ci-robot k8s-ci-robot merged commit e81c24d into kubernetes:master Jun 23, 2023
15 of 16 checks passed
SIG Node CI/Test Board automation moved this from Archive-it to Done Jun 23, 2023
SIG Node PR Triage automation moved this from Needs Reviewer to Done Jun 23, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone Jun 23, 2023
k8s-ci-robot added a commit that referenced this pull request Jun 29, 2023
…18497-upstream-release-1.27

Automated cherry pick of #118497: Fix the deletion of rejected pods
@mimowo mimowo deleted the fix-kubelet-regression branch November 29, 2023 15:01
@ccieliu
Copy link

ccieliu commented Jan 31, 2024

Hi Team,

Same issue here, may i know the issue fixed in which release?

@mimowo
Copy link
Contributor Author

mimowo commented Feb 1, 2024

Same issue here, may i know the issue fixed in which release?

This was fixed in 1.28 (this PR), and cherry-picked on 1.27.4 in this PR #118841.

See in release notes: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#changelog-since-v1273

"Fix deletion of non-admissible pods that are deleted during Kubelet restart (#118841, @bobbypage) [SIG Node and Testing]"

@ccieliu
Copy link

ccieliu commented Feb 1, 2024

Same issue here, may i know the issue fixed in which release?

This was fixed in 1.28 (this PR), and cherry-picked on 1.27.4 in this PR #118841.

See in release notes: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md#changelog-since-v1273

"Fix deletion of non-admissible pods that are deleted during Kubelet restart (#118841, @bobbypage) [SIG Node and Testing]"

Got it, thanks for your time and have a good day.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/regression Categorizes issue or PR as related to a regression from a prior release. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Archived in project
Development

Successfully merging this pull request may close these issues.

Pod graceful deletion hangs if the kubelet already rejected that pod at admission time
8 participants