Skip terminal Pods with a deletion timestamp from the Daemonset sync #118716

Merged 3 commits into kubernetes:master on Jun 27, 2023

alculquicondor (Member) commented Jun 16, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

/sig apps

If surge is disabled, the DaemonSet controller currently waits for Pods to fully disappear from the apiserver before creating replacements. However, since k8s 1.22, Pods in a terminal phase (Failed, Succeeded) are guaranteed not to hold any resources on a Node and are safe to replace.

By skipping terminal Pods that have a deletionTimestamp, the DS controller can create replacement Pods in scenarios such as those named in the release note below: VM preemptions and Pods held by finalizers.
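
The skip described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the controller's actual code: `Pod`, `PodPhase`, `newPod`, `groupPodsByNode`, and `nodesNeedingPod` are hypothetical stand-ins for the real Kubernetes types and helpers, and the `includeDeletedTerminal` parameter is an assumption of this sketch for callers that still need to see every Pod.

```go
package main

import "time"

// PodPhase and Pod are simplified stand-ins (assumptions of this sketch),
// not the real k8s.io/api/core/v1 types.
type PodPhase string

const (
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

type Pod struct {
	Name              string
	NodeName          string
	Phase             PodPhase
	DeletionTimestamp *time.Time // nil until deletion has been requested
}

// newPod is a hypothetical constructor that keeps the examples short.
func newPod(name, node string, phase PodPhase, deleting bool) Pod {
	p := Pod{Name: name, NodeName: node, Phase: phase}
	if deleting {
		now := time.Now()
		p.DeletionTimestamp = &now
	}
	return p
}

// isTerminal reports whether the Pod has reached a terminal phase.
func isTerminal(p Pod) bool {
	return p.Phase == PodFailed || p.Phase == PodSucceeded
}

// groupPodsByNode maps node names to the Pods the controller should consider.
// When includeDeletedTerminal is false, terminal Pods that already carry a
// deletion timestamp are skipped, so their node appears to have no daemon Pod.
func groupPodsByNode(pods []Pod, includeDeletedTerminal bool) map[string][]Pod {
	byNode := make(map[string][]Pod)
	for _, p := range pods {
		if !includeDeletedTerminal && isTerminal(p) && p.DeletionTimestamp != nil {
			continue
		}
		byNode[p.NodeName] = append(byNode[p.NodeName], p)
	}
	return byNode
}

// nodesNeedingPod lists, in input order, the nodes with no Pod left to consider.
func nodesNeedingPod(nodes []string, byNode map[string][]Pod) []string {
	var missing []string
	for _, n := range nodes {
		if len(byNode[n]) == 0 {
			missing = append(missing, n)
		}
	}
	return missing
}
```

In this sketch, a Failed Pod with a deletion timestamp no longer occupies its node, so `nodesNeedingPod` reports that node as eligible for a replacement. A Running Pod that is being deleted is still counted, which matches the non-surge behavior of waiting for it to go away.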

Which issue(s) this PR fixes:

Fixes #118587

Special notes for your reviewer:

Does this PR introduce a user-facing change?

The Daemonset controller creates replacements for terminal Pods, which can appear during VM preemptions or when using Pod finalizers

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added labels on Jun 16, 2023: do-not-merge/work-in-progress, release-note, kind/bug, size/XS, sig/apps, cncf-cla: yes, needs-triage, needs-priority
@k8s-ci-robot added labels size/L, needs-rebase, area/test, sig/testing and removed size/XS on Jun 19, 2023
Change-Id: I64a347a87c02ee2bd48be10e6fff380c8c81f742
@k8s-ci-robot removed the needs-rebase label on Jun 19, 2023
@alculquicondor changed the title from "WIP Skip terminal Pods with a deletion timestamp from the Daemonset sync" to "Skip terminal Pods with a deletion timestamp from the Daemonset sync" on Jun 19, 2023
@k8s-ci-robot removed the do-not-merge/work-in-progress label on Jun 19, 2023
alculquicondor (Member, Author) commented

/assign @bobbypage
/sig node

@k8s-ci-robot added the sig/node label on Jun 19, 2023
test/integration/util/util.go (review thread resolved)
@@ -2757,18 +2757,25 @@ func TestGetNodesToDaemonPods(t *testing.T) {
 	if err != nil {
 		t.Fatal(err)
 	}
-	addNodes(manager.nodeStore, 0, 2, nil)
+	addNodes(manager.nodeStore, 0, 4, nil)

Review comment (Contributor):

This does not seem needed since all pods in the test reference node-0 or node-1 only.

Reply (Member, Author):

Leftover from a previous approach. Reverted.

pkg/controller/daemon/daemon_controller_test.go (review thread outdated, resolved)
@bart0sh added this to Triage in SIG Node PR Triage on Jun 20, 2023

soltysh commented Jun 22, 2023

/assign @atiratree


mimowo commented Jun 23, 2023

/lgtm

Change-Id: I8b921157e6be1c809dd59f8035ec259ea4d96301
atiratree (Member) commented

/lgtm

@k8s-ci-robot added the lgtm label on Jun 26, 2023
k8s-ci-robot (Contributor) commented

LGTM label has been added.

Git tree hash: 9ffda0efded5aa9801bcc7a05c8f9a333d033eec


soltysh commented Jun 27, 2023

/label tide/merge-method-squash
/priority backlog
/triage accepted

@k8s-ci-robot added labels priority/backlog, triage/accepted, tide/merge-method-squash and removed needs-priority, needs-triage on Jun 27, 2023

@soltysh soltysh left a comment

/approve

k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Jun 27, 2023
@k8s-ci-robot merged commit a451966 into kubernetes:master on Jun 27, 2023
12 of 13 checks passed
SIG Node PR Triage automation moved this from Needs Approver to Done Jun 27, 2023
@k8s-ci-robot added this to the v1.28 milestone on Jun 27, 2023
alculquicondor (Member, Author) commented

Cherry-picks:

1.27: #118911
1.26: #118912

I don't plan to cherry-pick below this version, given that it would depend on #118912, which we decided to only cherry-pick to 1.26

k8s-ci-robot pushed a commit that referenced this pull request Jul 7, 2023
* Skip terminal Pods with a deletion timestamp from the Daemonset sync

Change-Id: I64a347a87c02ee2bd48be10e6fff380c8c81f742

* Review comments and fix integration test

Change-Id: I3eb5ec62bce8b4b150726a1e9b2b517c4e993713

* Include deleted terminal pods in history

Change-Id: I8b921157e6be1c809dd59f8035ec259ea4d96301

* Exclude terminal pods from Daemonset e2e tests

Change-Id: Ic29ca1739ebdc54822d1751fcd56a99c628021c4
k8s-ci-robot pushed a commit that referenced this pull request Jul 7, 2023
* Skip terminal Pods with a deletion timestamp from the Daemonset sync

Change-Id: I64a347a87c02ee2bd48be10e6fff380c8c81f742

* Review comments and fix integration test

Change-Id: I3eb5ec62bce8b4b150726a1e9b2b517c4e993713

* Include deleted terminal pods in history

Change-Id: I8b921157e6be1c809dd59f8035ec259ea4d96301

* Exclude terminal pods from Daemonset e2e tests

Change-Id: Ic29ca1739ebdc54822d1751fcd56a99c628021c4
rayowang pushed a commit to rayowang/kubernetes that referenced this pull request Feb 9, 2024
…ubernetes#118716)

* Skip terminal Pods with a deletion timestamp from the Daemonset sync

Change-Id: I64a347a87c02ee2bd48be10e6fff380c8c81f742

* Review comments and fix integration test

Change-Id: I3eb5ec62bce8b4b150726a1e9b2b517c4e993713

* Include deleted terminal pods in history

Change-Id: I8b921157e6be1c809dd59f8035ec259ea4d96301
Labels
approved, area/test, cncf-cla: yes, kind/bug, lgtm, priority/backlog, release-note, sig/apps, sig/node, sig/testing, size/L, tide/merge-method-squash, triage/accepted
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

DaemonSet does not recreate failed pods when they have deletionTimestamp
6 participants