Skip to content

Fix unexpected change of container ready state when kubelet restart #123982

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

LastNight1997
Copy link
Contributor

@LastNight1997 LastNight1997 commented Mar 19, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

pod config readinessprobe:

  1. if ready state is true and kubelet restart, pod temporarily report containerNotReady,this may lead to service not available
  2. if ready state is false and kubelet restart, pod temporarily report containerReady,this may also lead to service not available

thus, this PR keep the container ready state when kubelet restart.

Which issue(s) this PR fixes:

Fixes #100277

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 19, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @LastNight1997. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 19, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LastNight1997
Once this PR has been reviewed and has the lgtm label, please assign dchen1107 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 19, 2024
if container.ReadinessProbe != nil {
containersWithReadinessProbe.Insert(container.Name)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is duplicated for the initContainers below. Would it make sense to wrap it into function?

case w.manualTriggerCh <- struct{}{}:
default: // Non-blocking.
klog.InfoS("Failed to trigger a manual run", "probe", w.probeType.String())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and call that function only where its result is used, e.g. here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can wrap it into function, but call it here is not a good idea, because it's in the for loop here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes sense to me. It should be called before the loop.

case w.manualTriggerCh <- struct{}{}:
default: // Non-blocking.
klog.InfoS("Failed to trigger a manual run", "probe", w.probeType.String())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

@HirazawaUi
Copy link
Contributor

/sig network
because sig-network wants to move it forward.

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Apr 3, 2024
@bart0sh
Copy link
Contributor

bart0sh commented Apr 3, 2024

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Apr 3, 2024
@bart0sh
Copy link
Contributor

bart0sh commented Apr 3, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 3, 2024
@k8s-ci-robot
Copy link
Contributor

@LastNight1997: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-e2e-kind-ipv6 10a35cb link true /test pull-kubernetes-e2e-kind-ipv6

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@HirazawaUi
Copy link
Contributor

  1. Write a small KEP detailing the change and the tests needed to prove it (I think the bug has a lot of text you can steal)

A. I think we can use a "deprecated" gate instead. So, concretely - a gate named "PodUnreadyOnKubeletRestart" which is {Default: false, PreRelease: featuregate.Deprecated}. The code in question would look like:

@thockin If we just use a "deprecated" gate, maybe we don't need a KEP, could we use the same precedent we used: #119789

@aojea
Copy link
Member

aojea commented Apr 4, 2024

@thockin If we just use a "deprecated" gate, maybe we don't need a KEP, could we use the same precedent we used: #119789

It will be good to have the KEP to describe the scenarios and behaviors and tests, so we keep this documented, the referenced issue is more scoped and this have a larger blast radius , last time we changed something in Pod readiness we broke Service and it took more than 3 releases to notice it

@thockin
Copy link
Member

thockin commented Apr 4, 2024

#119789 was a really inconsequential change - this is really not :)

@LastNight1997
Copy link
Contributor Author

LastNight1997 commented Apr 9, 2024

You can assign the KEP to me (though it needs a SIG Node reviewer/approver, too

@thockin Thank you. I have no experience writing KEP. I really appreciate it if you could do it.

@thockin
Copy link
Member

thockin commented Apr 17, 2024

@LastNight1997 LOL, what I meant was: you write the KEP and I'll review/shepherd it.

Writing a KEP basically means:

  1. open an issue here: https://github.com/kubernetes/enhancements/issues
  2. clone this dir: https://github.com/kubernetes/enhancements/tree/master/keps/NNNN-kep-template into a new directory named after the issue number from above
  3. fill in the files in this new directory
  4. open a PR and assign it to me

You should be able to breeze thru most of it since you already know the code change. The most effort will go into thinking about test and how this could fail. Try to imagine the most bizarre thing a person could do to break this change - how do we defend?

@chenk008
Copy link
Contributor

chenk008 commented Jul 11, 2024

Will this PR continue to move forward?

I have a similar change... https://github.com/chenk008/kubernetes/pull/3/files#diff-e81aa7518bebe9f4412cb375a9008b3481b19ec3e851d3187b3021ee94148f0d

@thockin
Copy link
Member

thockin commented Jul 14, 2024

@chenk008 This still needs a KEP. #123982 (comment)

@olyazavr
Copy link
Contributor

olyazavr commented Sep 6, 2024

+1 to this PR, this bug has been plaguing us and has caused noticeable pain. I was in the process of writing a similar patch for our setup when I came across this PR

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 3, 2025
@HirazawaUi
Copy link
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 4, 2025
Copy link
Member

@tallclair tallclair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I realized I had reviewed this a while ago but never submitted the comments.

That said, looking through this again, I'm not sure this is the right approach. There's a lot of complexity here around knowing when to carry forward the old status.

What if instead we prepopulate the status cache & readiness results cache when a pod is added back? In other words, in HandlePodAdditions (kubelet.go), after the pod is admitted: if the pod status says it's running then add its status to the status manager, and add any ready containers to the readiness manager.

We need to carefully think through the implications of pre-populating the status cache though.
/cc @smarterclayton @bobbypage

ready = result == results.Success
if !ready {
m.tryTriggerWorker(pod.UID, c.Name, readiness)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't manually retrigger in the case of a failure. There was already a result, the probe failed - it should retry on the regular probing period.

Comment on lines 312 to +314
} else {
// The check whether there is a probe which hasn't run yet.
w, exists := m.getWorker(pod.UID, c.Name, readiness)
ready = !exists // no readinessProbe -> always ready
if exists {
// Trigger an immediate run of the readinessProbe to update ready state
select {
case w.manualTriggerCh <- struct{}{}:
default: // Non-blocking.
klog.InfoS("Failed to trigger a manual run", "probe", w.probeType.String())
}
if containersWithReadinessProbe.Has(c.Name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: else if

		} else if containersWithReadinessProbe.Has(c.Name) {
			// The check whether there is a probe which hasn't run yet.

klog.InfoS("Failed to trigger a manual run", "probe", w.probeType.String())
}
if containersWithReadinessProbe.Has(c.Name) {
m.tryTriggerWorker(pod.UID, c.Name, readiness)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should just remove the manual triggering here too. What purpose does it serve?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it was added in #98376, but I don't have the full context there. This looks like it violates the probe's InitialDelaySeconds to me.

/cc @SergeyKanzhelev

@@ -230,7 +230,9 @@ func (w *worker) doProbe(ctx context.Context) (keepGoing bool) {
w.resultsManager.Remove(w.containerID)
}
w.containerID = kubecontainer.ParseContainerID(c.ContainerID)
w.resultsManager.Set(w.containerID, w.initialValue, w.pod)
if w.probeType != readiness {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is going to reset the readiness to false if the container restarts?

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 20, 2025
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 24, 2025
@thockin thockin added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jun 25, 2025
@k8s-triage-robot
Copy link

The lifecycle/frozen label can not be applied to PRs.

This bot removes lifecycle/frozen from PRs because:

  • Commenting /lifecycle frozen on a PR has not worked since March 2021
  • PRs that remain open for >150 days are unlikely to be easily rebased

You can:

  • Rebase this PR and attempt to get it merged
  • Close this PR with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/remove-lifecycle frozen

@k8s-ci-robot k8s-ci-robot removed the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jun 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Development

Successfully merging this pull request may close these issues.

pod config readinessprobe, if kubelet restart, pod temporarily report containerNotReady,this may lead to service not available