scheduler: stop clearing NominatedNodeName on all cases #132439

Open
wants to merge 1 commit into
base: master
from

Conversation

utam0k
Member

@utam0k utam0k commented Jun 21, 2025

What type of PR is this?

/kind feature
/sig scheduling
/cc sanposhiho

What this PR does / why we need it:

ref: #132384

Which issue(s) this PR is related to:

Fixes #132384

Special notes for your reviewer:

Does this PR introduce a user-facing change?

The scheduler no longer clears the `nominatedNodeName` field for Pods. External components (such as Cluster Autoscaler and Karpenter) are responsible for managing this field when needed.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot requested a review from sanposhiho June 21, 2025 04:14
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 21, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Jun 21, 2025
@utam0k
Member Author

utam0k commented Jun 21, 2025

/test pull-kubernetes-unit

@utam0k utam0k force-pushed the not-to-clear-nnn branch from a06ef74 to f53ac40 Compare June 21, 2025 08:22
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 21, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: utam0k
Once this PR has been reviewed and has the lgtm label, please assign sanposhiho for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Jun 21, 2025
@utam0k
Member Author

utam0k commented Jun 21, 2025

/test pull-kubernetes-e2e-kind

@utam0k utam0k force-pushed the not-to-clear-nnn branch from f53ac40 to ad9c039 Compare June 22, 2025 00:50
@lmktfy

lmktfy commented Jun 22, 2025

Is this relevant to #132443?

@lmktfy

lmktfy commented Jun 22, 2025

Changelog suggestion

-The scheduler no longer clears the NominatedNodeName field for pods. External components (like Cluster Autoscaler and Karpenter) are responsible for managing this field when needed.
+The scheduler no longer clears the `nominatedNodeName` field for Pods. External components (such as Cluster Autoscaler and Karpenter) are responsible for managing this field when needed.

However, see #132443 (comment)

We should align the two changelog entries.

@utam0k
Member Author

utam0k commented Jun 23, 2025

> Is this relevant to #132443?

Yes, it is. I've updated the release note.

@sanposhiho
Member

/cc @macsko @dom4ha

This is part of the NNN (NominatedNodeName) KEP.

@k8s-ci-robot k8s-ci-robot requested review from dom4ha and macsko June 23, 2025 22:11
@@ -365,7 +363,7 @@ func (sched *Scheduler) handleBindingCycleError(
 }
 }

-sched.FailureHandler(ctx, fwk, podInfo, status, clearNominatedNode, start)
+sched.FailureHandler(ctx, fwk, podInfo, status, nil, start)
Member

Should we put these changes behind a feature flag, or are we okay with leaving them unguarded?

@sanposhiho What is the strategy for using the feature gate in this KEP?

Member

Yes, we should guard the new logic, and we forgot to do that (my own PRs included) 😅 (ref).
We have already discussed it over DM and will update the PRs.
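
For readers following the thread, here is a minimal sketch of what guarding this change could look like. It is not the PR's code: `nominatingInfo`, `nominatingInfoOnFailure`, `applyNomination`, and the `gateEnabled` boolean are hypothetical stand-ins for the scheduler's real types and its feature-gate lookup. Only the `clearNominatedNode` name and the switch to `nil` come from the diff above.

```go
package main

import "fmt"

// nominatingInfo is a hypothetical stand-in for the value passed to the
// scheduler's failure handler. It is not the real framework type.
type nominatingInfo struct {
	nominatedNodeName string
	override          bool // true: write nominatedNodeName to the Pod status
}

// clearNominatedNode mirrors the name in the diff: an override with an
// empty node name, which wipes the field.
var clearNominatedNode = &nominatingInfo{nominatedNodeName: "", override: true}

// nominatingInfoOnFailure picks what to hand to the failure handler.
// gateEnabled is an assumed boolean standing in for a feature-gate check.
func nominatingInfoOnFailure(gateEnabled bool) *nominatingInfo {
	if gateEnabled {
		return nil // new behavior: leave the field untouched
	}
	return clearNominatedNode // old behavior: clear the field
}

// applyNomination returns the Pod's nominatedNodeName after the failure
// handler has processed info. A nil info means "no change".
func applyNomination(current string, info *nominatingInfo) string {
	if info == nil || !info.override {
		return current
	}
	return info.nominatedNodeName
}

func main() {
	for _, enabled := range []bool{false, true} {
		got := applyNomination("node-a", nominatingInfoOnFailure(enabled))
		fmt.Printf("gate=%v nominatedNodeName=%q\n", enabled, got)
	}
}
```

With the gate off, the sketch keeps the old clearing behavior. With it on, the nomination survives a binding-cycle failure, which is the behavior the release note describes.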

Labels
area/test
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
kind/feature (Categorizes issue or PR as related to a new feature.)
needs-priority (Indicates a PR lacks a `priority/foo` label and requires one.)
needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one.)
release-note (Denotes a PR that will be considered when it comes time to generate release notes.)
sig/scheduling (Categorizes an issue or PR as relevant to SIG Scheduling.)
sig/testing (Categorizes an issue or PR as relevant to SIG Testing.)
size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.)
Development

Successfully merging this pull request may close these issues.

KEP-5278: Change the scheduler not to clear NNN
5 participants