scheduler_perf: wait for all pods to be scheduled before deletion in churnOp recreate mode #132167


Open
wants to merge 1 commit into base: master

Conversation

Member

@utam0k commented Jun 7, 2025

What type of PR is this?

/kind feature
/sig scheduling

What this PR does / why we need it:

Fixes #125974

Which issue(s) this PR is related to:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

scheduler_perf: churnOp in recreate mode now waits for all pods to be scheduled before starting the deletion phase, ensuring consistent churn behavior and preventing unscheduled pods from getting stuck in the Pending state

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added the release-note, kind/feature, size/S, sig/scheduling, cncf-cla: yes, and needs-triage labels on Jun 7, 2025.
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the needs-priority label on Jun 7, 2025.
@k8s-ci-robot requested review from AxeZhan and denkensk on Jun 7, 2025, 12:26.
@k8s-ci-robot added the area/test and sig/testing labels on Jun 7, 2025.
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: utam0k
Once this PR has been reviewed and has the lgtm label, please assign ahg-g for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@utam0k
Member Author

utam0k commented Jun 7, 2025

/cc @macsko

@k8s-ci-robot requested a review from macsko on Jun 7, 2025, 12:26.
churnFns = append(churnFns, func(name string) string {
	if name != "" {
		if err := dynRes.Delete(e.tCtx, name, metav1.DeleteOptions{}); err != nil && !errors.Is(err, context.Canceled) {
			e.tCtx.Errorf("op %d: unable to delete %v: %v", opIndex, name, err)
		shouldDelete := true
Member

The problem with this approach will appear when a second batch of pods starts to be created. Then we would end up with a conflict when creating a pod in place of a pod that wasn't deleted.

Maybe you could try waiting for all pods to be scheduled just after all of them were created, before deleting any?
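
For illustration, a minimal sketch of what "wait for all pods to be scheduled just after they were created" could look like, assuming a typed clientset and a known churn namespace; the helper name, poll interval, and timeout below are placeholders rather than the PR's actual code:

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
)

// Illustrative helper (assumption, not the PR's code): block until every pod
// in the given namespace has been bound to a node, or the timeout expires.
func waitForScheduledPodsInNamespace(ctx context.Context, cs kubernetes.Interface, namespace string) error {
	return wait.PollUntilContextTimeout(ctx, time.Second, 5*time.Minute, true,
		func(ctx context.Context) (bool, error) {
			pods, err := cs.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{})
			if err != nil {
				return false, err
			}
			for _, pod := range pods.Items {
				// An empty NodeName means the scheduler has not bound this pod yet.
				if pod.Spec.NodeName == "" {
					return false, nil
				}
			}
			return true, nil
		})
}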

@k8s-ci-robot added the size/M label and removed the size/S label on Jun 29, 2025.
@utam0k changed the title from "scheduler_perf: only delete scheduled pods in churnOp recreate mode" to "scheduler_perf: wait for all pods to be scheduled before deletion in churnOp recreate mode" on Jun 29, 2025.
@k8s-ci-robot
Contributor

@utam0k: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-kubernetes-integration · Commit: d3bfd9f · Details: link · Required: true · Rerun command: /test pull-kubernetes-integration

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@@ -1868,7 +1921,29 @@ func (e *WorkloadExecutor) runChurnOp(opIndex int, op *churnOp) error {
		retVals[i] = make([]string, op.Number)
	}

	count := 0
	// Create all resources first
Member

Doing this, we will wait only for the first batch of pods. Then, after we create a second batch, we will proceed with deletion without waiting.

Member Author

@macsko
Thank you for pointing out this issue! You're right that the current implementation only waits for the first batch of pods; subsequent batches proceed without waiting.

I've been thinking about how to properly address this while maintaining the intended behavior of churnOp, and here are a few approaches I'm considering:

Option 1: Keep it simple - remove the waiting logic entirely.

  • Rely only on the scheduled check before deletion; unscheduled pods will remain until they are scheduled.
  • Pros: Simple, consistent behavior throughout the test.
  • Cons: Some pods might accumulate in the Pending state.

Option 2: Periodic synchronization - wait every N cycles.

  // Wait when starting a new cycle of pods
  if hasPods && count % op.Number == 0 && count > op.Number {
      waitForScheduledPodsInNamespace(...)
  }

  • Pros: Ensures fair treatment of all pod generations.
  • Cons: Adds complexity and periodic pauses.

Option 3: Dynamic throttling - adjust the churn rate based on the scheduled-pod ratio, and only proceed with deletion when X% of pods are scheduled.

  • Pros: Self-adjusts to scheduler performance; adaptive to system state.
  • Cons: More complex implementation.

I'm leaning towards Option 1, which removes the wait entirely. This option keeps the implementation simple while still addressing the original issue, #125974, by preventing the deletion of unscheduled pods.

What's your preference? Are there any other approaches that would better align with the goals of scheduler_perf testing?
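
For illustration, a rough sketch of the "scheduled check before deletion" from Option 1, again assuming a typed clientset and reusing the imports from the earlier sketch; deleteIfScheduled and its signature are hypothetical and not part of the PR:

// Hypothetical Option 1 helper: delete the pod only if the scheduler has
// already bound it to a node; otherwise leave it pending for a later cycle.
func deleteIfScheduled(ctx context.Context, cs kubernetes.Interface, namespace, name string) (bool, error) {
	pod, err := cs.CoreV1().Pods(namespace).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	if pod.Spec.NodeName == "" {
		// Not scheduled yet: skip deletion so churn never removes a pod the
		// scheduler has not placed.
		return false, nil
	}
	if err := cs.CoreV1().Pods(namespace).Delete(ctx, name, metav1.DeleteOptions{}); err != nil {
		return false, err
	}
	return true, nil
}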

Member

@macsko macsko Jul 1, 2025

What if we do:

churnFns = append(churnFns, func(name string) string {
	if name != "" {
		// New: Wait for `name` pod to be scheduled here, before deleting
		if isPod(name) {
			waitForPod(name)
		}
		if err := dynRes.Delete(e.tCtx, name, metav1.DeleteOptions{}); err != nil && !errors.Is(err, context.Canceled) {
			e.tCtx.Errorf("op %d: unable to delete %v: %v", opIndex, name, err)
		}
		return ""
	}

	live, err := dynRes.Create(e.tCtx, unstructuredObj, metav1.CreateOptions{})
	if err != nil {
		return ""
	}
	return live.GetName()
})

We could also put this wait-before-deletion mechanism behind a boolean flag, to allow using the old behavior as well.
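
The isPod and waitForPod calls above are pseudocode; one possible shape for waitForPod, assuming a typed clientset and the same imports as the earlier sketch, is a simple poll on the pod's node assignment:

// Hypothetical realization of the waitForPod helper sketched above: poll a
// single pod until the scheduler has bound it to a node, or the timeout hits.
func waitForPod(ctx context.Context, cs kubernetes.Interface, namespace, name string) error {
	return wait.PollUntilContextTimeout(ctx, time.Second, 2*time.Minute, true,
		func(ctx context.Context) (bool, error) {
			pod, err := cs.CoreV1().Pods(namespace).Get(ctx, name, metav1.GetOptions{})
			if err != nil {
				return false, err
			}
			return pod.Spec.NodeName != "", nil
		})
}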

Labels

area/test
cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
kind/feature (Categorizes issue or PR as related to a new feature.)
needs-priority (Indicates a PR lacks a `priority/foo` label and requires one.)
needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one.)
release-note (Denotes a PR that will be considered when it comes time to generate release notes.)
sig/scheduling (Categorizes an issue or PR as relevant to SIG Scheduling.)
sig/testing (Categorizes an issue or PR as relevant to SIG Testing.)
size/M (Denotes a PR that changes 30-99 lines, ignoring generated files.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make churnOp in scheduler_perf more useful for recreating the pods
3 participants